How to know killed by pg_terminate_backend
Hi,
If a backend is killed by pg_terminate_backend(), the backend returns
57P01, which is identical to the code returned when it's killed by postmaster.
Problem is, pgpool-II needs to trigger failover if postmaster goes
down because apparently pgpool-II cannot use the PostgreSQL server
anymore.
On the other hand, pg_terminate_backend() just terminates a backend. So
triggering failover is overkill.
Maybe we could make PostgreSQL a little bit smarter so that it returns
a different code than 57P01 when killed by pg_terminate_backend().
Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Tatsuo Ishii wrote:
> If a backend is killed by pg_terminate_backend(), the backend returns
> 57P01, which is identical to the code returned when it's killed by postmaster.
>
> Problem is, pgpool-II needs to trigger failover if postmaster goes
> down because apparently pgpool-II cannot use the PostgreSQL server
> anymore.
>
> On the other hand, pg_terminate_backend() just terminates a backend. So
> triggering failover is overkill.
>
> Maybe we could make PostgreSQL a little bit smarter so that it returns
> a different code than 57P01 when killed by pg_terminate_backend().

Seems reasonable. Does the victim backend currently know why it has been
killed?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
>> Maybe we could make PostgreSQL a little bit smarter so that it returns
>> a different code than 57P01 when killed by pg_terminate_backend().
>
> Seems reasonable. Does the victim backend currently know why it has been
> killed?

I don't think so.
One idea is postmaster sets a flag in the shared memory area
indicating it received SIGTERM before forwarding the signal to
backends.
The backend checks the flag and if it's not set, it knows that the signal
has been sent by pg_terminate_backend(), not postmaster.
What about new error code:
#define ERRCODE_BACKEND_STOP_REQUEST MAKE_SQLSTATE('5','7', 'P','0','4')
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
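
The flag idea in the message above can be sketched roughly as follows. This is a minimal, self-contained C illustration and not PostgreSQL code: ShutdownState, postmaster_begin_shutdown() and backend_sqlstate_on_sigterm() are hypothetical names, the struct merely stands in for a slot in shared memory, and 57P04 is only the code proposed in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a flag kept in PostgreSQL shared memory. */
typedef struct ShutdownState
{
    volatile bool postmaster_got_sigterm;   /* written by postmaster only */
} ShutdownState;

/* Postmaster side: record the shutdown request before signalling children. */
void
postmaster_begin_shutdown(ShutdownState *state)
{
    state->postmaster_got_sigterm = true;
    /* ...a real postmaster would now forward SIGTERM to each backend... */
}

/*
 * Backend side: choose the SQLSTATE to report when SIGTERM arrives.
 * 57P01 is the existing admin-shutdown code; 57P04 is the value proposed
 * in this thread for "this session was terminated, the server is still up".
 */
const char *
backend_sqlstate_on_sigterm(const ShutdownState *state)
{
    return state->postmaster_got_sigterm ? "57P01" : "57P04";
}
```

With the flag clear, a SIGTERM from any sender other than postmaster would map to 57P04; once postmaster has set the flag, every backend would report 57P01.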
On Thu, May 13, 2010 at 8:20 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> Maybe we could make PostgreSQL a little bit smarter so that it returns
>>> a different code than 57P01 when killed by pg_terminate_backend().
>>
>> Seems reasonable. Does the victim backend currently know why it has been
>> killed?
>
> I don't think so.
> One idea is postmaster sets a flag in the shared memory area
> indicating it received SIGTERM before forwarding the signal to
> backends.
>
> The backend checks the flag and if it's not set, it knows that the signal
> has been sent by pg_terminate_backend(), not postmaster.

Or it could also be sent by some other user process, like the user
running "kill" from the shell.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
>>> Seems reasonable. Does the victim backend currently know why it has been
>>> killed?
>>
>> I don't think so.
>> One idea is postmaster sets a flag in the shared memory area
>> indicating it received SIGTERM before forwarding the signal to
>> backends.
>>
>> The backend checks the flag and if it's not set, it knows that the signal
>> has been sent by pg_terminate_backend(), not postmaster.
>
> Or it could also be sent by some other user process, like the user
> running "kill" from the shell.

No problem (at least for pgpool-II).
If the flag is not set, postgres returns the same code as when the backend
is killed by pg_terminate_backend(). The point is whether the backend was
killed by postmaster or not: if it was, pgpool-II should not expect the
PostgreSQL server to be usable, since postmaster has decided to shut down.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
>>>> Seems reasonable. Does the victim backend currently know why it has been
>>>> killed?
>>>
>>> I don't think so.
>>> One idea is postmaster sets a flag in the shared memory area
>>> indicating it received SIGTERM before forwarding the signal to
>>> backends.
>>>
>>> The backend checks the flag and if it's not set, it knows that the signal
>>> has been sent by pg_terminate_backend(), not postmaster.
>>
>> Or it could also be sent by some other user process, like the user
>> running "kill" from the shell.
>
> No problem (at least for pgpool-II).
> If the flag is not set, postgres returns the same code as when the backend
> is killed by pg_terminate_backend(). The point is whether the backend was
> killed by postmaster or not: if it was, pgpool-II should not expect the
> PostgreSQL server to be usable, since postmaster has decided to shut down.

Here is the patch to implement the feature.
1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to
the target backend.
2) The infrastructure used for message passing is
storage/ipc/procsignal.c. The new message type for ProcSignalReason
is "PROCSIG_TERMNINATE_BACKEND_INTERRUPT".
3) I assign new error code 57P04 which is returned from the backend
killed by pg_terminate_backend().
#define ERRCODE_TERMINATE_BACKEND MAKE_SQLSTATE('5','7', 'P','0','4')
Comments are welcome.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Attachment: pg_terminate_backend.patch (text/x-patch; charset=us-ascii)
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***************
*** 279,284 **** procsignal_sigusr1_handler(SIGNAL_ARGS)
--- 279,287 ----
      if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
          RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);

+     if (CheckProcSignal(PROCSIG_TERMNINATE_BACKEND_INTERRUPT))
+         HandleTerminateBackendInterrupt();
+
      latch_sigusr1_handler();

      errno = save_errno;
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***************
*** 184,189 **** static bool RecoveryConflictPending = false;
--- 184,195 ----
  static bool RecoveryConflictRetryable = true;
  static ProcSignalReason RecoveryConflictReason;

+ /*
+  * True if backend is being killed by pg_terminate_backend().
+  * Set by HandleTerminateBackendInterrupt() upon received SIGUSR1.
+  */
+ static bool TerminateBackendRequest = false;
+
  /* ----------------------------------------------------------------
   * decls for routines only used in this file
   * ----------------------------------------------------------------
***************
*** 2875,2880 **** RecoveryConflictInterrupt(ProcSignalReason reason)
--- 2881,2924 ----
  }

  /*
+  * HandleTerminateBackendInterrupt: out-of-line portion of terminate backend
+  * handling following receipt of SIGUSR1. Designed to be similar to die().
+  * Called only by a normal user backend.
+  */
+ void
+ HandleTerminateBackendInterrupt(void)
+ {
+     int save_errno = errno;
+
+     /* Don't joggle the elbow of proc_exit */
+     if (!proc_exit_inprogress)
+     {
+         InterruptPending = true;
+         ProcDiePending = true;
+         TerminateBackendRequest = true;
+
+         /*
+          * If it's safe to interrupt, and we're waiting for input or a lock,
+          * service the interrupt immediately
+          */
+         if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
+             CritSectionCount == 0)
+         {
+             /* bump holdoff count to make ProcessInterrupts() a no-op */
+             /* until we are done getting ready for it */
+             InterruptHoldoffCount++;
+             LockWaitCancel(); /* prevent CheckDeadLock from running */
+             DisableNotifyInterrupt();
+             DisableCatchupInterrupt();
+             InterruptHoldoffCount--;
+             ProcessInterrupts();
+         }
+     }
+
+     errno = save_errno;
+ }
+
+ /*
   * ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
   *
   * If an interrupt condition is pending, and it's safe to service it,
***************
*** 2912,2917 **** ProcessInterrupts(void)
--- 2956,2966 ----
                      (errcode(ERRCODE_ADMIN_SHUTDOWN),
                       errmsg("terminating connection due to conflict with recovery"),
                       errdetail_recovery_conflict()));
+         else if (TerminateBackendRequest)
+             ereport(FATAL,
+                     (errcode(ERRCODE_TERMINATE_BACKEND),
+                      errmsg("terminating connection due to pg_terminate_backend")));
+
          else
              ereport(FATAL,
                      (errcode(ERRCODE_ADMIN_SHUTDOWN),
*** a/src/backend/utils/adt/misc.c
--- b/src/backend/utils/adt/misc.c
***************
*** 114,120 **** pg_cancel_backend(PG_FUNCTION_ARGS)
  Datum
  pg_terminate_backend(PG_FUNCTION_ARGS)
  {
!     PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM));
  }

  Datum
--- 114,122 ----
  Datum
  pg_terminate_backend(PG_FUNCTION_ARGS)
  {
!     PG_RETURN_BOOL(
!         SendProcSignal(PG_GETARG_INT32(0), PROCSIG_TERMNINATE_BACKEND_INTERRUPT,
!                        InvalidBackendId) == 0);
  }

  Datum
*** a/src/include/storage/procsignal.h
--- b/src/include/storage/procsignal.h
***************
*** 40,45 **** typedef enum
--- 40,47 ----
      PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
      PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,

+     PROCSIG_TERMNINATE_BACKEND_INTERRUPT, /* terminate request from pg_terminate_backend() */
+
      NUM_PROCSIGNALS /* Must be last! */
  } ProcSignalReason;

***************
*** 55,58 **** extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
--- 57,62 ----

  extern void procsignal_sigusr1_handler(SIGNAL_ARGS);

+ extern void HandleTerminateBackendInterrupt(void);
+
  #endif /* PROCSIGNAL_H */
*** a/src/include/utils/errcodes.h
--- b/src/include/utils/errcodes.h
***************
*** 332,337 ****
--- 332,338 ----
  #define ERRCODE_ADMIN_SHUTDOWN MAKE_SQLSTATE('5','7', 'P','0','1')
  #define ERRCODE_CRASH_SHUTDOWN MAKE_SQLSTATE('5','7', 'P','0','2')
  #define ERRCODE_CANNOT_CONNECT_NOW MAKE_SQLSTATE('5','7', 'P','0','3')
+ #define ERRCODE_TERMINATE_BACKEND MAKE_SQLSTATE('5','7', 'P','0','4')

  /* Class 58 - System Error (class borrowed from DB2) */
  /* (we define this as errors external to PostgreSQL itself) */
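
The mechanism described in points 1) and 2) of the message, posting a reason and then waking the target with SIGUSR1, might be sketched as below. This is a standalone POSIX C approximation and not the procsignal.c code: Reason, pending, post_reason_to_self() and the other names are hypothetical, the flags are ordinary process memory rather than shared memory, and raise() stands in for signalling another backend's PID.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical reasons, loosely modelled on ProcSignalReason. */
typedef enum
{
    REASON_CATCHUP,
    REASON_TERMINATE_BACKEND,
    NUM_REASONS                 /* must be last */
} Reason;

/* One pending flag per reason (shared memory in the real server). */
static volatile sig_atomic_t pending[NUM_REASONS];

/* Set by the handler; a backend would test it at its next safe point. */
static volatile sig_atomic_t terminate_requested = 0;

/* SIGUSR1 handler: work out which reason was posted and record it. */
static void
sigusr1_handler(int signo)
{
    /* Re-arm, in case the platform resets the handler on delivery. */
    signal(signo, sigusr1_handler);

    if (pending[REASON_TERMINATE_BACKEND])
    {
        pending[REASON_TERMINATE_BACKEND] = 0;
        terminate_requested = 1;
    }
}

/* Receiver side: install the multiplexed handler once at startup. */
void
install_sigusr1_handler(void)
{
    signal(SIGUSR1, sigusr1_handler);
}

/* Sender side: post the reason first, then deliver the wake-up signal. */
int
post_reason_to_self(Reason reason)
{
    pending[reason] = 1;
    return raise(SIGUSR1);      /* a real sender would signal another PID */
}

/* Safe-point check: did a terminate request arrive via SIGUSR1? */
bool
terminate_was_requested(void)
{
    return terminate_requested != 0;
}
```

Because the reason is carried separately from the signal itself, the receiver can tell a terminate request apart from every other use of SIGUSR1, which is the property the patch relies on to pick a distinct error code.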
Tatsuo Ishii <ishii@postgresql.org> writes:
> Comments are welcome.

This is a bad idea. It makes an already-poorly-tested code path
significantly more fragile, in return for nothing of value.
regards, tom lane
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> Comments are welcome.
>
> This is a bad idea. It makes an already-poorly-tested code path
> significantly more fragile, in return for nothing of value.

Are you saying that procsignal.c is the already-poorly-tested one? If
so, why?
As for "value", I have already explained upthread why we need
this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
> Here is the patch to implement the feature.
> 1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to
> the target backend.
> 2) The infrastructure used for message passing is
> storage/ipc/procsignal.c. The new message type for ProcSignalReason
> is "PROCSIG_TERMNINATE_BACKEND_INTERRUPT".
> 3) I assign new error code 57P04 which is returned from the backend
> killed by pg_terminate_backend().
> #define ERRCODE_TERMINATE_BACKEND MAKE_SQLSTATE('5','7', 'P','0','4')
> Comments are welcome.

Does anyone have a better idea? Tom dislikes my patch but I don't know how to
deal with it.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> Here is the patch to implement the feature.
>> 1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to
>> the target backend.
>> 2) The infrastructure used for message passing is
>> storage/ipc/procsignal.c. The new message type for ProcSignalReason
>> is "PROCSIG_TERMNINATE_BACKEND_INTERRUPT".
>> 3) I assign new error code 57P04 which is returned from the backend
>> killed by pg_terminate_backend().
>> #define ERRCODE_TERMINATE_BACKEND MAKE_SQLSTATE('5','7', 'P','0','4')
>
> Does anyone have a better idea? Tom dislikes my patch but I don't know how to
> deal with it.

There was another design in the past discussion:
> One idea is postmaster sets a flag in the shared memory area
> indicating it received SIGTERM before forwarding the signal to
> backends.

Is it enough for your purpose, and do we think it is a more robust way?
--
Itagaki Takahiro
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> Does anyone have a better idea? Tom dislikes my patch but I don't know how to
>> deal with it.
>
> There was another design in the past discussion:
>> One idea is postmaster sets a flag in the shared memory area
>> indicating it received SIGTERM before forwarding the signal to
>> backends.
>
> Is it enough for your purpose, and do we think it is a more robust way?

To put this as briefly as possible: I don't want to add even one line of
code to distinguish pg_terminate_backend from database-wide shutdown.
That function should be a last-ditch tool, not something used on a daily
basis. So I disagree with the premise as much as with any particular
implementation.
regards, tom lane
On Fri, Jan 21, 2011 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
>> On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> Does anyone have a better idea? Tom dislikes my patch but I don't know how to
>>> deal with it.
>>
>> There was another design in the past discussion:
>>> One idea is postmaster sets a flag in the shared memory area
>>> indicating it received SIGTERM before forwarding the signal to
>>> backends.
>>
>> Is it enough for your purpose, and do we think it is a more robust way?
>
> To put this as briefly as possible: I don't want to add even one line of
> code to distinguish pg_terminate_backend from database-wide shutdown.
> That function should be a last-ditch tool, not something used on a daily
> basis. So I disagree with the premise as much as with any particular
> implementation.

Well, that seems awfully unfriendly.
Frequency of use is beside the point - people are trying to write
client applications - like pgpool-II - that understand the behavior of
PG. If we send the same error code in two different situations with
different behaviors, such applications have to do silly workarounds
to figure out what really happened.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
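
To make the client-side point concrete, here is a rough sketch of how a pooler such as pgpool-II could act if the two situations carried different SQLSTATEs. PoolAction and action_for_sqlstate() are hypothetical names, not pgpool-II code, and 57P04 is only the value proposed in this thread; released PostgreSQL versions may assign that code differently, so treat it as a placeholder.

```c
#include <string.h>

/* Hypothetical decisions a connection pooler might take on a FATAL error. */
typedef enum
{
    ACTION_FAILOVER,    /* server is shutting down: stop using this node */
    ACTION_RECONNECT,   /* only this session died: open a new connection */
    ACTION_REPORT       /* anything else: pass the error on to the client */
} PoolAction;

/* Map the five-character SQLSTATE of the error to a pooler action. */
PoolAction
action_for_sqlstate(const char *sqlstate)
{
    if (sqlstate == NULL)
        return ACTION_REPORT;
    if (strcmp(sqlstate, "57P01") == 0)     /* admin shutdown */
        return ACTION_FAILOVER;
    if (strcmp(sqlstate, "57P04") == 0)     /* code proposed in this thread */
        return ACTION_RECONNECT;
    return ACTION_REPORT;
}
```

With a single shared code, the second branch cannot exist, and the pooler has to guess (or probe the server) to tell a terminated session from a shutdown.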