PostgreSQL shutdown modes

Started by Robert Haas almost 4 years ago, 7 messages
#1 Robert Haas
robertmhaas@gmail.com
1 attachment(s)

Hi,

I think it's pretty evident that the names we've chosen for the
various PostgreSQL shutdown modes are pretty terrible, and maybe we
should try to do something about that. There is nothing "smart" about
a smart shutdown. The usual result of attempting a smart shutdown is
that the server never shuts down at all, because typically there are
going to be some applications using connections that are kept open
more or less permanently. What ends up happening when you attempt a
"smart" shutdown is that you've basically put the server into a mode
where you're irreversibly committed to accepting no new connections,
but because you have a connection pooler or something that keeps
connections open forever, you never shut down either. It is in effect
a denial-of-service attack on the database you're supposed to be
administering.

Similarly, "fast" shutdowns are not in any way fast. It is pretty
common for a fast shutdown to take many minutes or even tens of
minutes to complete. This doesn't require some kind of extreme
workload to hit; I've run into it during casual benchmarking runs.
It's very easy to have enough dirty data in shared buffers, or enough
dirty data in the operating system cache that will have to be fsync'd in
order to complete the shutdown checkpoint, to make things take an
extremely long time. In some ways, this is an even more effective
denial-of-service attack than a smart shutdown. True, the database
will at some point actually finish shutting down, but in the meantime
not only will we not accept new connections but we'll evict all of the
existing ones. Good luck maintaining five nines of availability if
waiting for a clean shutdown to complete is any part of the process.
It might be smarter to initiate a regular (non-shutdown) checkpoint
first, without cutting off connections, and then when that finishes,
proceed as we do now. The second checkpoint will complete a lot
faster, so while the overall operation still won't be fast, at least
we'd be refusing connections for a shorter period of time before the
system is actually shut down and you can do whatever maintenance you
need to do.
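
The checkpoint-first idea in the paragraph above can be approximated
from the outside with existing tools. What follows is only a minimal
sketch, not something the attached patch implements. It assumes psql
and pg_ctl are on PATH, that PGDATA points at the data directory, and
that the invoking user is allowed to run CHECKPOINT. The helper names
run and two_phase_stop, and the DRY_RUN variable, are made up for
illustration.

```shell
#!/bin/sh
# Sketch: checkpoint while still accepting connections, then request a
# fast shutdown. The names run, two_phase_stop and DRY_RUN are
# hypothetical; psql, pg_ctl and PGDATA are assumed to be available.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

two_phase_stop() {
    # Phase 1: ordinary checkpoint. Clients stay connected while the
    # bulk of the dirty data is written out.
    run psql -X -c "CHECKPOINT" postgres || return 1
    # Phase 2: fast shutdown. Its shutdown checkpoint should now have
    # much less left to write.
    run pg_ctl stop -D "$PGDATA" -m fast
}
```

Setting DRY_RUN=1 before calling two_phase_stop prints the two commands
instead of running them. Data dirtied between the two phases still has
to be flushed by the shutdown checkpoint, so this shortens the window
during which connections are refused rather than eliminating it.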

"immediate" shutdowns aren't as bad as the other two, but they're
still bad. One of the big problems that I encounter in this area is
that Oracle uses the name "immediate" shutdown to mean a normal
shutdown with a checkpoint allowing for a clean restart. Users coming
from Oracle are sometimes extremely surprised to discover that an
immediate shutdown is actually a server crash that will require
recovery. Even if you don't come from Oracle, there's really nothing
about the name of this shutdown mode that intrinsically makes you
understand that it's something you should do only as a last resort.
Who doesn't like things that are immediate? The problem with this
theory is that you make the shutdown quicker at the price of startup
becoming much, much slower, because the crash recovery is very likely
going to take a whole lot longer than the shutdown checkpoint would
have done.
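
For reference, each of the three modes discussed above is requested by
sending a different signal to the postmaster; set_mode() in pg_ctl.c,
visible in the attached patch, performs this mapping. A small sketch of
the same mapping under the current mode names follows. The function
name mode_to_signal is made up for illustration and is not part of
pg_ctl.

```shell
#!/bin/sh
# Sketch: map a current pg_ctl shutdown mode (or its first letter) to
# the signal that requests it. mode_to_signal is a hypothetical helper.
mode_to_signal() {
    case "$1" in
        s|smart)     echo TERM ;;  # wait for clients to disconnect
        f|fast)      echo INT ;;   # disconnect clients, shutdown checkpoint
        i|immediate) echo QUIT ;;  # no checkpoint, recovery on restart
        *)           echo "unknown shutdown mode: $1" >&2; return 1 ;;
    esac
}

# Example (not executed here): request a fast shutdown by hand,
# assuming PGDATA is set and the server is running.
#   kill -"$(mode_to_signal fast)" "$(head -1 "$PGDATA/postmaster.pid")"
```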

I attach herewith a modest patch to rename these shutdown modes to
more accurately correspond to their actual characteristics.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v1-0001-Give-our-various-shutdown-types-more-appropriate-.patch (application/octet-stream)
From 8a851977db170d36fff79d694b68aa9411c6644f Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 1 Apr 2022 12:50:05 -0400
Subject: [PATCH v1] Give our various shutdown types more appropriate names.

---
 contrib/auto_explain/t/001_auto_explain.pl    |  2 +-
 doc/src/sgml/config.sgml                      |  4 +-
 doc/src/sgml/high-availability.sgml           |  2 +-
 doc/src/sgml/monitoring.sgml                  |  2 +-
 doc/src/sgml/ref/pg_ctl-ref.sgml              |  8 +--
 doc/src/sgml/runtime.sgml                     |  2 +-
 src/backend/postmaster/bgworker.c             |  2 +-
 src/backend/postmaster/postmaster.c           | 64 +++++++++----------
 src/bin/pg_ctl/pg_ctl.c                       | 40 ++++++------
 src/bin/pg_dump/t/002_pg_dump.pl              |  2 +-
 src/bin/pg_rewind/t/007_standby_source.pl     |  2 +-
 src/bin/pg_rewind/t/008_min_recovery_point.pl |  6 +-
 src/bin/pg_rewind/t/RewindTest.pm             |  2 +-
 src/bin/pg_upgrade/server.c                   |  2 +-
 src/test/modules/commit_ts/t/001_base.pl      |  2 +-
 src/test/modules/commit_ts/t/004_restart.pl   | 10 +--
 .../libpq_pipeline/t/001_libpq_pipeline.pl    |  2 +-
 .../ssl_passphrase_callback/t/001_testfunc.pl |  6 +-
 .../test_misc/t/001_constraint_validation.pl  |  2 +-
 src/test/modules/test_pg_dump/t/001_base.pl   |  2 +-
 src/test/perl/PostgreSQL/Test/Cluster.pm      |  8 +--
 src/test/perl/README                          |  2 +-
 .../t/010_logical_decoding_timelines.pl       |  2 +-
 src/test/recovery/t/011_crash_recovery.pl     |  2 +-
 src/test/recovery/t/014_unlogged_reinit.pl    |  2 +-
 src/test/recovery/t/015_promotion_pages.pl    |  2 +-
 src/test/recovery/t/016_min_consistency.pl    |  4 +-
 src/test/recovery/t/017_shm.pl                |  2 +-
 src/test/recovery/t/018_wal_optimize.pl       | 38 +++++------
 src/test/recovery/t/019_replslot_limit.pl     |  8 +--
 src/test/recovery/t/020_archive_status.pl     |  4 +-
 src/test/recovery/t/023_pitr_prepared_xact.pl |  2 +-
 .../recovery/t/026_overwrite_contrecord.pl    |  4 +-
 src/test/recovery/t/028_pitr_timelines.pl     |  2 +-
 src/test/subscription/t/001_rep_changes.pl    |  6 +-
 src/test/subscription/t/002_types.pl          |  4 +-
 src/test/subscription/t/003_constraints.pl    |  4 +-
 src/test/subscription/t/004_sync.pl           |  4 +-
 src/test/subscription/t/014_binary.pl         |  4 +-
 src/test/subscription/t/020_messages.pl       |  4 +-
 src/test/subscription/t/021_twophase.pl       | 16 ++---
 .../subscription/t/022_twophase_cascade.pl    |  6 +-
 .../subscription/t/023_twophase_stream.pl     |  8 +--
 src/test/subscription/t/024_add_drop_pub.pl   |  4 +-
 .../t/025_rep_changes_for_schema.pl           |  4 +-
 src/test/subscription/t/026_stats.pl          |  4 +-
 src/test/subscription/t/028_row_filter.pl     |  4 +-
 src/test/subscription/t/031_column_list.pl    |  4 +-
 src/test/subscription/t/100_bugs.pl           | 16 ++---
 49 files changed, 169 insertions(+), 169 deletions(-)

diff --git a/contrib/auto_explain/t/001_auto_explain.pl b/contrib/auto_explain/t/001_auto_explain.pl
index 82e4d9d15c..99ac817e68 100644
--- a/contrib/auto_explain/t/001_auto_explain.pl
+++ b/contrib/auto_explain/t/001_auto_explain.pl
@@ -28,7 +28,7 @@ $node->safe_psql("postgres", "SELECT * FROM pg_proc;");
 $node->safe_psql("postgres",
 	"SELECT * FROM pg_class WHERE relname = 'pg_class';");
 
-$node->stop('fast');
+$node->stop('slow');
 
 my $log = $node->logfile();
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 43e4ade83e..dec01f4ed7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2753,7 +2753,7 @@ include_dir 'conf.d'
         data to support WAL archiving and replication, including running
         read-only queries on a standby server. <literal>minimal</literal> removes all
         logging except the information required to recover from a crash or
-        immediate shutdown.  Finally,
+        a crappy shutdown.  Finally,
         <literal>logical</literal> adds information necessary to support logical
         decoding.  Each level includes the information logged at all lower
         levels.  This parameter can only be set at server start.
@@ -4032,7 +4032,7 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
        <para>
         Note that because <filename>recovery.signal</filename> will not be
         removed when <varname>recovery_target_action</varname> is set to <literal>shutdown</literal>,
-        any subsequent start will end with immediate shutdown unless the
+        any subsequent start will end with a crappy shutdown unless the
         configuration is changed or the <filename>recovery.signal</filename>
         file is removed manually.
        </para>
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 81fa26f985..1c697f34c8 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1176,7 +1176,7 @@ primary_slot_name = 'node_a_slot'
    </para>
 
    <para>
-    Users will stop waiting if a fast shutdown is requested.  However, as
+    Users will stop waiting if a slow shutdown is requested.  However, as
     when using asynchronous replication, the server will not fully
     shutdown until all outstanding WAL records are transferred to the currently
     connected standby servers.
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3b9172f65b..d8278d07a2 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -211,7 +211,7 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
    When the server shuts down cleanly, a permanent copy of the statistics
    data is stored in the <filename>pg_stat</filename> subdirectory, so that
    statistics can be retained across server restarts.  When recovery is
-   performed at server start (e.g., after immediate shutdown, server crash,
+   performed at server start (e.g., after crappy shutdown, server crash,
    and point-in-time recovery), all statistics counters are reset.
   </para>
 
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 3946fa52ea..bbd1ca10fc 100644
--- a/doc/src/sgml/ref/pg_ctl-ref.sgml
+++ b/doc/src/sgml/ref/pg_ctl-ref.sgml
@@ -311,9 +311,9 @@ PostgreSQL documentation
       <listitem>
        <para>
         Specifies the shutdown mode.  <replaceable>mode</replaceable>
-        can be <literal>smart</literal>, <literal>fast</literal>, or
-        <literal>immediate</literal>, or the first letter of one of
-        these three.  If this option is omitted, <literal>fast</literal> is
+        can be <literal>dumb</literal>, <literal>slow</literal>, or
+        <literal>crappy</literal>, or the first letter of one of
+        these three.  If this option is omitted, <literal>slow</literal> is
         the default.
        </para>
       </listitem>
@@ -657,7 +657,7 @@ PostgreSQL documentation
     The <option>-m</option> option allows control over
     <emphasis>how</emphasis> the server shuts down:
 <screen>
-<prompt>$</prompt> <userinput>pg_ctl stop -m smart</userinput>
+<prompt>$</prompt> <userinput>pg_ctl stop -m dumb</userinput>
 </screen></para>
   </refsect2>
 
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 1f021ea116..65afffdfeb 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1556,7 +1556,7 @@ $ <userinput>cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages</userinp
        until online backup mode is no longer active.  While backup mode is
        active, new connections will still be allowed, but only to superusers
        (this exception allows a superuser to connect to terminate
-       online backup mode).  If the server is in recovery when a smart
+       online backup mode).  If the server is in recovery when a dumb
        shutdown is requested, recovery and streaming replication will be
        stopped only after all regular sessions have terminated.
       </para>
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 30682b63b3..768fb2d571 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -528,7 +528,7 @@ BackgroundWorkerStopNotifications(pid_t pid)
 /*
  * Cancel any not-yet-started worker requests that have waiting processes.
  *
- * This is called during a normal ("smart" or "fast") database shutdown.
+ * This is called during a normal ("dumb" or "slow") database shutdown.
  * After this point, no new background workers will be started, so anything
  * that might be waiting for them needs to be kicked off its wait.  We do
  * that by canceling the bgworker registration entirely, which is perhaps
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 80bb269599..c0b417260f 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -271,9 +271,9 @@ static StartupStatusEnum StartupStatus = STARTUP_NOT_RUNNING;
 
 /* Startup/shutdown state */
 #define			NoShutdown		0
-#define			SmartShutdown	1
-#define			FastShutdown	2
-#define			ImmediateShutdown	3
+#define			DumbShutdown	1
+#define			SlowShutdown	2
+#define			CrappyShutdown	3
 
 static int	Shutdown = NoShutdown;
 
@@ -340,7 +340,7 @@ typedef enum
 static PMState pmState = PM_INIT;
 
 /*
- * While performing a "smart shutdown", we restrict new connections but stay
+ * While performing a dumb shutdown, we restrict new connections but stay
  * in PM_RUN or PM_HOT_STANDBY state until all the client backends are gone.
  * connsAllowed is a sub-state indicator showing the active restriction.
  * It is of no interest unless pmState is PM_RUN or PM_HOT_STANDBY.
@@ -1904,7 +1904,7 @@ ServerLoop(void)
 		 *
 		 * Note we also do this during recovery from a process crash.
 		 */
-		if ((Shutdown >= ImmediateShutdown || (FatalError && !SendStop)) &&
+		if ((Shutdown >= CrappyShutdown || (FatalError && !SendStop)) &&
 			AbortStartTime != 0 &&
 			(now - AbortStartTime) >= SIGKILL_CHILDREN_AFTER_SECS)
 		{
@@ -1931,7 +1931,7 @@ ServerLoop(void)
 			if (!RecheckDataDirLockFile())
 			{
 				ereport(LOG,
-						(errmsg("performing immediate shutdown because data directory lock file is invalid")));
+						(errmsg("performing crappy shutdown because data directory lock file is invalid")));
 				kill(MyProcPid, SIGQUIT);
 			}
 			last_lockfile_recheck_time = now;
@@ -2769,7 +2769,7 @@ SIGHUP_handler(SIGNAL_ARGS)
 	PG_SETMASK(&BlockSig);
 #endif
 
-	if (Shutdown <= SmartShutdown)
+	if (Shutdown <= DumbShutdown)
 	{
 		ereport(LOG,
 				(errmsg("received SIGHUP, reloading configuration files")));
@@ -2864,11 +2864,11 @@ pmdie(SIGNAL_ARGS)
 			 *
 			 * Wait for children to end their work, then shut down.
 			 */
-			if (Shutdown >= SmartShutdown)
+			if (Shutdown >= DumbShutdown)
 				break;
-			Shutdown = SmartShutdown;
+			Shutdown = DumbShutdown;
 			ereport(LOG,
-					(errmsg("received smart shutdown request")));
+					(errmsg("received dumb shutdown request")));
 
 			/* Report status */
 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
@@ -2905,16 +2905,16 @@ pmdie(SIGNAL_ARGS)
 		case SIGINT:
 
 			/*
-			 * Fast Shutdown:
+			 * Slow Shutdown:
 			 *
 			 * Abort all children with SIGTERM (rollback active transactions
 			 * and exit) and shut down when they are gone.
 			 */
-			if (Shutdown >= FastShutdown)
+			if (Shutdown >= SlowShutdown)
 				break;
-			Shutdown = FastShutdown;
+			Shutdown = SlowShutdown;
 			ereport(LOG,
-					(errmsg("received fast shutdown request")));
+					(errmsg("received slow shutdown request")));
 
 			/* Report status */
 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
@@ -2946,17 +2946,17 @@ pmdie(SIGNAL_ARGS)
 		case SIGQUIT:
 
 			/*
-			 * Immediate Shutdown:
+			 * Crappy Shutdown:
 			 *
 			 * abort all children with SIGQUIT, wait for them to exit,
 			 * terminate remaining ones with SIGKILL, then exit without
 			 * attempt to properly shut down the data base system.
 			 */
-			if (Shutdown >= ImmediateShutdown)
+			if (Shutdown >= CrappyShutdown)
 				break;
-			Shutdown = ImmediateShutdown;
+			Shutdown = CrappyShutdown;
 			ereport(LOG,
-					(errmsg("received immediate shutdown request")));
+					(errmsg("received crappy shutdown request")));
 
 			/* Report status */
 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
@@ -3035,7 +3035,7 @@ reaper(SIGNAL_ARGS)
 				ereport(LOG,
 						(errmsg("shutdown at recovery target")));
 				StartupStatus = STARTUP_NOT_RUNNING;
-				Shutdown = Max(Shutdown, SmartShutdown);
+				Shutdown = Max(Shutdown, DumbShutdown);
 				TerminateChildren(SIGTERM);
 				pmState = PM_WAIT_BACKENDS;
 				/* PostmasterStateMachine logic does the rest */
@@ -3528,12 +3528,12 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
 
 	/*
 	 * We only log messages and send signals if this is the first process
-	 * crash and we're not doing an immediate shutdown; otherwise, we're only
+	 * crash and we're not doing a crappy shutdown; otherwise, we're only
 	 * here to update postmaster's idea of live processes.  If we have already
 	 * signaled children, nonzero exit status is to be expected, so don't
 	 * clutter log.
 	 */
-	take_action = !FatalError && Shutdown != ImmediateShutdown;
+	take_action = !FatalError && Shutdown != CrappyShutdown;
 
 	if (take_action)
 	{
@@ -3749,7 +3749,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
 
 	/* We do NOT restart the syslogger */
 
-	if (Shutdown != ImmediateShutdown)
+	if (Shutdown != CrappyShutdown)
 		FatalError = true;
 
 	/* We now transit into a state of waiting for children to die */
@@ -3839,7 +3839,7 @@ LogChildExit(int lev, const char *procname, int pid, int exitstatus)
 static void
 PostmasterStateMachine(void)
 {
-	/* If we're doing a smart shutdown, try to advance that state. */
+	/* If we're doing a dumb shutdown, try to advance that state. */
 	if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
 	{
 		if (connsAllowed == ALLOW_SUPERUSER_CONNS)
@@ -3910,7 +3910,7 @@ PostmasterStateMachine(void)
 		 * PM_WAIT_BACKENDS state ends when we have no regular backends
 		 * (including autovac workers), no bgworkers (including unconnected
 		 * ones), and no walwriter, autovac launcher or bgwriter.  If we are
-		 * doing crash recovery or an immediate shutdown then we expect the
+		 * doing crash recovery or a crappy shutdown then we expect the
 		 * checkpointer to exit as well, otherwise not. The stats and
 		 * syslogger processes are disregarded since they are not connected to
 		 * shared memory; we also disregard dead_end children here. Walsenders
@@ -3922,11 +3922,11 @@ PostmasterStateMachine(void)
 			WalReceiverPID == 0 &&
 			BgWriterPID == 0 &&
 			(CheckpointerPID == 0 ||
-			 (!FatalError && Shutdown < ImmediateShutdown)) &&
+			 (!FatalError && Shutdown < CrappyShutdown)) &&
 			WalWriterPID == 0 &&
 			AutoVacPID == 0)
 		{
-			if (Shutdown >= ImmediateShutdown || FatalError)
+			if (Shutdown >= CrappyShutdown || FatalError)
 			{
 				/*
 				 * Start waiting for dead_end children to die.  This state
@@ -3936,7 +3936,7 @@ PostmasterStateMachine(void)
 
 				/*
 				 * We already SIGQUIT'd the archiver and stats processes, if
-				 * any, when we started immediate shutdown or entered
+				 * any, when we started a crappy shutdown or entered
 				 * FatalError state.
 				 */
 			}
@@ -4046,7 +4046,7 @@ PostmasterStateMachine(void)
 		{
 			/*
 			 * Terminate exclusive backup mode to avoid recovery after a clean
-			 * fast shutdown.  Since an exclusive backup can only be taken
+			 * slow shutdown.  Since an exclusive backup can only be taken
 			 * during normal running (and not, for example, while running
 			 * under Hot Standby) it only makes sense to do this if we reached
 			 * normal running. If we're still in recovery, the backup file is
@@ -4443,7 +4443,7 @@ BackendInitialize(Port *port)
 	/*
 	 * We arrange to do _exit(1) if we receive SIGTERM or timeout while trying
 	 * to collect the startup packet; while SIGQUIT results in _exit(2).
-	 * Otherwise the postmaster cannot shutdown the database FAST or IMMED
+	 * Otherwise the postmaster cannot shutdown the database SLOW or CRAPPY
 	 * cleanly if a buggy client fails to send the packet promptly.
 	 *
 	 * Exiting with _exit(1) is only possible because we have not yet touched
@@ -5319,7 +5319,7 @@ sigusr1_handler(SIGNAL_ARGS)
 	}
 
 	if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER) &&
-		Shutdown <= SmartShutdown && pmState < PM_STOP_BACKENDS)
+		Shutdown <= DumbShutdown && pmState < PM_STOP_BACKENDS)
 	{
 		/*
 		 * Start one iteration of the autovacuum daemon, even if autovacuuming
@@ -5334,7 +5334,7 @@ sigusr1_handler(SIGNAL_ARGS)
 	}
 
 	if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER) &&
-		Shutdown <= SmartShutdown && pmState < PM_STOP_BACKENDS)
+		Shutdown <= DumbShutdown && pmState < PM_STOP_BACKENDS)
 	{
 		/* The autovacuum launcher wants us to start a worker process. */
 		StartAutovacuumWorker();
@@ -5691,7 +5691,7 @@ MaybeStartWalReceiver(void)
 	if (WalReceiverPID == 0 &&
 		(pmState == PM_STARTUP || pmState == PM_RECOVERY ||
 		 pmState == PM_HOT_STANDBY) &&
-		Shutdown <= SmartShutdown)
+		Shutdown <= DumbShutdown)
 	{
 		WalReceiverPID = StartWalReceiver();
 		if (WalReceiverPID != 0)
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 3c182c97d4..413bfeb45d 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -41,9 +41,9 @@ typedef long pgpid_t;
 
 typedef enum
 {
-	SMART_MODE,
-	FAST_MODE,
-	IMMEDIATE_MODE
+	DUMB_MODE,
+	SLOW_MODE,
+	CRAPPY_MODE
 } ShutdownMode;
 
 typedef enum
@@ -80,7 +80,7 @@ static bool do_wait = true;
 static int	wait_seconds = DEFAULT_WAIT;
 static bool wait_seconds_arg = false;
 static bool silent_mode = false;
-static ShutdownMode shutdown_mode = FAST_MODE;
+static ShutdownMode shutdown_mode = SLOW_MODE;
 static int	sig = SIGINT;		/* default */
 static CtlCommand ctl_command = NO_COMMAND;
 static char *pg_data = NULL;
@@ -1060,11 +1060,11 @@ do_stop(void)
 	{
 		/*
 		 * If backup_label exists, an online backup is running. Warn the user
-		 * that smart shutdown will wait for it to finish. However, if the
+		 * that dumb shutdown will wait for it to finish. However, if the
 		 * server is in archive recovery, we're recovering from an online
 		 * backup instead of performing one.
 		 */
-		if (shutdown_mode == SMART_MODE &&
+		if (shutdown_mode == DUMB_MODE &&
 			stat(backup_file, &statbuf) == 0 &&
 			get_control_dbstate() != DB_IN_ARCHIVE_RECOVERY)
 		{
@@ -1079,7 +1079,7 @@ do_stop(void)
 			print_msg(_(" failed\n"));
 
 			write_stderr(_("%s: server does not shut down\n"), progname);
-			if (shutdown_mode == SMART_MODE)
+			if (shutdown_mode == DUMB_MODE)
 				write_stderr(_("HINT: The \"-m fast\" option immediately disconnects sessions rather than\n"
 							   "waiting for session-initiated disconnection.\n"));
 			exit(1);
@@ -1136,11 +1136,11 @@ do_restart(void)
 
 		/*
 		 * If backup_label exists, an online backup is running. Warn the user
-		 * that smart shutdown will wait for it to finish. However, if the
+		 * that dumb shutdown will wait for it to finish. However, if the
 		 * server is in archive recovery, we're recovering from an online
 		 * backup instead of performing one.
 		 */
-		if (shutdown_mode == SMART_MODE &&
+		if (shutdown_mode == DUMB_MODE &&
 			stat(backup_file, &statbuf) == 0 &&
 			get_control_dbstate() != DB_IN_ARCHIVE_RECOVERY)
 		{
@@ -1156,7 +1156,7 @@ do_restart(void)
 			print_msg(_(" failed\n"));
 
 			write_stderr(_("%s: server does not shut down\n"), progname);
-			if (shutdown_mode == SMART_MODE)
+			if (shutdown_mode == DUMB_MODE)
 				write_stderr(_("HINT: The \"-m fast\" option immediately disconnects sessions rather than\n"
 							   "waiting for session-initiated disconnection.\n"));
 			exit(1);
@@ -2135,12 +2135,12 @@ do_help(void)
 			 "                         (PostgreSQL server executable) or initdb\n"));
 	printf(_("  -p PATH-TO-POSTGRES    normally not necessary\n"));
 	printf(_("\nOptions for stop or restart:\n"));
-	printf(_("  -m, --mode=MODE        MODE can be \"smart\", \"fast\", or \"immediate\"\n"));
+	printf(_("  -m, --mode=MODE        MODE can be \"dumb\", \"slow\", or \"crappy\"\n"));
 
 	printf(_("\nShutdown modes are:\n"));
-	printf(_("  smart       quit after all clients have disconnected\n"));
-	printf(_("  fast        quit directly, with proper shutdown (default)\n"));
-	printf(_("  immediate   quit without complete shutdown; will lead to recovery on restart\n"));
+	printf(_("  dumb        quit after all clients have disconnected, if you're lucky\n"));
+	printf(_("  slow        quit eventually, with proper shutdown (default)\n"));
+	printf(_("  crappy      quit without complete shutdown; will lead to recovery on restart\n"));
 
 	printf(_("\nAllowed signal names for kill:\n"));
 	printf("  ABRT HUP INT KILL QUIT TERM USR1 USR2\n");
@@ -2166,19 +2166,19 @@ do_help(void)
 static void
 set_mode(char *modeopt)
 {
-	if (strcmp(modeopt, "s") == 0 || strcmp(modeopt, "smart") == 0)
+	if (strcmp(modeopt, "d") == 0 || strcmp(modeopt, "dumb") == 0)
 	{
-		shutdown_mode = SMART_MODE;
+		shutdown_mode = DUMB_MODE;
 		sig = SIGTERM;
 	}
-	else if (strcmp(modeopt, "f") == 0 || strcmp(modeopt, "fast") == 0)
+	else if (strcmp(modeopt, "s") == 0 || strcmp(modeopt, "slow") == 0)
 	{
-		shutdown_mode = FAST_MODE;
+		shutdown_mode = SLOW_MODE;
 		sig = SIGINT;
 	}
-	else if (strcmp(modeopt, "i") == 0 || strcmp(modeopt, "immediate") == 0)
+	else if (strcmp(modeopt, "c") == 0 || strcmp(modeopt, "crappy") == 0)
 	{
-		shutdown_mode = IMMEDIATE_MODE;
+		shutdown_mode = CRAPPY_MODE;
 		sig = SIGQUIT;
 	}
 	else
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index af5d6fa5a3..8f96355c79 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -4024,6 +4024,6 @@ foreach my $run (sort keys %pgdump_runs)
 #########################################
 # Stop the database instance, which will be removed at the end of the tests.
 
-$node->stop('fast');
+$node->stop('slow');
 
 done_testing();
diff --git a/src/bin/pg_rewind/t/007_standby_source.pl b/src/bin/pg_rewind/t/007_standby_source.pl
index 47320ea5a6..5dd6ba53dd 100644
--- a/src/bin/pg_rewind/t/007_standby_source.pl
+++ b/src/bin/pg_rewind/t/007_standby_source.pl
@@ -107,7 +107,7 @@ $node_c->safe_psql('postgres',
 my $node_c_pgdata = $node_c->data_dir;
 
 # Stop the node and be ready to perform the rewind.
-$node_c->stop('fast');
+$node_c->stop('slow');
 
 # Keep a temporary postgresql.conf or it would be overwritten during the rewind.
 copy(
diff --git a/src/bin/pg_rewind/t/008_min_recovery_point.pl b/src/bin/pg_rewind/t/008_min_recovery_point.pl
index e6a7177fb7..06e1643c22 100644
--- a/src/bin/pg_rewind/t/008_min_recovery_point.pl
+++ b/src/bin/pg_rewind/t/008_min_recovery_point.pl
@@ -74,7 +74,7 @@ $node_1->wait_for_catchup('node_3');
 #
 # Swap the roles of node_1 and node_3, so that node_1 follows node_3.
 #
-$node_1->stop('fast');
+$node_1->stop('slow');
 $node_3->promote;
 # Force a checkpoint after the promotion. pg_rewind looks at the control
 # file to determine what timeline the server is on, and that isn't updated
@@ -138,8 +138,8 @@ $node_2->poll_query_until('postgres',
 
 # At this point node_2 will shut down without a shutdown checkpoint,
 # but with WAL entries beyond the preceding shutdown checkpoint.
-$node_2->stop('fast');
-$node_3->stop('fast');
+$node_2->stop('slow');
+$node_3->stop('slow');
 
 my $node_2_pgdata  = $node_2->data_dir;
 my $node_1_connstr = $node_1->connstr;
diff --git a/src/bin/pg_rewind/t/RewindTest.pm b/src/bin/pg_rewind/t/RewindTest.pm
index 1e34768e27..44de92168f 100644
--- a/src/bin/pg_rewind/t/RewindTest.pm
+++ b/src/bin/pg_rewind/t/RewindTest.pm
@@ -236,7 +236,7 @@ sub run_pg_rewind
 		# Stop the primary and be ready to perform the rewind.  The cluster
 		# needs recovery to finish once, and pg_rewind makes sure that it
 		# happens automatically.
-		$node_primary->stop('immediate');
+		$node_primary->stop('crappy');
 	}
 
 	# At this point, the rewind processing is ready to run.
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 265137e86b..17e8a74621 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -335,7 +335,7 @@ stop_postmaster(bool in_atexit)
 			  "\"%s/pg_ctl\" -w -D \"%s\" -o \"%s\" %s stop",
 			  cluster->bindir, cluster->pgconfig,
 			  cluster->pgopts ? cluster->pgopts : "",
-			  in_atexit ? "-m fast" : "-m smart");
+			  in_atexit ? "-m slow" : "-m dumb");
 
 	os_info.running_cluster = NULL;
 }
diff --git a/src/test/modules/commit_ts/t/001_base.pl b/src/test/modules/commit_ts/t/001_base.pl
index 3f0bb9e858..37ff41443c 100644
--- a/src/test/modules/commit_ts/t/001_base.pl
+++ b/src/test/modules/commit_ts/t/001_base.pl
@@ -27,7 +27,7 @@ my $ts = $node->safe_psql('postgres',
 );
 
 # Verify that we read the same TS after crash recovery
-$node->stop('immediate');
+$node->stop('crappy');
 $node->start;
 
 my $recovered_ts = $node->safe_psql('postgres',
diff --git a/src/test/modules/commit_ts/t/004_restart.pl b/src/test/modules/commit_ts/t/004_restart.pl
index 808164c34d..fb1c2e5fd3 100644
--- a/src/test/modules/commit_ts/t/004_restart.pl
+++ b/src/test/modules/commit_ts/t/004_restart.pl
@@ -57,7 +57,7 @@ my $before_restart_ts = $node_primary->safe_psql('postgres',
 ok($before_restart_ts ne '' && $before_restart_ts ne 'null',
 	'commit timestamp recorded');
 
-$node_primary->stop('immediate');
+$node_primary->stop('crappy');
 $node_primary->start;
 
 my $after_crash_ts = $node_primary->safe_psql('postgres',
@@ -65,7 +65,7 @@ my $after_crash_ts = $node_primary->safe_psql('postgres',
 is($after_crash_ts, $before_restart_ts,
 	'timestamps before and after crash are equal');
 
-$node_primary->stop('fast');
+$node_primary->stop('slow');
 $node_primary->start;
 
 my $after_restart_ts = $node_primary->safe_psql('postgres',
@@ -75,7 +75,7 @@ is($after_restart_ts, $before_restart_ts,
 
 # Now disable commit timestamps
 $node_primary->append_conf('postgresql.conf', 'track_commit_timestamp = off');
-$node_primary->stop('fast');
+$node_primary->stop('slow');
 
 # Start the server, which generates a XLOG_PARAMETER_CHANGE record where
 # the parameter change is registered.
@@ -134,10 +134,10 @@ like(
 # Re-enable, restart and ensure we can still get the old timestamps
 $node_primary->append_conf('postgresql.conf', 'track_commit_timestamp = on');
 
-# An immediate shutdown is used here.  At next startup recovery will
+# A crappy shutdown is used here.  At next startup recovery will
 # replay transactions which committed when track_commit_timestamp was
 # disabled, and the facility should be able to work properly.
-$node_primary->stop('immediate');
+$node_primary->stop('crappy');
 $node_primary->start;
 
 my $after_enable_ts = $node_primary->safe_psql('postgres',
diff --git a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
index cc79d96d47..e30178cecf 100644
--- a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
+++ b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
@@ -57,7 +57,7 @@ for my $testname (@tests)
 	}
 }
 
-$node->stop('fast');
+$node->stop('slow');
 
 done_testing();
 
diff --git a/src/test/modules/ssl_passphrase_callback/t/001_testfunc.pl b/src/test/modules/ssl_passphrase_callback/t/001_testfunc.pl
index 0429861b16..f324fbd91a 100644
--- a/src/test/modules/ssl_passphrase_callback/t/001_testfunc.pl
+++ b/src/test/modules/ssl_passphrase_callback/t/001_testfunc.pl
@@ -40,7 +40,7 @@ $node->start;
 # if the server is running we must have successfully transformed the passphrase
 ok(-e "$ddir/postmaster.pid", "postgres started");
 
-$node->stop('fast');
+$node->stop('slow');
 
 # should get a warning if ssl_passphrase_command is set
 my $log = $node->rotate_logfile();
@@ -50,7 +50,7 @@ $node->append_conf('postgresql.conf',
 
 $node->start;
 
-$node->stop('fast');
+$node->stop('slow');
 
 my $log_contents = slurp_file($log);
 
@@ -72,6 +72,6 @@ ok($ret,                       "pg_ctl fails with bad passphrase");
 ok(!-e "$ddir/postmaster.pid", "postgres not started with bad passphrase");
 
 # just in case
-$node->stop('fast');
+$node->stop('slow');
 
 done_testing();
diff --git a/src/test/modules/test_misc/t/001_constraint_validation.pl b/src/test/modules/test_misc/t/001_constraint_validation.pl
index 3b9fc66b8e..932f84dbd0 100644
--- a/src/test/modules/test_misc/t/001_constraint_validation.pl
+++ b/src/test/modules/test_misc/t/001_constraint_validation.pl
@@ -310,6 +310,6 @@ ok( $output =~
 	'updated partition constraint for default partition quuux_default1');
 run_sql_command('DROP TABLE quuux;');
 
-$node->stop('fast');
+$node->stop('slow');
 
 done_testing();
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index 84a35590b7..fd0e15ccc2 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -807,6 +807,6 @@ foreach my $run (sort keys %pgdump_runs)
 #########################################
 # Stop the database instance, which will be removed at the end of the tests.
 
-$node->stop('fast');
+$node->stop('slow');
 
 done_testing();
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b4ebc99935..4a733744dc 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -67,7 +67,7 @@ PostgreSQL::Test::Cluster - class representing PostgreSQL server instance
   $other_node->start;
 
   # Stop the server
-  $node->stop('fast');
+  $node->stop('slow');
 
   # Find a free, unprivileged TCP port to bind some other service to
   my $port = PostgreSQL::Test::Cluster::get_free_port();
@@ -937,7 +937,7 @@ sub stop
 
 	local %ENV = $self->_get_env();
 
-	$mode = 'fast' unless defined $mode;
+	$mode = 'slow' unless defined $mode;
 	return 1 unless defined $self->{_pid};
 
 	print "### Stopping node \"$name\" using mode $mode\n";
@@ -1596,7 +1596,7 @@ END
 
 =item $node->teardown_node()
 
-Do an immediate stop of the node
+Do a crappy stop of the node
 
 =cut
 
@@ -1604,7 +1604,7 @@ sub teardown_node
 {
 	my $self = shift;
 
-	$self->stop('immediate');
+	$self->stop('crappy');
 	return;
 }
 
diff --git a/src/test/perl/README b/src/test/perl/README
index 4b160cce36..28ec684808 100644
--- a/src/test/perl/README
+++ b/src/test/perl/README
@@ -72,7 +72,7 @@ against them and evaluate the results. For example:
     my $ret = $node->safe_psql('postgres', 'SELECT 1');
     is($ret, '1', 'SELECT 1 returns 1');
 
-    $node->stop('fast');
+    $node->stop('slow');
 
 Each test script should end with:
 
diff --git a/src/test/recovery/t/010_logical_decoding_timelines.pl b/src/test/recovery/t/010_logical_decoding_timelines.pl
index 01ff31e61f..bd5a37f903 100644
--- a/src/test/recovery/t/010_logical_decoding_timelines.pl
+++ b/src/test/recovery/t/010_logical_decoding_timelines.pl
@@ -136,7 +136,7 @@ $node_primary->safe_psql('postgres', 'CHECKPOINT');
 $node_primary->wait_for_catchup($node_replica, 'write');
 
 # Boom, crash
-$node_primary->stop('immediate');
+$node_primary->stop('crappy');
 
 $node_replica->promote;
 
diff --git a/src/test/recovery/t/011_crash_recovery.pl b/src/test/recovery/t/011_crash_recovery.pl
index 1b57d01046..b91027a0f0 100644
--- a/src/test/recovery/t/011_crash_recovery.pl
+++ b/src/test/recovery/t/011_crash_recovery.pl
@@ -46,7 +46,7 @@ is($node->safe_psql('postgres', qq[SELECT pg_xact_status('$xid');]),
 	'in progress', 'own xid is in-progress');
 
 # Crash and restart the postmaster
-$node->stop('immediate');
+$node->stop('crappy');
 $node->start;
 
 # Make sure we really got a new xid
diff --git a/src/test/recovery/t/014_unlogged_reinit.pl b/src/test/recovery/t/014_unlogged_reinit.pl
index f3199fbd2e..1bbb9923e4 100644
--- a/src/test/recovery/t/014_unlogged_reinit.pl
+++ b/src/test/recovery/t/014_unlogged_reinit.pl
@@ -45,7 +45,7 @@ ok(-f "$pgdata/${ts1UnloggedPath}_init", 'init fork in tablespace exists');
 ok(-f "$pgdata/$ts1UnloggedPath",        'main fork in tablespace exists');
 
 # Crash the postmaster.
-$node->stop('immediate');
+$node->stop('crappy');
 
 # Write fake forks to test that they are removed during recovery.
 append_to_file("$pgdata/${baseUnloggedPath}_vm",  'TEST_VM');
diff --git a/src/test/recovery/t/015_promotion_pages.pl b/src/test/recovery/t/015_promotion_pages.pl
index 8d57b1b3d6..e78a5ecc75 100644
--- a/src/test/recovery/t/015_promotion_pages.pl
+++ b/src/test/recovery/t/015_promotion_pages.pl
@@ -77,7 +77,7 @@ $bravo->safe_psql('postgres',
 # Now crash-stop the promoted standby and restart.  This makes sure that
 # replay does not see invalid page references because of an invalid
 # minimum consistent recovery point.
-$bravo->stop('immediate');
+$bravo->stop('crappy');
 $bravo->start;
 
 # Check state of the table after full crash recovery.  All its data should
diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl
index 5e0655c2a9..df84dc86b5 100644
--- a/src/test/recovery/t/016_min_consistency.pl
+++ b/src/test/recovery/t/016_min_consistency.pl
@@ -110,8 +110,8 @@ $standby->safe_psql('postgres', 'CHECKPOINT;');
 # process does not flush any pages on its side.  The standby is
 # cleanly stopped, which makes the checkpointer update minRecoveryPoint
 # with the restart point created at shutdown.
-$primary->stop('immediate');
-$standby->stop('fast');
+$primary->stop('crappy');
+$standby->stop('slow');
 
 # Check the data consistency of the instance while offline.  This is
 # done by directly scanning the on-disk relation blocks and what
diff --git a/src/test/recovery/t/017_shm.pl b/src/test/recovery/t/017_shm.pl
index 875657b4bb..90270830cf 100644
--- a/src/test/recovery/t/017_shm.pl
+++ b/src/test/recovery/t/017_shm.pl
@@ -204,7 +204,7 @@ sub poll_start
 		usleep(100_000);
 
 		# Clean up in case the start attempt just timed out or some such.
-		$node->stop('fast', fail_ok => 1);
+		$node->stop('slow', fail_ok => 1);
 
 		$attempts++;
 	}
diff --git a/src/test/recovery/t/018_wal_optimize.pl b/src/test/recovery/t/018_wal_optimize.pl
index 4700d49c10..81e8d30959 100644
--- a/src/test/recovery/t/018_wal_optimize.pl
+++ b/src/test/recovery/t/018_wal_optimize.pl
@@ -5,8 +5,8 @@
 #
 # These tests exercise code that once violated the mandate described in
 # src/backend/access/transam/README section "Skipping WAL for New
-# RelFileNode".  The tests work by committing some transactions, initiating an
-# immediate shutdown, and confirming that the expected data survives recovery.
+# RelFileNode".  The tests work by committing some transactions, initiating a
+# crappy shutdown, and confirming that the expected data survives recovery.
 # For many years, individual commands made the decision to skip WAL, hence the
 # frequent appearance of COPY in these tests.
 use strict;
@@ -74,7 +74,7 @@ wal_skip_threshold = 0
 		INSERT INTO originated VALUES (1);
 		CREATE UNIQUE INDEX ON originated(id) TABLESPACE other;
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM moved;");
 	is($result, qq(1), "wal_level = $wal_level, CREATE+SET TABLESPACE");
@@ -93,7 +93,7 @@ wal_skip_threshold = 0
 		CREATE TABLE trunc (id serial PRIMARY KEY);
 		TRUNCATE trunc;
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM trunc;");
 	is($result, qq(0), "wal_level = $wal_level, TRUNCATE with empty table");
@@ -108,7 +108,7 @@ wal_skip_threshold = 0
 		TRUNCATE trunc_ins;
 		INSERT INTO trunc_ins VALUES (DEFAULT);
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres',
 		"SELECT count(*), min(id) FROM trunc_ins;");
@@ -125,7 +125,7 @@ wal_skip_threshold = 0
 		INSERT INTO twophase VALUES (DEFAULT);
 		PREPARE TRANSACTION 't';
 		COMMIT PREPARED 't';");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres',
 		"SELECT count(*), min(id) FROM trunc_ins;");
@@ -139,7 +139,7 @@ wal_skip_threshold = 0
 		CREATE TABLE noskip (id serial PRIMARY KEY);
 		INSERT INTO noskip (SELECT FROM generate_series(1, 20000) a) ;
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM noskip;");
 	is($result, qq(20000), "wal_level = $wal_level, end-of-xact WAL");
@@ -164,7 +164,7 @@ wal_skip_threshold = 0
 		COPY ins_trunc FROM '$copy_file' DELIMITER ',';
 		INSERT INTO ins_trunc (id, id2) VALUES (DEFAULT, 10000);
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM ins_trunc;");
 	is($result, qq(5), "wal_level = $wal_level, TRUNCATE COPY INSERT");
@@ -179,7 +179,7 @@ wal_skip_threshold = 0
 		TRUNCATE trunc_copy;
 		COPY trunc_copy FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result =
 	  $node->safe_psql('postgres', "SELECT count(*) FROM trunc_copy;");
@@ -196,7 +196,7 @@ wal_skip_threshold = 0
 		  ALTER TABLE spc_abort SET TABLESPACE other; ROLLBACK TO s;
 		COPY spc_abort FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM spc_abort;");
 	is($result, qq(3),
@@ -212,7 +212,7 @@ wal_skip_threshold = 0
 		SAVEPOINT s; ALTER TABLE spc_commit SET TABLESPACE other; RELEASE s;
 		COPY spc_commit FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result =
 	  $node->safe_psql('postgres', "SELECT count(*) FROM spc_commit;");
@@ -236,7 +236,7 @@ wal_skip_threshold = 0
 		ROLLBACK TO s;
 		COPY spc_nest FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM spc_nest;");
 	is($result, qq(3),
@@ -252,7 +252,7 @@ wal_skip_threshold = 0
 		SELECT * FROM spc_hint;  -- set hint bit
 		INSERT INTO spc_hint VALUES (2);
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM spc_hint;");
 	is($result, qq(2), "wal_level = $wal_level, SET TABLESPACE, hint bit");
@@ -266,7 +266,7 @@ wal_skip_threshold = 0
 		INSERT INTO idx_hint VALUES (1);  -- set index hint bit
 		INSERT INTO idx_hint VALUES (2);
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->psql('postgres',);
 	my ($ret, $stdout, $stderr) =
@@ -287,7 +287,7 @@ wal_skip_threshold = 0
 		UPDATE upd SET id2 = id2 + 1;
 		DELETE FROM upd;
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM upd;");
 	is($result, qq(0),
@@ -302,7 +302,7 @@ wal_skip_threshold = 0
 		INSERT INTO ins_copy VALUES (DEFAULT, 1);
 		COPY ins_copy FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM ins_copy;");
 	is($result, qq(4), "wal_level = $wal_level, INSERT COPY");
@@ -342,7 +342,7 @@ wal_skip_threshold = 0
 		  FOR EACH ROW EXECUTE PROCEDURE ins_trig_after_row_trig();
 		COPY ins_trig FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result = $node->safe_psql('postgres', "SELECT count(*) FROM ins_trig;");
 	is($result, qq(9), "wal_level = $wal_level, COPY with INSERT triggers");
@@ -375,7 +375,7 @@ wal_skip_threshold = 0
 		TRUNCATE trunc_trig;
 		COPY trunc_trig FROM '$copy_file' DELIMITER ',';
 		COMMIT;");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	$result =
 	  $node->safe_psql('postgres', "SELECT count(*) FROM trunc_trig;");
@@ -386,7 +386,7 @@ wal_skip_threshold = 0
 	$node->safe_psql(
 		'postgres', "
 		CREATE TEMP TABLE temp (id serial PRIMARY KEY, id2 text);");
-	$node->stop('immediate');
+	$node->stop('crappy');
 	$node->start;
 	check_orphan_relfilenodes($node,
 		"wal_level = $wal_level, no orphan relfilenode remains");
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 5654f3b545..fdc1508dd3 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -362,10 +362,10 @@ while (1)
 	# unlikely that the problem would resolve after 15s, so give up at point
 	if ($i++ == 150)
 	{
-		# An immediate shutdown may hide evidence of a locking bug. If
-		# retrying didn't resolve the issue, shut down in fast mode.
-		$node_primary3->stop('fast');
-		$node_standby3->stop('fast');
+		# A crappy shutdown may hide evidence of a locking bug. If
+		# retrying didn't resolve the issue, shut down in slow mode.
+		$node_primary3->stop('slow');
+		$node_standby3->stop('slow');
 		die "could not determine walsender pid, can't continue";
 	}
 
diff --git a/src/test/recovery/t/020_archive_status.pl b/src/test/recovery/t/020_archive_status.pl
index e6e4eb56a9..f79285564d 100644
--- a/src/test/recovery/t/020_archive_status.pl
+++ b/src/test/recovery/t/020_archive_status.pl
@@ -70,7 +70,7 @@ is( $primary->safe_psql(
 
 # Crash the cluster for the next test in charge of checking that non-archived
 # WAL segments are not removed.
-$primary->stop('immediate');
+$primary->stop('crappy');
 
 # Recovery tests for the archiving with a standby partially check
 # the recovery behavior when restoring a backup taken using a
@@ -201,7 +201,7 @@ $standby2->safe_psql('postgres', q{SELECT pg_stat_reset_shared('archiver')});
 # Now crash the cluster to check that recovery step does not
 # remove non-archived WAL segments on a standby where archiving
 # is enabled.
-$standby2->stop('immediate');
+$standby2->stop('crappy');
 $standby2->start;
 
 ok( -f "$standby2_data/$segment_path_1_ready",
diff --git a/src/test/recovery/t/023_pitr_prepared_xact.pl b/src/test/recovery/t/023_pitr_prepared_xact.pl
index 39e8a8fa17..cac370802a 100644
--- a/src/test/recovery/t/023_pitr_prepared_xact.pl
+++ b/src/test/recovery/t/023_pitr_prepared_xact.pl
@@ -85,7 +85,7 @@ CHECKPOINT;
 
 # Enforce recovery, the checkpoint record generated previously should
 # still be found.
-$node_pitr->stop('immediate');
+$node_pitr->stop('crappy');
 $node_pitr->start;
 
 done_testing();
diff --git a/src/test/recovery/t/026_overwrite_contrecord.pl b/src/test/recovery/t/026_overwrite_contrecord.pl
index 78feccd9aa..94379fffbd 100644
--- a/src/test/recovery/t/026_overwrite_contrecord.pl
+++ b/src/test/recovery/t/026_overwrite_contrecord.pl
@@ -63,10 +63,10 @@ my $endfile = $node->safe_psql('postgres',
 	'SELECT pg_walfile_name(pg_current_wal_insert_lsn())');
 ok($initfile ne $endfile, "$initfile differs from $endfile");
 
-# Now stop abruptly, to avoid a stop checkpoint.  We can remove the tail file
+# Now stop crappily, to avoid a stop checkpoint.  We can remove the tail file
 # afterwards, and on startup the large message should be overwritten with new
 # contents
-$node->stop('immediate');
+$node->stop('crappy');
 
 unlink $node->basedir . "/pgdata/pg_wal/$endfile"
   or die "could not unlink " . $node->basedir . "/pgdata/pg_wal/$endfile: $!";
diff --git a/src/test/recovery/t/028_pitr_timelines.pl b/src/test/recovery/t/028_pitr_timelines.pl
index a8b12d9af6..af7c2e54e2 100644
--- a/src/test/recovery/t/028_pitr_timelines.pl
+++ b/src/test/recovery/t/028_pitr_timelines.pl
@@ -78,7 +78,7 @@ is($result, qq{2}, "check table contents after archive recovery");
 
 # Kill the old primary, before it archives the most recent WAL segment that
 # contains all the INSERTs.
-$node_primary->stop('immediate');
+$node_primary->stop('crappy');
 
 # Promote the standby, and switch WAL so that it archives a WAL segment
 # that contains all the INSERTs, on a new timeline.
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index d35a133f15..c547c31ccf 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -445,7 +445,7 @@ $node_publisher->safe_psql('postgres', "DELETE FROM tab_rep");
 
 # Restart the publisher and check the state of the subscriber which
 # should be in a streaming state after catching up.
-$node_publisher->stop('fast');
+$node_publisher->stop('slow');
 $node_publisher->start;
 
 $node_publisher->wait_for_catchup('tap_sub');
@@ -545,8 +545,8 @@ $result = $node_subscriber->safe_psql('postgres',
 	"SELECT count(*) FROM pg_replication_origin");
 is($result, qq(0), 'check replication origin was dropped on subscriber');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 # CREATE PUBLICATION while wal_level=minimal should succeed, with a WARNING
 $node_publisher->append_conf(
diff --git a/src/test/subscription/t/002_types.pl b/src/test/subscription/t/002_types.pl
index 3f1f00f7c8..3cf2abd0bd 100644
--- a/src/test/subscription/t/002_types.pl
+++ b/src/test/subscription/t/002_types.pl
@@ -564,7 +564,7 @@ $result =
 	"SELECT sum(a) FROM tst_dom_constr");
 is($result, '21', 'sql-function constraint on domain');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/003_constraints.pl b/src/test/subscription/t/003_constraints.pl
index 63c22699c0..c021e0d9e0 100644
--- a/src/test/subscription/t/003_constraints.pl
+++ b/src/test/subscription/t/003_constraints.pl
@@ -135,7 +135,7 @@ $result = $node_subscriber->safe_psql('postgres',
 is($result, qq(2|1|2),
 	'check column trigger applied even on update for other column');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/004_sync.pl b/src/test/subscription/t/004_sync.pl
index cf61fc1e0f..139d7e367b 100644
--- a/src/test/subscription/t/004_sync.pl
+++ b/src/test/subscription/t/004_sync.pl
@@ -176,7 +176,7 @@ $result = $node_publisher->safe_psql('postgres',
 is($result, qq(0),
 	'DROP SUBSCRIPTION during error can clean up the slots on the publisher');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/014_binary.pl b/src/test/subscription/t/014_binary.pl
index a1f03e7adc..4e6a4148fa 100644
--- a/src/test/subscription/t/014_binary.pl
+++ b/src/test/subscription/t/014_binary.pl
@@ -133,7 +133,7 @@ is( $result, '{1,2,3}|{42,1.2,1.3}|
 {2,3,1}|{1.2,1.3,1.1}|{two,three,one}
 {3,1,2}|{42,1.1,1.2}|', 'check replicated data on subscriber');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/020_messages.pl b/src/test/subscription/t/020_messages.pl
index d21d929c2d..4d5360c46c 100644
--- a/src/test/subscription/t/020_messages.pl
+++ b/src/test/subscription/t/020_messages.pl
@@ -143,7 +143,7 @@ is( $result, qq(77|0
 77|0),
 	'non-transactional message on slot from aborted transaction is M');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/021_twophase.pl b/src/test/subscription/t/021_twophase.pl
index aacc0fcf46..346b959710 100644
--- a/src/test/subscription/t/021_twophase.pl
+++ b/src/test/subscription/t/021_twophase.pl
@@ -133,8 +133,8 @@ $node_publisher->safe_psql('postgres', "
     INSERT INTO tab_full VALUES (13);
     PREPARE TRANSACTION 'test_prepared_tab';");
 
-$node_subscriber->stop('immediate');
-$node_publisher->stop('immediate');
+$node_subscriber->stop('crappy');
+$node_publisher->stop('crappy');
 
 $node_publisher->start;
 $node_subscriber->start;
@@ -158,8 +158,8 @@ $node_publisher->safe_psql('postgres', "
     INSERT INTO tab_full VALUES (13);
     PREPARE TRANSACTION 'test_prepared_tab';");
 
-$node_subscriber->stop('immediate');
-$node_publisher->stop('immediate');
+$node_subscriber->stop('crappy');
+$node_publisher->stop('crappy');
 
 $node_publisher->start;
 $node_subscriber->start;
@@ -183,7 +183,7 @@ $node_publisher->safe_psql('postgres', "
     INSERT INTO tab_full VALUES (15);
     PREPARE TRANSACTION 'test_prepared_tab';");
 
-$node_subscriber->stop('immediate');
+$node_subscriber->stop('crappy');
 $node_subscriber->start;
 
 # commit post the restart
@@ -205,7 +205,7 @@ $node_publisher->safe_psql('postgres', "
     INSERT INTO tab_full VALUES (17);
     PREPARE TRANSACTION 'test_prepared_tab';");
 
-$node_publisher->stop('immediate');
+$node_publisher->stop('crappy');
 $node_publisher->start;
 
 # commit post the restart
@@ -357,7 +357,7 @@ is($result, qq(0), 'check subscription relation status was dropped on subscriber
 $result = $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM pg_replication_origin");
 is($result, qq(0), 'check replication origin was dropped on subscriber');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/022_twophase_cascade.pl b/src/test/subscription/t/022_twophase_cascade.pl
index 900c25d5ce..f0eff3ad01 100644
--- a/src/test/subscription/t/022_twophase_cascade.pl
+++ b/src/test/subscription/t/022_twophase_cascade.pl
@@ -391,8 +391,8 @@ $result = $node_A->safe_psql('postgres', "SELECT count(*) FROM pg_replication_sl
 is($result, qq(0), 'check replication slot was dropped on publisher node A');
 
 # shutdown
-$node_C->stop('fast');
-$node_B->stop('fast');
-$node_A->stop('fast');
+$node_C->stop('slow');
+$node_B->stop('slow');
+$node_A->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/023_twophase_stream.pl b/src/test/subscription/t/023_twophase_stream.pl
index 93ce3ef132..67568d742f 100644
--- a/src/test/subscription/t/023_twophase_stream.pl
+++ b/src/test/subscription/t/023_twophase_stream.pl
@@ -158,8 +158,8 @@ $node_publisher->safe_psql('postgres', q{
 	DELETE FROM test_tab WHERE mod(a,3) = 0;
 	PREPARE TRANSACTION 'test_prepared_tab';});
 
-$node_subscriber->stop('immediate');
-$node_publisher->stop('immediate');
+$node_subscriber->stop('crappy');
+$node_publisher->stop('crappy');
 
 $node_publisher->start;
 $node_subscriber->start;
@@ -280,7 +280,7 @@ is($result, qq(0), 'check subscription relation status was dropped on subscriber
 $result = $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM pg_replication_origin");
 is($result, qq(0), 'check replication origin was dropped on subscriber');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/024_add_drop_pub.pl b/src/test/subscription/t/024_add_drop_pub.pl
index 561ddde421..ade19de5cf 100644
--- a/src/test/subscription/t/024_add_drop_pub.pl
+++ b/src/test/subscription/t/024_add_drop_pub.pl
@@ -94,7 +94,7 @@ $result = $node_subscriber->safe_psql('postgres',
 is($result, qq(20|1|10), 'check initial data is copied to subscriber');
 
 # shutdown
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/025_rep_changes_for_schema.pl b/src/test/subscription/t/025_rep_changes_for_schema.pl
index 2a6ba5403d..f7918a8425 100644
--- a/src/test/subscription/t/025_rep_changes_for_schema.pl
+++ b/src/test/subscription/t/025_rep_changes_for_schema.pl
@@ -201,7 +201,7 @@ $result = $node_subscriber->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM sch1.tab1");
 is($result, qq(21|1|21), 'check replicated inserts on subscriber');
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/026_stats.pl b/src/test/subscription/t/026_stats.pl
index a42ea3170e..ef62d6f6f7 100644
--- a/src/test/subscription/t/026_stats.pl
+++ b/src/test/subscription/t/026_stats.pl
@@ -96,7 +96,7 @@ WHERE subname = 'tap_sub'
 # Truncate test_tab1 so that apply worker can continue.
 $node_subscriber->safe_psql('postgres', "TRUNCATE test_tab1;");
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/028_row_filter.pl b/src/test/subscription/t/028_row_filter.pl
index 82c4eb6ef6..76e3cdbdc4 100644
--- a/src/test/subscription/t/028_row_filter.pl
+++ b/src/test/subscription/t/028_row_filter.pl
@@ -736,7 +736,7 @@ is( $result, qq(),
 # Testcase end: FOR TABLE with row filter publications
 # ======================================================
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/031_column_list.pl b/src/test/subscription/t/031_column_list.pl
index bdcf3e4a24..9cdf48b0ca 100644
--- a/src/test/subscription/t/031_column_list.pl
+++ b/src/test/subscription/t/031_column_list.pl
@@ -1125,7 +1125,7 @@ is($node_subscriber->safe_psql('postgres',"SELECT * FROM t ORDER BY a, b, c"),
    'publication containing both parent and child relation');
 
 
-$node_subscriber->stop('fast');
-$node_publisher->stop('fast');
+$node_subscriber->stop('slow');
+$node_publisher->stop('slow');
 
 done_testing();
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 11ba473715..f46b9c5df1 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,8 +69,8 @@ $node_publisher->wait_for_catchup('sub1');
 
 pass('index predicates do not cause crash');
 
-$node_publisher->stop('fast');
-$node_subscriber->stop('fast');
+$node_publisher->stop('slow');
+$node_subscriber->stop('slow');
 
 
 # Handling of temporary and unlogged tables with FOR ALL TABLES publications
@@ -102,7 +102,7 @@ is( $node_publisher->psql(
 	'update to unlogged table without replica identity with FOR ALL TABLES publication'
 );
 
-$node_publisher->stop('fast');
+$node_publisher->stop('slow');
 
 # Bug #16643 - https://postgr.es/m/16643-eaadeb2a1a58d28c@postgresql.org
 #
@@ -221,9 +221,9 @@ $node_pub->safe_psql('postgres', "DROP TABLE tab1");
 $node_pub_sub->safe_psql('postgres', "DROP TABLE tab1");
 $node_sub->safe_psql('postgres', "DROP TABLE tab1");
 
-$node_pub->stop('fast');
-$node_pub_sub->stop('fast');
-$node_sub->stop('fast');
+$node_pub->stop('slow');
+$node_pub_sub->stop('slow');
+$node_sub->stop('slow');
 
 # https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939%40OS0PR01MB6113.jpnprd01.prod.outlook.com
 
@@ -304,7 +304,7 @@ is( $node_subscriber->safe_psql(
 	qq(-1|1),
 	"update works with REPLICA IDENTITY");
 
-$node_publisher->stop('fast');
-$node_subscriber->stop('fast');
+$node_publisher->stop('slow');
+$node_subscriber->stop('slow');
 
 done_testing();
-- 
2.24.3 (Apple Git-128)

#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Robert Haas (#1)
Re: PostgreSQL shutdown modes

Isn't this missing support in pg_dumb?

#3Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#1)
Re: PostgreSQL shutdown modes

On Fri, Apr 01, 2022 at 01:22:05PM -0400, Robert Haas wrote:

I attach herewith a modest patch to rename these shutdown modes to
more accurately correspond to their actual characteristics.

Date: Fri, 1 Apr 2022 12:50:05 -0400

I love the idea. Just in time, before the feature freeze deadline.
--
Michael

#4Rushabh Lathia
rushabh.lathia@gmail.com
In reply to: Michael Paquier (#3)
Re: PostgreSQL shutdown modes

+1 for the idea of changing the name, as it's really confusing.

I had a quick look at the patch and noticed the replacements below:

-#define SmartShutdown 1
-#define FastShutdown 2
-#define ImmediateShutdown 3
+#define DumbShutdown 1
+#define SlowShutdown 2
+#define CrappyShutdown 3

About the new naming, I wonder if "Crappy" could be replaced with something
else, but I was not able to come up with any proper suggestions. Or maybe
"Immediate" is appropriate, as here it's describing a "Shutdown"
operation.

On Sat, Apr 2, 2022 at 8:29 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Apr 01, 2022 at 01:22:05PM -0400, Robert Haas wrote:

I attach herewith a modest patch to rename these shutdown modes to
more accurately correspond to their actual characteristics.

Date: Fri, 1 Apr 2022 12:50:05 -0400

I love the idea. Just in time, before the feature freeze deadline.
--
Michael

--
Rushabh Lathia

#5Noname
chap@anastigmatix.net
In reply to: Robert Haas (#1)
Re: PostgreSQL shutdown modes

On 2022-04-01 13:22, Robert Haas wrote:

I attach herewith a modest patch to rename these shutdown modes to
more accurately correspond to their actual characteristics.

I've waited for April 2nd to submit this comment, but it seemed to me
that the suggestion about the first-pass checkpoint in 'slow' mode is a
no-foolin' good one.

Then I wondered whether there could be an option to accompany the 'dumb'
mode that would take a WHERE clause, to be implicitly applied to
pg_stat_activity, whose purpose would be to select those sessions that
are ok to evict without waiting for them to exit. It could recognize,
say, backend connections in no current transaction that are from your
pesky app or connection pooler that holds things open. It could also,
for example, select things in transaction state but where
current_timestamp - state_change > '5 minutes' (so it would be
re-evaluated every so often until ready to shut down).

For conciseness (and sanity), maybe the WHERE clause could be implicitly
applied, not to pg_stat_activity directly, but to a (virtual or actual)
view that has already been restricted to client backend sessions, and
already has a column for current_timestamp - state_change.

Regards,
-Chap

#6Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#3)
Re: PostgreSQL shutdown modes

At Sat, 2 Apr 2022 11:58:55 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Fri, Apr 01, 2022 at 01:22:05PM -0400, Robert Haas wrote:

I attach herewith a modest patch to rename these shutdown modes to
more accurately correspond to their actual characteristics.

Date: Fri, 1 Apr 2022 12:50:05 -0400

I love the idea. Just in time, before the feature freeze deadline.

FWIW, this arrived in my mailbox at "4/2 2:22 JST" :p

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#7Robert Haas
robertmhaas@gmail.com
In reply to: Noname (#5)
Re: PostgreSQL shutdown modes

On Sat, Apr 2, 2022 at 9:39 AM <chap@anastigmatix.net> wrote:

I've waited for April 2nd to submit this comment, but it seemed to me
that the suggestion about the first-pass checkpoint in 'slow' mode is a
no-foolin' good one.

Yeah. While the patch itself is mostly in jest, everything I wrote in
the email is unfortunately pretty much 100% accurate, no fooling. I
think it would be worth doing a number of things:

- Provide some way of backing out of smart shutdown mode.
- Provide some way of making a smart shutdown turn into a fast
shutdown after a configurable period of time.
- Do a preparatory checkpoint before the real shutdown checkpoint
especially in fast mode, but maybe also in smart mode. Maybe there's
some even smarter thing we could be doing here, not sure what exactly.
- Consider renaming "immediate" mode, maybe to "crash" or something.
Oracle uses "abort".
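
As a rough user-space illustration of the second and third items above,
a wrapper could run a checkpoint first, try a smart shutdown with a
timeout, and then fall back to fast mode. This is only a sketch under
stated assumptions: the function name staged_shutdown, the injectable
runner argument, and the 60-second default are invented for
illustration, and the connecting role is assumed to be allowed to run
CHECKPOINT. The pg_ctl options (-D, -m, -w, -t) and psql options (-X,
-d, -c) are the standard ones.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's exit status.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def staged_shutdown(datadir: str, smart_timeout: int = 60,
                    runner: Runner = run) -> str:
    """Stop the server; return the mode that worked: 'smart' or 'fast'."""
    # Preparatory checkpoint while connections are still accepted, so the
    # shutdown checkpoint has less dirty data left to write. Best effort:
    # a failure here is ignored.
    runner(["psql", "-X", "-d", "postgres", "-c", "CHECKPOINT"])

    # Request a smart shutdown, but wait at most smart_timeout seconds.
    # pg_ctl exits with a nonzero status if the server is still running.
    smart = ["pg_ctl", "stop", "-D", datadir, "-m", "smart",
             "-w", "-t", str(smart_timeout)]
    if runner(smart) == 0:
        return "smart"

    # Escalate: a fast shutdown request overrides the pending smart one.
    fast = ["pg_ctl", "stop", "-D", datadir, "-m", "fast", "-w"]
    if runner(fast) != 0:
        raise RuntimeError("fast shutdown did not complete")
    return "fast"
```

Passing a fake runner makes it possible to check the escalation order
without a running server. Note that this does not give a way to back
out of smart mode; it only bounds how long the server sits in it.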

Then I wondered whether there could be an option to accompany the 'dumb'
mode that would take a WHERE clause, to be implicitly applied to
pg_stat_activity, whose purpose would be to select those sessions that
are ok to evict without waiting for them to exit. It could recognize,
say, backend connections in no current transaction that are from your
pesky app or connection pooler that holds things open. It could also,
for example, select things in transaction state but where
current_timestamp - state_change > '5 minutes' (so it would be
re-evaluated every so often until ready to shut down).

Seems like this might be better done in user-space than hard-coded
into the server behavior.
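
A user-space version of that idea might look roughly like the sketch
below, which wraps an admin-supplied WHERE clause around
pg_stat_activity and terminates the matching client backends with
pg_terminate_backend(). The helper names eviction_sql and evict, and
the example predicate, are hypothetical; the predicate is one reading
of "in transaction state". The caller is assumed to be a superuser or
a member of pg_signal_backend, and the WHERE text is pasted in
verbatim, so it must come from a trusted administrator.

```python
import subprocess

# Example predicate in the spirit of the proposal: idle sessions, plus
# sessions idle in a transaction for more than five minutes.
EXAMPLE_WHERE = (
    "state = 'idle' OR (state = 'idle in transaction' AND "
    "current_timestamp - state_change > interval '5 minutes')"
)


def eviction_sql(where: str) -> str:
    """Build a query terminating client backends that match `where`."""
    return (
        "SELECT pg_terminate_backend(pid) FROM pg_stat_activity "
        "WHERE backend_type = 'client backend' "
        "AND pid <> pg_backend_pid() "  # never terminate our own session
        f"AND ({where})"
    )


def evict(where: str, dbname: str = "postgres") -> int:
    """Run the eviction query once through psql; return its exit status."""
    cmd = ["psql", "-X", "-d", dbname, "-c", eviction_sql(where)]
    return subprocess.run(cmd).returncode
```

Calling evict(EXAMPLE_WHERE) repeatedly, for instance between
requesting a smart shutdown and the server actually exiting, would give
the periodic re-evaluation described in the quoted text.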

--
Robert Haas
EDB: http://www.enterprisedb.com