Allow users to choose what happens when recovery target is not reached

Started by Bharath Rupireddy, about 4 years ago, 7 messages

bharath.rupireddyforpostgres@gmail.com

about 4 years ago

1 attachment(s)

Hi,

Currently, the server shuts down with a FATAL error (added by commit
[1]) when the recovery target isn't reached. This can cause a server
availability problem, especially in case of disaster recovery (geo
restores) where the primary is down and the user is doing a PITR on a
server in another region that has not received a few of the last WAL
files required to reach the recovery target. In this case, users
might prefer an available server to no server at all. With commit
[1], there's no way to achieve that.

There can be many reasons for the last few WAL files not reaching the
target server where the user is performing the PITR: the primary may
have gone down before archiving the last few WAL files to the archive
location, the archive command may have failed for whatever reason,
there may be network latency from the primary to the archive location
or from the archive location to the target server, the restore command
on the target server may have failed, or users may have chosen a wrong
or future recovery target. If the PITR fails with a FATAL error, we
may ask them to restart the server, but imagine the waste of compute
resources: if there is 1 TB of WAL to be replayed and just the last
16 MB WAL file is missing, everything has to be replayed from the
beginning.

Here's a proposal (and a patch) to add a GUC so that users can choose
either to emit a warning and promote, or to shut down with a FATAL
error (the default), when the recovery target isn't reached. Users can
choose to shut down with a FATAL error if strict consistency is
required, or to promote if availability is preferred. There is some
discussion around this idea in [2].
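
To make the proposed semantics concrete, here is a minimal standalone
sketch of the decision the new setting would control. It is only a
model under stated assumptions: the enum and function names are
hypothetical stand-ins rather than the identifiers used in the patch,
and the three boolean inputs mirror the condition that the patch tests
in StartupXLOG().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two values of the proposed GUC. */
typedef enum
{
    END_BEFORE_TARGET_SHUTDOWN,     /* default: stop with a FATAL error */
    END_BEFORE_TARGET_PROMOTE       /* emit a WARNING, then promote */
} EndBeforeTargetAction;

/* What happens at the end of recovery in this simplified model. */
typedef enum
{
    OUTCOME_UNAFFECTED,             /* the new setting plays no role */
    OUTCOME_FATAL_SHUTDOWN,
    OUTCOME_WARN_AND_PROMOTE
} RecoveryOutcome;

/*
 * The setting only matters when archive recovery was requested, a
 * recovery target is configured, and that target was not reached.
 */
RecoveryOutcome
decide_recovery_outcome(bool archive_recovery_requested,
                        bool target_configured,
                        bool target_reached,
                        EndBeforeTargetAction action)
{
    if (!archive_recovery_requested || !target_configured || target_reached)
        return OUTCOME_UNAFFECTED;

    if (action == END_BEFORE_TARGET_PROMOTE)
        return OUTCOME_WARN_AND_PROMOTE;

    return OUTCOME_FATAL_SHUTDOWN;
}
```

In this model, decide_recovery_outcome(true, true, false,
END_BEFORE_TARGET_PROMOTE) yields OUTCOME_WARN_AND_PROMOTE, while the
same call with END_BEFORE_TARGET_SHUTDOWN yields OUTCOME_FATAL_SHUTDOWN.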

Thoughts?

[1]: commit dc788668bb269b10a108e87d14fefd1b9301b793
Author: Peter Eisentraut <peter@eisentraut.org>
Date: Wed Jan 29 15:43:32 2020 +0100

Fail if recovery target is not reached

Before, if a recovery target is configured, but the archive ended
before the target was reached, recovery would end and the server would
promote without further notice. That was deemed to be pretty wrong.
With this change, if the recovery target is not reached, it is a fatal
error.

Based-on-patch-by: Leif Gunnar Erlandsen <leif@lako.no>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion:
/messages/by-id/993736dd3f1713ec1f63fc3b653839f5@lako.no

[2]: /messages/by-id/b334d61396e6b0657a63dc38e16d429703fe9b96.camel@j-davis.com

Regards,
Bharath Rupireddy.

Attachments:

v1-0001-Allow-users-to-choose-what-happens-when-recovery-.patch (application/octet-stream)

From 10a103d543145d110f654fbeaba926b2157f6600 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 12 Nov 2021 09:50:09 +0000
Subject: [PATCH v1] Allow users to choose what happens when recovery target is
 not reached

Currently, the server shuts down with a FATAL error (added by commit
dc78866) when the recovery target isn't reached. This can cause a
server availability problem, especially in case of disaster
recovery (geo restores) where the primary is down and the user is
doing a PITR on a server in another region that has not received a
few of the last WAL files required to reach the recovery target. In
this case, users might prefer an available server to no server at
all. With commit dc78866, there's no way to achieve that.

This patch adds a new GUC so that users can choose either to emit
a warning and promote, or to shut down with a FATAL error (the
default), when the recovery target isn't reached. Users can choose
to shut down with a FATAL error if strict consistency is required,
or to promote if availability is preferred.
---
 doc/src/sgml/config.sgml                      | 27 ++++++++++-
 src/backend/access/transam/xlog.c             | 30 ++++++++++-
 src/backend/utils/misc/guc.c                  | 11 ++++
 src/backend/utils/misc/postgresql.conf.sample |  3 +-
 src/include/access/xlog.h                     |  1 +
 src/include/access/xlog_internal.h            |  9 ++++
 src/test/recovery/t/003_recovery_targets.pl   | 50 ++++++++++++++++++-
 7 files changed, 125 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3f806740d5..8641f759ec 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4017,8 +4017,31 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
        </para>
        <para>
         In any case, if a recovery target is configured but the archive
-        recovery ends before the target is reached, the server will shut down
-        with a fatal error.
+        recovery ends before the target is reached, the server will, depending
+        on <xref linkend="guc-recovery-end-before-target-action"/>, either
+        shut down with a fatal error or emit a warning and promote.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="guc-recovery-end-before-target-action"
+                   xreflabel="recovery_end_before_target_action">
+      <term><varname>recovery_end_before_target_action</varname> (<type>enum</type>)
+      <indexterm>
+        <primary><varname>recovery_end_before_target_action</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies what action the server should take when the recovery target is
+        not reached. The default is <literal>shutdown</literal>, which will stop
+        the server. <literal>promote</literal> means the recovery process will
+        finish and the server will start to accept connections. This parameter
+        has no effect if no recovery target is set.
+       </para>
+       <para>
+        The intended use of the <literal>promote</literal> setting is to let
+        the user have at least an available server.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e073121a7e..d3ebf48715 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -184,6 +184,12 @@ const struct config_enum_entry recovery_target_action_options[] = {
 	{NULL, 0, false}
 };
 
+const struct config_enum_entry recovery_end_before_target_action_options[] = {
+	{"shutdown", RECOVERY_END_BEFORE_TARGET_ACTION_SHUTDOWN, false},
+	{"promote", RECOVERY_END_BEFORE_TARGET_ACTION_PROMOTE, false},
+	{NULL, 0, false}
+};
+
 /*
  * Statistics for current checkpoint are collected in this global struct.
  * Because only the checkpointer or a stand-alone backend can perform
@@ -273,6 +279,7 @@ char	   *archiveCleanupCommand = NULL;
 RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
 bool		recoveryTargetInclusive = true;
 int			recoveryTargetAction = RECOVERY_TARGET_ACTION_PAUSE;
+int			recoveryEndBeforeTargetAction = RECOVERY_END_BEFORE_TARGET_ACTION_SHUTDOWN;
 TransactionId recoveryTargetXid;
 char	   *recovery_target_time_string;
 static TimestampTz recoveryTargetTime;
@@ -7852,8 +7859,27 @@ StartupXLOG(void)
 		if (ArchiveRecoveryRequested &&
 			recoveryTarget != RECOVERY_TARGET_UNSET &&
 			!reachedRecoveryTarget)
-			ereport(FATAL,
-					(errmsg("recovery ended before configured recovery target was reached")));
+		{
+			switch (recoveryEndBeforeTargetAction)
+			{
+				case RECOVERY_END_BEFORE_TARGET_ACTION_SHUTDOWN:
+					ereport(FATAL,
+							(errmsg("recovery ended before configured recovery target was reached")));
+					break;
+				case RECOVERY_END_BEFORE_TARGET_ACTION_PROMOTE:
+					/*
+					 * Do not shutdown the server but issue a warning and
+					 * promote so that the user can have it available at least.
+					 * Note that it is the behaviour chosen by the user and the
+					 * server is not guaranteed to be in a consistent state
+					 * though.
+					 */
+					ereport(WARNING,
+							(errmsg("recovery ended before configured recovery target was reached, but promoting the server as parameter \"%s\" is set to \"%s\"",
+							"recovery_end_before_target_action", "promote")));
+					break;
+			}
+		}
 	}
 
 	/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e91d5a3cfd..f239e9a5a6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -563,6 +563,7 @@ static const struct config_enum_entry wal_compression_options[] = {
 extern const struct config_enum_entry wal_level_options[];
 extern const struct config_enum_entry archive_mode_options[];
 extern const struct config_enum_entry recovery_target_action_options[];
+extern const struct config_enum_entry recovery_end_before_target_action_options[];
 extern const struct config_enum_entry sync_method_options[];
 extern const struct config_enum_entry dynamic_shared_memory_options[];
 
@@ -4833,6 +4834,16 @@ static struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"recovery_end_before_target_action", PGC_POSTMASTER, WAL_RECOVERY_TARGET,
+			gettext_noop("Sets the action to perform when the recovery target is not reached."),
+			NULL
+		},
+		&recoveryEndBeforeTargetAction,
+		RECOVERY_END_BEFORE_TARGET_ACTION_SHUTDOWN, recovery_end_before_target_action_options,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"trace_recovery_messages", PGC_SIGHUP, DEVELOPER_OPTIONS,
 			gettext_noop("Enables logging of recovery-related debugging information."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 1cbc9feeb6..2697180803 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -286,7 +286,8 @@
 				# (change requires restart)
 #recovery_target_action = 'pause'	# 'pause', 'promote', 'shutdown'
 				# (change requires restart)
-
+#recovery_end_before_target_action = 'shutdown'	# 'promote', 'shutdown'
+				# (change requires restart)
 
 #------------------------------------------------------------------------------
 # REPLICATION
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 898df2ee03..75d3500068 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -83,6 +83,7 @@ extern char *recoveryEndCommand;
 extern char *archiveCleanupCommand;
 extern bool recoveryTargetInclusive;
 extern int	recoveryTargetAction;
+extern int	recoveryEndBeforeTargetAction;
 extern int	recovery_min_apply_delay;
 extern char *PrimaryConnInfo;
 extern char *PrimarySlotName;
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index c0da76cab4..bc2caecd53 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -287,6 +287,15 @@ typedef enum
 	RECOVERY_TARGET_ACTION_SHUTDOWN
 }			RecoveryTargetAction;
 
+/*
+ * Recovery end before target action.
+ */
+typedef enum
+{
+	RECOVERY_END_BEFORE_TARGET_ACTION_SHUTDOWN,
+	RECOVERY_END_BEFORE_TARGET_ACTION_PROMOTE
+}			RecoveryEndBeforeTargetAction;
+
 /*
  * Method table for resource managers.
  *
diff --git a/src/test/recovery/t/003_recovery_targets.pl b/src/test/recovery/t/003_recovery_targets.pl
index 0d0636b85c..9f45b38ff5 100644
--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
-use Test::More tests => 9;
+use Test::More tests => 11;
 use Time::HiRes qw(usleep);
 
 # Create and test a standby from given backup, with a certain recovery target.
@@ -182,3 +182,51 @@ $logfile = slurp_file($node_standby->logfile());
 ok( $logfile =~
 	  qr/FATAL: .* recovery ended before configured recovery target was reached/,
 	'recovery end before target reached is a fatal error');
+
+# Check for a pattern in the logs associated to one format.
+sub check_server_logs
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+
+	my $node      = shift;
+	my $pattern   = shift;
+	my $test_name = shift;
+
+	my $max_attempts = 180 * 10;
+
+	my $logcontents;
+	for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+	{
+		$logcontents = slurp_file($node->logfile());
+		last if $logcontents =~ m/$pattern/;
+		usleep(100_000);
+	}
+
+	like($logcontents, qr/$pattern/, "check server log for $test_name");
+	return;
+}
+
+# Check behavior when recovery ends before target is reached but
+# recovery_end_before_target_action is set to 'promote'
+
+$node_standby = PostgreSQL::Test::Cluster->new('standby_9');
+$node_standby->init_from_backup(
+	$node_primary, 'my_backup',
+	has_restoring => 1,
+	standby       => 0);
+$node_standby->append_conf('postgresql.conf',
+	"recovery_target_name = 'does_not_exist'
+	 recovery_end_before_target_action = 'promote'");
+
+$node_standby->start;
+
+my $msg = 'WARNING: .* recovery ended before configured recovery target was reached, but promoting the server as parameter "recovery_end_before_target_action" is set to "promote"';
+my $test_name = 'recovery ended before message with recovery_end_before_target_action set to promote';
+check_server_logs($node_standby, $msg, $test_name);
+
+$msg = 'LOG: .* database system is ready to accept connections';
+$test_name = 'database system is ready to accept connections message with recovery_end_before_target_action set to promote';
+check_server_logs($node_standby, $msg, $test_name);
+
+# Stop standby node
+$node_standby->teardown_node;
-- 
2.25.1

Julien Rouhaud

rjuju123@gmail.com

about 4 years ago

In reply to: Bharath Rupireddy (#1)

Re: Allow users to choose what happens when recovery target is not reached

On Fri, Nov 12, 2021 at 6:14 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

> Currently, the server shuts down with a FATAL error (added by commit
> [1]) when the recovery target isn't reached. This can cause a server
> availability problem, especially in case of disaster recovery (geo
> restores) where the primary is down and the user is doing a PITR on a
> server in another region that has not received a few of the last WAL
> files required to reach the recovery target. In this case, users
> might prefer an available server to no server at all. With commit
> [1], there's no way to achieve that.

If users don't mind whether the recovery target is reached or not,
isn't it better to simply not specify a target and let the recovery
go as far as possible?

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

about 4 years ago

In reply to: Julien Rouhaud (#2)

Re: Allow users to choose what happens when recovery target is not reached

On Fri, Nov 12, 2021 at 4:09 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

> On Fri, Nov 12, 2021 at 6:14 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> > Currently, the server shuts down with a FATAL error (added by commit
> > [1]) when the recovery target isn't reached. This can cause a server
> > availability problem, especially in case of disaster recovery (geo
> > restores) where the primary is down and the user is doing a PITR on a
> > server in another region that has not received a few of the last WAL
> > files required to reach the recovery target. In this case, users
> > might prefer an available server to no server at all. With commit
> > [1], there's no way to achieve that.
>
> If users don't mind whether the recovery target is reached or not,
> isn't it better to simply not specify a target and let the recovery
> go as far as possible?

Users will always be optimistic: they set a recovery target and try
to reach it. But if a few of the WAL files somehow haven't arrived at
the PITR target server (for whatever reason), and the primary isn't
available either, then with the proposal I made they can choose to
have at least an available target server rather than one that has
failed with a FATAL error.

Regards,
Bharath Rupireddy.

Julien Rouhaud

rjuju123@gmail.com

about 4 years ago

In reply to: Bharath Rupireddy (#3)

Re: Allow users to choose what happens when recovery target is not reached

On Sat, Nov 13, 2021 at 11:00 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

> Users will always be optimistic: they set a recovery target and try
> to reach it. But if a few of the WAL files somehow haven't arrived at
> the PITR target server (for whatever reason), and the primary isn't
> available either, then with the proposal I made they can choose to
> have at least an available target server rather than one that has
> failed with a FATAL error.

If your primary server isn't available, why would you want a recovery
target in the first place? I just don't understand in which case
someone would want to set up a recovery target and wouldn't care if
the target wasn't reached, especially if it can be off by GB / days
of data.

It seems like it could have the opposite effect of what you want most
of the time. What if for some reason the restore_command is flawed,
and you end up starting your server because it couldn't restore WAL
files that are actually available? You would have to restart from
scratch and waste more time than if you didn't use this.

It looks like what you actually want is some kind of a target window,
but the window you currently propose is a hardcoded (consistency,
given target], and it seems too dangerous to be useful.

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

about 4 years ago

In reply to: Julien Rouhaud (#4)

Re: Allow users to choose what happens when recovery target is not reached

On Sat, Nov 13, 2021 at 9:45 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

> On Sat, Nov 13, 2021 at 11:00 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> > Users will always be optimistic: they set a recovery target and try
> > to reach it. But if a few of the WAL files somehow haven't arrived at
> > the PITR target server (for whatever reason), and the primary isn't
> > available either, then with the proposal I made they can choose to
> > have at least an available target server rather than one that has
> > failed with a FATAL error.
>
> If your primary server isn't available, why would you want a recovery
> target in the first place? I just don't understand in which case
> someone would want to set up a recovery target and wouldn't care if
> the target wasn't reached, especially if it can be off by GB / days
> of data.
>
> It seems like it could have the opposite effect of what you want most
> of the time. What if for some reason the restore_command is flawed,
> and you end up starting your server because it couldn't restore WAL
> files that are actually available? You would have to restart from
> scratch and waste more time than if you didn't use this.

Firstly, the proposed patch adds no new behaviour as such; it just
restores the ability that exists today in v12 and below (prior to
commit dc78866, which went into v13 and later).

I think performing PITR is the user's wish: whether the primary is
available or not, it is completely the user's choice. The user might
start the PITR when the primary is available, thinking that it will
send all the WAL files required to reach the recovery target. But
imagine a disaster happens and the primary server crashes. Say the
recovery has replayed a huge amount of WAL (maybe a TB), and the
primary failed without sending the last one or few WAL files. Should
the PITR target server fail in this case, after replaying all that
WAL? The user might want the target server to be available instead of
shutting down with a FATAL error. This is the exact problem the
proposed patch is trying to solve.

With the proposed GUC, the user can choose what to do in these
scenarios. The user will be fully aware of what she needs when she
chooses to set the new GUC recovery_end_before_target_action to
'promote' instead of the default 'shutdown'.

> It looks like what you actually want is some kind of a target window,
> but the window you currently propose is a hardcoded (consistency,
> given target], and it seems too dangerous to be useful.

As I said earlier, the behaviour is not too dangerous, as it is not
something new that the patch is proposing; it exists today in v12 and
below. In fact, it gives a way out of a "dangerous situation" if the
user ever gets stuck in one, without wasting recovery cycles and
compute resources, by quickly making the database available (of
course, the responsibility lies with the user to deal with the
missing WAL files).

Regards,
Bharath Rupireddy.

Euler Taveira

euler@eulerto.com

about 4 years ago

In reply to: Bharath Rupireddy (#5)

Re: Allow users to choose what happens when recovery target is not reached

On Sat, Nov 13, 2021, at 10:15 AM, Bharath Rupireddy wrote:

> Firstly, the proposed patch adds no new behaviour as such; it just
> restores the ability that exists today in v12 and below (prior to
> commit dc78866, which went into v13 and later).

It reintroduces an awkward behavior [1].

> I think performing PITR is the user's wish: whether the primary is
> available or not, it is completely the user's choice. The user might
> start the PITR when the primary is available, thinking that it will
> send all the WAL files required to reach the recovery target. But
> imagine a disaster happens and the primary server crashes. Say the
> recovery has replayed a huge amount of WAL (maybe a TB), and the
> primary failed without sending the last one or few WAL files. Should
> the PITR target server fail in this case, after replaying all that
> WAL? The user might want the target server to be available instead of
> shutting down with a FATAL error. This is the exact problem the
> proposed patch is trying to solve.

Are you archiving on the primary server? You are risking your customer's
business by suggesting such a setup. You should store the WAL files on
your backup server.

It seems your setup has a flaw. You set a recovery target but accept a
scenario that is not what you initially asked for. If it is a real PITR,
it is awkward, as Peter [1] said. You could validate your recovery
settings by checking the timestamp of the last WAL file as a rough
approximation of the maximum recovery target time. The other option is
to run pg_waldump to obtain the last commit timestamp.
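
The first check could look roughly like the following sketch. It rests
on several assumptions that are not from this thread: the archive is a
plain directory of uncompressed WAL segment files, a file's
modification time is an acceptable stand-in for "the timestamp of the
last WAL file", and the recovery target time is already available as a
time_t. All function names are hypothetical, and the result is only a
rough plausibility test, not proof that the target is reachable.

```c
#define _POSIX_C_SOURCE 200809L

#include <ctype.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/*
 * WAL segment file names consist of 24 hexadecimal digits. This rejects
 * timeline history files, backup labels, and compressed segments.
 */
bool
is_wal_segment_name(const char *name)
{
    if (strlen(name) != 24)
        return false;

    for (int i = 0; i < 24; i++)
    {
        if (!isxdigit((unsigned char) name[i]))
            return false;
    }
    return true;
}

/*
 * Return the newest modification time among the WAL segment files in
 * archive_dir, or (time_t) -1 if the directory cannot be read or holds
 * no segment files.
 */
time_t
newest_wal_mtime(const char *archive_dir)
{
    DIR        *dir = opendir(archive_dir);
    struct dirent *entry;
    time_t      newest = (time_t) -1;

    if (dir == NULL)
        return (time_t) -1;

    while ((entry = readdir(dir)) != NULL)
    {
        char        path[4096];
        struct stat st;

        if (!is_wal_segment_name(entry->d_name))
            continue;

        snprintf(path, sizeof(path), "%s/%s", archive_dir, entry->d_name);
        if (stat(path, &st) == 0 &&
            (newest == (time_t) -1 || st.st_mtime > newest))
            newest = st.st_mtime;
    }

    closedir(dir);
    return newest;
}

/* A target later than the newest archived segment is unlikely to be reached. */
bool
target_time_plausible(time_t newest_mtime, time_t target_time)
{
    return newest_mtime != (time_t) -1 && target_time <= newest_mtime;
}
```

A caller would pass the result of newest_wal_mtime() on its own archive
directory, together with the intended recovery target time, to
target_time_plausible() before starting recovery.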

If you care about your customer's data, you won't use such an option.
Otherwise, I repeat Julien's question [2]: isn't it better to simply
not specify a target and let the recovery go as far as possible?

> As I said earlier, the behaviour is not too dangerous, as it is not
> something new that the patch is proposing; it exists today in v12 and
> below. In fact, it gives a way out of a "dangerous situation" if the
> user ever gets stuck in one, without wasting recovery cycles and
> compute resources, by quickly making the database available (of
> course, the responsibility lies with the user to deal with the
> missing WAL files).

With your proposal, it seems the user is shooting in the dark. If a
FATAL message is raised, it means the user missed the target. Even
then, the user can accept the situation, remove the target parameters,
and start the server again. I think promote or even pause might lead to
incorrect expectations (if the user doesn't carefully inspect the log
messages).

A disadvantage of this proposal: suppose you have it set to 'promote',
you start the recovery, and the server gets promoted before reaching
the target. While inspecting your server configuration, you realize
that you are pointing to the wrong archive or that the WAL files were
not available in time (due to timing issues). You have no option but to
start from scratch.

[1]: /messages/by-id/234a0c50-1160-86c2-4e4b-35e9684f1799@2ndquadrant.com
[2]: /messages/by-id/CAOBaU_ZDkyoQvEsYT0-p1Hb0m_nGtQJ4tTGm2-Ay6v=TCjmsWg@mail.gmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

almost 4 years ago

In reply to: Bharath Rupireddy (#5)

Re: Allow users to choose what happens when recovery target is not reached

On Sat, Nov 13, 2021 at 6:45 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

> Firstly, the proposed patch adds no new behaviour as such; it just
> restores the ability that exists today in v12 and below (prior to
> commit dc78866, which went into v13 and later).
>
> I think performing PITR is the user's wish: whether the primary is
> available or not, it is completely the user's choice. The user might
> start the PITR when the primary is available, thinking that it will
> send all the WAL files required to reach the recovery target. But
> imagine a disaster happens and the primary server crashes. Say the
> recovery has replayed a huge amount of WAL (maybe a TB), and the
> primary failed without sending the last one or few WAL files. Should
> the PITR target server fail in this case, after replaying all that
> WAL? The user might want the target server to be available instead of
> shutting down with a FATAL error. This is the exact problem the
> proposed patch is trying to solve.
>
> With the proposed GUC, the user can choose what to do in these
> scenarios. The user will be fully aware of what she needs when she
> chooses to set the new GUC recovery_end_before_target_action to
> 'promote' instead of the default 'shutdown'.

Hi Hackers, given a recent bug report [1] in pgsql-bugs, I'm checking
whether the proposal in this thread interests anyone.

[1]: /messages/by-id/CALj2ACVnCsNyJTG_75+5Us2evfsLYz5CEhmCV4qH=VPa0kWOvw@mail.gmail.com

Regards,
Bharath Rupireddy.