PITR: enhance getRecordTimestamp()
For PITR, getRecordTimestamp() did not include all record types that
contain times.
Add handling for checkpoints, end of recovery and prepared xact record types.
Based on earlier discussions with community members.
Also, allow the option of recovery_target_use_origin_time = off (default) | on.
This allows PITR to consider whether it should use the local server
time of changes, or whether it should use the origin time on each
node. This is useful in multi-node data recovery.
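
As a rough illustration of why the choice matters, the sketch below models a list of replicated commits, each with a local apply time and an origin time, and finds the first commit whose chosen timestamp is later than the target. `SketchCommit` and `sketch_first_after_target` are hypothetical names, and the comparison is a simplified assumption modelled on an inclusive time target, not the actual recovery code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t TimestampTz;    /* stand-in for PostgreSQL's TimestampTz */

/* Hypothetical view of a commit that arrived via logical replication. */
typedef struct SketchCommit
{
    TimestampTz local_time;     /* when the commit was applied locally */
    TimestampTz origin_time;    /* when the commit happened on the origin */
} SketchCommit;

/*
 * Return the index of the first commit whose chosen timestamp is later
 * than the target, or n if no commit is later. Replay would stop before
 * that commit in this simplified model.
 */
size_t
sketch_first_after_target(const SketchCommit *commits, size_t n,
                          TimestampTz target, bool use_origin_time)
{
    assert(commits != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        TimestampTz t = use_origin_time ? commits[i].origin_time
                                        : commits[i].local_time;

        if (t > target)
            return i;
    }
    return n;
}
```

With commits `{100, 90}, {200, 110}, {300, 250}` (local, origin) and a target of 150, the local-time mode stops before the second commit, while the origin-time mode replays it and stops before the third, because its origin time of 110 is still within the target.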
This is part of a series of enhancements to PITR, in no specific order.
Passes make check and recovery testing; includes docs.
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
pitr_enhance_getRecordTimestamp.v1.patch (application/octet-stream)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3eee988359..20260a8527 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3840,7 +3840,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
This parameter specifies the time stamp up to which recovery
will proceed.
The precise stopping point is also influenced by
- <xref linkend="guc-recovery-target-inclusive"/>.
+ <xref linkend="guc-recovery-target-inclusive"/> and
+ <xref linkend="guc-recovery-target-use-origin-time"/>.
</para>
<para>
@@ -3921,6 +3922,28 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <variablelist>
+ <varlistentry id="guc-recovery-target-use-origin-time"
+ xreflabel="recovery_target_use_origin_time">
+ <term><varname>recovery_target_use_origin_time</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>recovery_target_use_origin_time</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies whether to use the timestamp from the local commit
+ record (<literal>off</literal>), or, if one exists, to use
+ the timestamp as recorded by the origin, for commits that
+ arrive by logical replication from another server.
+ This allows a PITR recovery to have a single consistent
+ timestamp across multiple servers, if that is desirable.
+ Applies when <xref linkend="guc-recovery-target-time"/>
+ is specified. Default is <literal>off</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-recovery-target-timeline"
xreflabel="recovery_target_timeline">
<term><varname>recovery_target_timeline</varname> (<type>string</type>)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1b3a3d9bea..d2d9eca47f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -284,6 +284,7 @@ char *recoveryEndCommand = NULL;
char *archiveCleanupCommand = NULL;
RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
bool recoveryTargetInclusive = true;
+bool recoveryTargetUseOriginTime = false;
int recoveryTargetAction = RECOVERY_TARGET_ACTION_PAUSE;
TransactionId recoveryTargetXid;
char *recovery_target_time_string;
@@ -5699,25 +5700,81 @@ static bool
getRecordTimestamp(XLogReaderState *record, TimestampTz *recordXtime)
{
uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
- uint8 xact_info = info & XLOG_XACT_OPMASK;
uint8 rmid = XLogRecGetRmid(record);
- if (rmid == RM_XLOG_ID && info == XLOG_RESTORE_POINT)
+ if (rmid == RM_XLOG_ID)
{
- *recordXtime = ((xl_restore_point *) XLogRecGetData(record))->rp_time;
- return true;
- }
- if (rmid == RM_XACT_ID && (xact_info == XLOG_XACT_COMMIT ||
- xact_info == XLOG_XACT_COMMIT_PREPARED))
- {
- *recordXtime = ((xl_xact_commit *) XLogRecGetData(record))->xact_time;
- return true;
+ if (info == XLOG_RESTORE_POINT)
+ {
+ *recordXtime = ((xl_restore_point *) XLogRecGetData(record))->rp_time;
+ return true;
+ }
+ if (info == XLOG_CHECKPOINT_ONLINE ||
+ info == XLOG_CHECKPOINT_SHUTDOWN)
+ {
+ *recordXtime = time_t_to_timestamptz(((CheckPoint *) XLogRecGetData(record))->time);
+ return true;
+ }
+ if (info == XLOG_END_OF_RECOVERY)
+ {
+ *recordXtime = ((xl_end_of_recovery *) XLogRecGetData(record))->end_time;
+ return true;
+ }
}
- if (rmid == RM_XACT_ID && (xact_info == XLOG_XACT_ABORT ||
- xact_info == XLOG_XACT_ABORT_PREPARED))
+ if (rmid == RM_XACT_ID)
{
- *recordXtime = ((xl_xact_abort *) XLogRecGetData(record))->xact_time;
- return true;
+ uint8 xact_info = info & XLOG_XACT_OPMASK;
+
+ if (xact_info == XLOG_XACT_COMMIT ||
+ xact_info == XLOG_XACT_COMMIT_PREPARED)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
+ xl_xact_parsed_commit parsed;
+
+ ParseCommitRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_commit *) XLogRecGetData(record))->xact_time;
+ return true;
+ }
+ if (xact_info == XLOG_XACT_ABORT ||
+ xact_info == XLOG_XACT_ABORT_PREPARED)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
+ xl_xact_parsed_abort parsed;
+
+ ParseAbortRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_abort *) XLogRecGetData(record))->xact_time;
+ return true;
+ }
+ if (xact_info == XLOG_XACT_PREPARE)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_prepare *xlrec = (xl_xact_prepare *) XLogRecGetData(record);
+ xl_xact_parsed_prepare parsed;
+
+ ParsePrepareRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_prepare *) XLogRecGetData(record))->prepared_at;
+ return true;
+ }
}
return false;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 297e705b80..b30ffccc91 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1897,6 +1897,16 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"recovery_target_use_origin_time", PGC_POSTMASTER, WAL_RECOVERY_TARGET,
+ gettext_noop("Sets whether to use local or origin transaction time with recovery target."),
+ NULL
+ },
+ &recoveryTargetUseOriginTime,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"hot_standby", PGC_POSTMASTER, REPLICATION_STANDBY,
gettext_noop("Allows connections and queries during recovery."),
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 77187c12be..8ab4f1fc67 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -126,6 +126,7 @@ extern char *recoveryRestoreCommand;
extern char *recoveryEndCommand;
extern char *archiveCleanupCommand;
extern bool recoveryTargetInclusive;
+extern bool recoveryTargetUseOriginTime;
extern int recoveryTargetAction;
extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
On 30 Jun 2021, at 11:59, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
For PITR, getRecordTimestamp() did not include all record types that
contain times.
Add handling for checkpoints, end of recovery and prepared xact record types.
+ <variablelist>
This breaks doc compilation, and looks like a stray tag as you want this entry
in the currently open variablelist?
--
Daniel Gustafsson https://vmware.com/
On Wed, 3 Nov 2021 at 13:28, Daniel Gustafsson <daniel@yesql.se> wrote:
On 30 Jun 2021, at 11:59, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
For PITR, getRecordTimestamp() did not include all record types that
contain times.
Add handling for checkpoints, end of recovery and prepared xact record types.
+ <variablelist>
This breaks doc compilation, and looks like a stray tag as you want this entry
in the currently open variablelist?
Thanks. Fixed and rebased.
--
Simon Riggs http://www.EnterpriseDB.com/
Attachments:
pitr_enhance_getRecordTimestamp.v2.patch (application/octet-stream)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index de77f14573..462f27fc04 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3858,7 +3858,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
This parameter specifies the time stamp up to which recovery
will proceed.
The precise stopping point is also influenced by
- <xref linkend="guc-recovery-target-inclusive"/>.
+ <xref linkend="guc-recovery-target-inclusive"/> and
+ <xref linkend="guc-recovery-target-use-origin-time"/>.
</para>
<para>
@@ -3939,6 +3940,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-recovery-target-use-origin-time"
+ xreflabel="recovery_target_use_origin_time">
+ <term><varname>recovery_target_use_origin_time</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>recovery_target_use_origin_time</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies whether to use the timestamp from the local commit
+ record (<literal>off</literal>), or, if one exists, to use
+ the timestamp as recorded by the origin, for commits that
+ arrive by logical replication from another server.
+ This allows a PITR recovery to have a single consistent
+ timestamp across multiple servers, if that is desirable.
+ Applies when <xref linkend="guc-recovery-target-time"/>
+ is specified. Default is <literal>off</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-recovery-target-timeline"
xreflabel="recovery_target_timeline">
<term><varname>recovery_target_timeline</varname> (<type>string</type>)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0a0771a18e..9c0c5e389c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -275,6 +275,7 @@ char *recoveryEndCommand = NULL;
char *archiveCleanupCommand = NULL;
RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
bool recoveryTargetInclusive = true;
+bool recoveryTargetUseOriginTime = false;
int recoveryTargetAction = RECOVERY_TARGET_ACTION_PAUSE;
TransactionId recoveryTargetXid;
char *recovery_target_time_string;
@@ -5840,25 +5841,81 @@ static bool
getRecordTimestamp(XLogReaderState *record, TimestampTz *recordXtime)
{
uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
- uint8 xact_info = info & XLOG_XACT_OPMASK;
uint8 rmid = XLogRecGetRmid(record);
- if (rmid == RM_XLOG_ID && info == XLOG_RESTORE_POINT)
+ if (rmid == RM_XLOG_ID)
{
- *recordXtime = ((xl_restore_point *) XLogRecGetData(record))->rp_time;
- return true;
- }
- if (rmid == RM_XACT_ID && (xact_info == XLOG_XACT_COMMIT ||
- xact_info == XLOG_XACT_COMMIT_PREPARED))
- {
- *recordXtime = ((xl_xact_commit *) XLogRecGetData(record))->xact_time;
- return true;
+ if (info == XLOG_RESTORE_POINT)
+ {
+ *recordXtime = ((xl_restore_point *) XLogRecGetData(record))->rp_time;
+ return true;
+ }
+ if (info == XLOG_CHECKPOINT_ONLINE ||
+ info == XLOG_CHECKPOINT_SHUTDOWN)
+ {
+ *recordXtime = time_t_to_timestamptz(((CheckPoint *) XLogRecGetData(record))->time);
+ return true;
+ }
+ if (info == XLOG_END_OF_RECOVERY)
+ {
+ *recordXtime = ((xl_end_of_recovery *) XLogRecGetData(record))->end_time;
+ return true;
+ }
}
- if (rmid == RM_XACT_ID && (xact_info == XLOG_XACT_ABORT ||
- xact_info == XLOG_XACT_ABORT_PREPARED))
+ if (rmid == RM_XACT_ID)
{
- *recordXtime = ((xl_xact_abort *) XLogRecGetData(record))->xact_time;
- return true;
+ uint8 xact_info = info & XLOG_XACT_OPMASK;
+
+ if (xact_info == XLOG_XACT_COMMIT ||
+ xact_info == XLOG_XACT_COMMIT_PREPARED)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
+ xl_xact_parsed_commit parsed;
+
+ ParseCommitRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_commit *) XLogRecGetData(record))->xact_time;
+ return true;
+ }
+ if (xact_info == XLOG_XACT_ABORT ||
+ xact_info == XLOG_XACT_ABORT_PREPARED)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
+ xl_xact_parsed_abort parsed;
+
+ ParseAbortRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_abort *) XLogRecGetData(record))->xact_time;
+ return true;
+ }
+ if (xact_info == XLOG_XACT_PREPARE)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_prepare *xlrec = (xl_xact_prepare *) XLogRecGetData(record);
+ xl_xact_parsed_prepare parsed;
+
+ ParsePrepareRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_prepare *) XLogRecGetData(record))->prepared_at;
+ return true;
+ }
}
return false;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e91d5a3cfd..9693afc602 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1906,6 +1906,16 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"recovery_target_use_origin_time", PGC_POSTMASTER, WAL_RECOVERY_TARGET,
+ gettext_noop("Sets whether to use local or origin transaction time with recovery target."),
+ NULL
+ },
+ &recoveryTargetUseOriginTime,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"hot_standby", PGC_POSTMASTER, REPLICATION_STANDBY,
gettext_noop("Allows connections and queries during recovery."),
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index c0a560204b..65c6633c0d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -84,6 +84,7 @@ extern char *recoveryRestoreCommand;
extern char *recoveryEndCommand;
extern char *archiveCleanupCommand;
extern bool recoveryTargetInclusive;
+extern bool recoveryTargetUseOriginTime;
extern int recoveryTargetAction;
extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
On Wed, Nov 03, 2021 at 04:59:04PM +0000, Simon Riggs wrote:
Thanks. Fixed and rebased.
+ if (xact_info == XLOG_XACT_PREPARE)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_prepare *xlrec = (xl_xact_prepare *) XLogRecGetData(record);
+ xl_xact_parsed_prepare parsed;
+
+ ParsePrepareRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_prepare *) XLogRecGetData(record))->prepared_at;
As I learnt recently with ece8c76, there are cases where an origin
timestamp may not be set in the WAL record that includes the origin
timestamp depending on the setup done on the origin cluster. Isn't
this code going to finish by returning true when enabling
recovery_target_use_origin_time in some cases, even if recordXtime is
0? So it seems to me that this is lacking some sanity checks if
recordXtime is 0.
Could you add some tests for this proposal? This adds various PITR
scenarios that would be uncovered, and TAP should be able to cover
that.
--
Michael
On Thu, 27 Jan 2022 at 06:58, Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Nov 03, 2021 at 04:59:04PM +0000, Simon Riggs wrote:
Thanks. Fixed and rebased.
+ if (xact_info == XLOG_XACT_PREPARE)
+ {
+ if (recoveryTargetUseOriginTime)
+ {
+ xl_xact_prepare *xlrec = (xl_xact_prepare *) XLogRecGetData(record);
+ xl_xact_parsed_prepare parsed;
+
+ ParsePrepareRecord(XLogRecGetInfo(record),
+ xlrec,
+ &parsed);
+ *recordXtime = parsed.origin_timestamp;
+ }
+ else
+ *recordXtime = ((xl_xact_prepare *) XLogRecGetData(record))->prepared_at;
As I learnt recently with ece8c76, there are cases where an origin
timestamp may not be set in the WAL record that includes the origin
timestamp depending on the setup done on the origin cluster. Isn't
this code going to finish by returning true when enabling
recovery_target_use_origin_time in some cases, even if recordXtime is
0? So it seems to me that this is lacking some sanity checks if
recordXtime is 0.
Could you add some tests for this proposal? This adds various PITR
scenarios that would be uncovered, and TAP should be able to cover
that.
Thanks. Yes, will look at that.
--
Simon Riggs http://www.EnterpriseDB.com/