Creating backup history files for backups taken from standbys
Hi all,
In the recent thread related to a bug in pg_stop_backup when waiting
for WAL segments to be archived, it has been mentioned that it would
be nice to get backup history files generated as well on standbys:
/messages/by-id/CA+Tgmobh3=quCzugFP5yA3-z5uzz=YKZ7arnoApJoPN7=0H8TQ@mail.gmail.com
Backup history files are not essential to backups, but they are for
debugging purposes, and it has bugged me hard that we should have a
consistent behavior in this area, particularly because with
archive_mode = always they can be archived by a standby.
The previous thread has ended with 52f8a59d, which addressed the
original problem but discarded the backup history file generation
during recovery. Attached is a patch to fix this inconsistency, parked
in next CF. I am not arguing for this change as a bug fix, but as an
improvement for PG11.
Thoughts?
--
Michael
Attachments:
standby-backup-history.patchapplication/octet-stream; name=standby-backup-history.patchDownload
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index df4843f409..0bfe73f6a3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10894,10 +10894,9 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* pg_control. This is harmless for current uses.
*
* XXX currently a backup history file is for informational and debug
- * purposes only. It's not essential for an online backup. Furthermore,
- * even if it's created, it will not be archived during recovery because
- * an archiver is not invoked. So it doesn't seem worthwhile to write a
- * backup history file during recovery.
+ * purposes only. It's not essential for an online backup. It gets created
+ * even during recovery as an archiver can be spawned in this case as
+ * well.
*/
if (backup_started_in_recovery)
{
@@ -10942,49 +10941,48 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* valid as soon as archiver moves out the current segment file.
*/
RequestXLogSwitch(false);
+ }
- XLByteToPrevSeg(stoppoint, _logSegNo);
- XLogFileName(stopxlogfilename, stoptli, _logSegNo);
-
- /* Use the log timezone here, not the session timezone */
- stamp_time = (pg_time_t) time(NULL);
- pg_strftime(strfbuf, sizeof(strfbuf),
- "%Y-%m-%d %H:%M:%S %Z",
- pg_localtime(&stamp_time, log_timezone));
+ /* Use the log timezone here, not the session timezone */
+ stamp_time = (pg_time_t) time(NULL);
+ pg_strftime(strfbuf, sizeof(strfbuf),
+ "%Y-%m-%d %H:%M:%S %Z",
+ pg_localtime(&stamp_time, log_timezone));
- /*
- * Write the backup history file
- */
- XLByteToSeg(startpoint, _logSegNo);
- BackupHistoryFilePath(histfilepath, stoptli, _logSegNo,
- (uint32) (startpoint % XLogSegSize));
- fp = AllocateFile(histfilepath, "w");
- if (!fp)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not create file \"%s\": %m",
- histfilepath)));
- fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n",
- (uint32) (startpoint >> 32), (uint32) startpoint, startxlogfilename);
- fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n",
- (uint32) (stoppoint >> 32), (uint32) stoppoint, stopxlogfilename);
- /* transfer remaining lines from label to history file */
- fprintf(fp, "%s", remaining);
- fprintf(fp, "STOP TIME: %s\n", strfbuf);
- if (fflush(fp) || ferror(fp) || FreeFile(fp))
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- histfilepath)));
+ /*
+ * Write the backup history file
+ */
+ XLByteToPrevSeg(stoppoint, _logSegNo);
+ XLogFileName(stopxlogfilename, stoptli, _logSegNo);
+ XLByteToSeg(startpoint, _logSegNo);
+ BackupHistoryFilePath(histfilepath, stoptli, _logSegNo,
+ (uint32) (startpoint % XLogSegSize));
+ fp = AllocateFile(histfilepath, "w");
+ if (!fp)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ histfilepath)));
+ fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n",
+ (uint32) (startpoint >> 32), (uint32) startpoint, startxlogfilename);
+ fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n",
+ (uint32) (stoppoint >> 32), (uint32) stoppoint, stopxlogfilename);
+ /* transfer remaining lines from label to history file */
+ fprintf(fp, "%s", remaining);
+ fprintf(fp, "STOP TIME: %s\n", strfbuf);
+ if (fflush(fp) || ferror(fp) || FreeFile(fp))
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ histfilepath)));
- /*
- * Clean out any no-longer-needed history files. As a side effect,
- * this will post a .ready file for the newly created history file,
- * notifying the archiver that history file may be archived
- * immediately.
- */
- CleanupBackupHistory();
- }
+ /*
+ * Clean out any no-longer-needed history files. As a side effect,
+ * this will post a .ready file for the newly created history file,
+ * notifying the archiver that history file may be archived
+ * immediately.
+ */
+ CleanupBackupHistory();
/*
* If archiving is enabled, wait for all the required WAL files to be
On Thu, Aug 10, 2017 at 3:56 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Hi all,
In the recent thread related to a bug in pg_stop_backup when waiting
for WAL segments to be archived, it has been mentioned that it would
be nice to get backup history files generated as well on standbys:
/messages/by-id/CA+Tgmobh3=quCzugFP5yA3-z5uzz=YKZ7arnoApJoPN7=0H8TQ@mail.gmail.comBackup history files are not essential to backups, but they are for
debugging purposes, and it has bugged me hard that we should have a
consistent behavior in this area, particularly because with
archive_mode = always they can be archived by a standby.The previous thread has ended with 52f8a59d, which addressed the
original problem but discarded the backup history file generation
during recovery. Attached is a patch to fix this inconsistency, parked
in next CF. I am not arguing for this change as a bug fix, but as an
improvement for PG11.Thoughts?
Thank you for the patch. Regarding to creating the backup history file
on stanbys, is there any difference from the patch posted on earlier
thread?
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Aug 10, 2017 at 9:45 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for the patch. Regarding to creating the backup history file
on stanbys, is there any difference from the patch posted on earlier
thread?
That's a rebased version on top of what has been applied, except that
I have fixed an incorrect comment and shuffled the code building the
WAL segment name for the stop position.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Aug 10, 2017 at 4:50 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Thu, Aug 10, 2017 at 9:45 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for the patch. Regarding to creating the backup history file
on stanbys, is there any difference from the patch posted on earlier
thread?That's a rebased version on top of what has been applied, except that
I have fixed an incorrect comment and shuffled the code building the
WAL segment name for the stop position.
Understood, I'll review this patch.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Aug 10, 2017 at 4:55 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 10, 2017 at 4:50 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:On Thu, Aug 10, 2017 at 9:45 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for the patch. Regarding to creating the backup history file
on stanbys, is there any difference from the patch posted on earlier
thread?That's a rebased version on top of what has been applied, except that
I have fixed an incorrect comment and shuffled the code building the
WAL segment name for the stop position.Understood, I'll review this patch.
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.
--
Michael
Attachments:
standby-backup-history-v2.patchapplication/octet-stream; name=standby-backup-history-v2.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 641b3b8f4e..ba7b068501 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18635,16 +18635,15 @@ postgres=# select pg_start_backup('label_goes_here');
</para>
<para>
- When executed on a primary, the function also creates a backup history file
- in the write-ahead log
+ The function also creates a backup history file in the write-ahead log
archive area. The history file includes the label given to
- <function>pg_start_backup</>, the starting and ending write-ahead log locations for
- the backup, and the starting and ending times of the backup. The return
- value is the backup's ending write-ahead log location (which again
- can be ignored). After recording the ending location, the current
- write-ahead log insertion
- point is automatically advanced to the next write-ahead log file, so that the
- ending write-ahead log file can be archived immediately to complete the backup.
+ <function>pg_start_backup</>, the starting and ending write-ahead log
+ locations for the backup, and the starting and ending times of the backup.
+ The return value is the backup's ending write-ahead log location (which
+ again can be ignored). When executed on a primary, the current write-ahead
+ log insertion point is automatically advanced to the next write-ahead log
+ file after recording the ending location, so that the ending write-ahead
+ log file can be archived immediately to complete the backup.
</para>
<para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index df4843f409..0bfe73f6a3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10894,10 +10894,9 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* pg_control. This is harmless for current uses.
*
* XXX currently a backup history file is for informational and debug
- * purposes only. It's not essential for an online backup. Furthermore,
- * even if it's created, it will not be archived during recovery because
- * an archiver is not invoked. So it doesn't seem worthwhile to write a
- * backup history file during recovery.
+ * purposes only. It's not essential for an online backup. It gets created
+ * even during recovery as an archiver can be spawned in this case as
+ * well.
*/
if (backup_started_in_recovery)
{
@@ -10942,49 +10941,48 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* valid as soon as archiver moves out the current segment file.
*/
RequestXLogSwitch(false);
+ }
- XLByteToPrevSeg(stoppoint, _logSegNo);
- XLogFileName(stopxlogfilename, stoptli, _logSegNo);
-
- /* Use the log timezone here, not the session timezone */
- stamp_time = (pg_time_t) time(NULL);
- pg_strftime(strfbuf, sizeof(strfbuf),
- "%Y-%m-%d %H:%M:%S %Z",
- pg_localtime(&stamp_time, log_timezone));
+ /* Use the log timezone here, not the session timezone */
+ stamp_time = (pg_time_t) time(NULL);
+ pg_strftime(strfbuf, sizeof(strfbuf),
+ "%Y-%m-%d %H:%M:%S %Z",
+ pg_localtime(&stamp_time, log_timezone));
- /*
- * Write the backup history file
- */
- XLByteToSeg(startpoint, _logSegNo);
- BackupHistoryFilePath(histfilepath, stoptli, _logSegNo,
- (uint32) (startpoint % XLogSegSize));
- fp = AllocateFile(histfilepath, "w");
- if (!fp)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not create file \"%s\": %m",
- histfilepath)));
- fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n",
- (uint32) (startpoint >> 32), (uint32) startpoint, startxlogfilename);
- fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n",
- (uint32) (stoppoint >> 32), (uint32) stoppoint, stopxlogfilename);
- /* transfer remaining lines from label to history file */
- fprintf(fp, "%s", remaining);
- fprintf(fp, "STOP TIME: %s\n", strfbuf);
- if (fflush(fp) || ferror(fp) || FreeFile(fp))
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not write file \"%s\": %m",
- histfilepath)));
+ /*
+ * Write the backup history file
+ */
+ XLByteToPrevSeg(stoppoint, _logSegNo);
+ XLogFileName(stopxlogfilename, stoptli, _logSegNo);
+ XLByteToSeg(startpoint, _logSegNo);
+ BackupHistoryFilePath(histfilepath, stoptli, _logSegNo,
+ (uint32) (startpoint % XLogSegSize));
+ fp = AllocateFile(histfilepath, "w");
+ if (!fp)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ histfilepath)));
+ fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n",
+ (uint32) (startpoint >> 32), (uint32) startpoint, startxlogfilename);
+ fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n",
+ (uint32) (stoppoint >> 32), (uint32) stoppoint, stopxlogfilename);
+ /* transfer remaining lines from label to history file */
+ fprintf(fp, "%s", remaining);
+ fprintf(fp, "STOP TIME: %s\n", strfbuf);
+ if (fflush(fp) || ferror(fp) || FreeFile(fp))
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ histfilepath)));
- /*
- * Clean out any no-longer-needed history files. As a side effect,
- * this will post a .ready file for the newly created history file,
- * notifying the archiver that history file may be archived
- * immediately.
- */
- CleanupBackupHistory();
- }
+ /*
+ * Clean out any no-longer-needed history files. As a side effect,
+ * this will post a .ready file for the newly created history file,
+ * notifying the archiver that history file may be archived
+ * immediately.
+ */
+ CleanupBackupHistory();
/*
* If archiving is enabled, wait for all the required WAL files to be
Hi Michael,
On 8/31/17 11:56 PM, Michael Paquier wrote:
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.
The patch looks good overall. It applies cleanly and passes all tests.
One question. Do we want to create this file all the time (as written),
or only when archive_mode = always?
It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it. On the other hand the code is a bit simpler this way. Thoughts?
Regards,
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Sep 19, 2017 at 8:14 AM, David Steele <david@pgmasters.net> wrote:
On 8/31/17 11:56 PM, Michael Paquier wrote:
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.The patch looks good overall. It applies cleanly and passes all tests.
One question. Do we want to create this file all the time (as written),
or only when archive_mode = always?It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it. On the other hand the code is a bit simpler this way. Thoughts?
With archive_mode = off, the bytes of the backup history file are
still written to disk, so my take on the matter is to keep the code
simple.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/18/17 7:26 PM, Michael Paquier wrote:
On Tue, Sep 19, 2017 at 8:14 AM, David Steele <david@pgmasters.net> wrote:
On 8/31/17 11:56 PM, Michael Paquier wrote:
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.The patch looks good overall. It applies cleanly and passes all tests.
One question. Do we want to create this file all the time (as written),
or only when archive_mode = always?It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it. On the other hand the code is a bit simpler this way. Thoughts?With archive_mode = off, the bytes of the backup history file are
still written to disk, so my take on the matter is to keep the code
simple.
I'm OK with that.
I'll give Masahiko some time to respond before marking it RFC.
Regards,
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Sep 19, 2017 at 8:33 AM, David Steele <david@pgmasters.net> wrote:
On 9/18/17 7:26 PM, Michael Paquier wrote:
On Tue, Sep 19, 2017 at 8:14 AM, David Steele <david@pgmasters.net> wrote:
On 8/31/17 11:56 PM, Michael Paquier wrote:
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.The patch looks good overall. It applies cleanly and passes all tests.
One question. Do we want to create this file all the time (as written),
or only when archive_mode = always?It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it. On the other hand the code is a bit simpler this way. Thoughts?With archive_mode = off, the bytes of the backup history file are
still written to disk, so my take on the matter is to keep the code
simple.I'm OK with that.
I'll give Masahiko some time to respond before marking it RFC.
Thanks, I'll review it and send a comment by this Wednesday.
--
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Sep 19, 2017 at 2:48 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Sep 19, 2017 at 8:33 AM, David Steele <david@pgmasters.net> wrote:
On 9/18/17 7:26 PM, Michael Paquier wrote:
On Tue, Sep 19, 2017 at 8:14 AM, David Steele <david@pgmasters.net> wrote:
On 8/31/17 11:56 PM, Michael Paquier wrote:
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.The patch looks good overall. It applies cleanly and passes all tests.
One question. Do we want to create this file all the time (as written),
or only when archive_mode = always?It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it. On the other hand the code is a bit simpler this way. Thoughts?With archive_mode = off, the bytes of the backup history file are
still written to disk, so my take on the matter is to keep the code
simple.I'm OK with that.
I'll give Masahiko some time to respond before marking it RFC.
Thanks, I'll review it and send a comment by this Wednesday.
I've reviewed and tested the latest patch but fond no problems, so
I've marked this patch to Ready for Committer.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 19 September 2017 at 00:33, David Steele <david@pgmasters.net> wrote:
On 9/18/17 7:26 PM, Michael Paquier wrote:
On Tue, Sep 19, 2017 at 8:14 AM, David Steele <david@pgmasters.net> wrote:
On 8/31/17 11:56 PM, Michael Paquier wrote:
Here is an updated patch with refreshed documentation, as a result of
449338c which was discussed in thread
/messages/by-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8966@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.The patch looks good overall. It applies cleanly and passes all tests.
One question. Do we want to create this file all the time (as written),
or only when archive_mode = always?It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it. On the other hand the code is a bit simpler this way. Thoughts?With archive_mode = off, the bytes of the backup history file are
still written to disk, so my take on the matter is to keep the code
simple.I'm OK with that.
I'm not.
If we want to do this why not only do it in the modes that have meaning?
i.e. put an if() test in for archive_mode == always
Which also makes it a smaller and clearer patch
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jan 5, 2018 at 11:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
I'm not.
If we want to do this why not only do it in the modes that have meaning?
i.e. put an if() test in for archive_mode == always.
OK, I can see value in your point as well. The check is a bit more
complicated that just looking for archive_mode == always though, you
need to look at if the backup has been started in recovery or not, and
then adapt using XLogArchivingActive() and XLogArchivingAlways(),
similarly to what is done when waiting for the archives.
Which also makes it a smaller and clearer patch
Yes, this generates less diffs, reducing the likelihood of bugs. What
do you think about the v3 attached?
--
Michael
Attachments:
standby-backup-history-v3.patchapplication/octet-stream; name=standby-backup-history-v3.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4dd9d029e6..4ef68c308e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18649,9 +18649,9 @@ postgres=# select pg_start_backup('label_goes_here');
</para>
<para>
- When executed on a primary, the function also creates a backup history file
- in the write-ahead log
- archive area. The history file includes the label given to
+ The function also creates a backup history file in the write-ahead log
+ archive area when archiving is enabled.
+ The history file includes the label given to
<function>pg_start_backup</function>, the starting and ending write-ahead log locations for
the backup, and the starting and ending times of the backup. The return
value is the backup's ending write-ahead log location (which again
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 02974f0e52..7e0c5da0e2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10729,11 +10729,7 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
XLogRecPtr startpoint;
XLogRecPtr stoppoint;
TimeLineID stoptli;
- pg_time_t stamp_time;
- char strfbuf[128];
- char histfilepath[MAXPGPATH];
char startxlogfilename[MAXFNAMELEN];
- char stopxlogfilename[MAXFNAMELEN];
char lastxlogfilename[MAXFNAMELEN];
char histfilename[MAXFNAMELEN];
char backupfrom[20];
@@ -10939,12 +10935,6 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* location. Note that it can be greater than the exact backup end
* location if the minimum recovery point is updated after the backup of
* pg_control. This is harmless for current uses.
- *
- * XXX currently a backup history file is for informational and debug
- * purposes only. It's not essential for an online backup. Furthermore,
- * even if it's created, it will not be archived during recovery because
- * an archiver is not invoked. So it doesn't seem worthwhile to write a
- * backup history file during recovery.
*/
if (backup_started_in_recovery)
{
@@ -10989,9 +10979,21 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* valid as soon as archiver moves out the current segment file.
*/
RequestXLogSwitch(false);
+ }
- XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size);
- XLogFileName(stopxlogfilename, stoptli, _logSegNo, wal_segment_size);
+ /*
+ * Write the backup history file if archiving is enabled.
+ *
+ * XXX currently a backup history file is for informational and debug
+ * purposes only. It's not essential for an online backup.
+ */
+ if ((!backup_started_in_recovery && XLogArchivingActive()) ||
+ (backup_started_in_recovery && XLogArchivingAlways()))
+ {
+ char strfbuf[128];
+ char histfilepath[MAXPGPATH];
+ char stopxlogfilename[MAXFNAMELEN];
+ pg_time_t stamp_time;
/* Use the log timezone here, not the session timezone */
stamp_time = (pg_time_t) time(NULL);
@@ -10999,9 +11001,8 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
"%Y-%m-%d %H:%M:%S %Z",
pg_localtime(&stamp_time, log_timezone));
- /*
- * Write the backup history file
- */
+ XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size);
+ XLogFileName(stopxlogfilename, stoptli, _logSegNo, wal_segment_size);
XLByteToSeg(startpoint, _logSegNo, wal_segment_size);
BackupHistoryFilePath(histfilepath, stoptli, _logSegNo,
startpoint, wal_segment_size);
On 1/6/18 3:48 AM, Michael Paquier wrote:
On Fri, Jan 5, 2018 at 11:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Which also makes it a smaller and clearer patch
Yes, this generates less diffs, reducing the likelihood of bugs. What
do you think about the v3 attached?
I agree that this is a cleaner solution.
--
-David
david@pgmasters.net
On Sun, Jan 7, 2018 at 1:35 AM, David Steele <david@pgmasters.net> wrote:
On 1/6/18 3:48 AM, Michael Paquier wrote:
On Fri, Jan 5, 2018 at 11:27 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:Which also makes it a smaller and clearer patch
Yes, this generates less diffs, reducing the likelihood of bugs. What
do you think about the v3 attached?I agree that this is a cleaner solution.
+1. And the changes looks good to me.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, Jan 11, 2018 at 07:10:50PM +0900, Masahiko Sawada wrote:
On Sun, Jan 7, 2018 at 1:35 AM, David Steele <david@pgmasters.net> wrote:
On 1/6/18 3:48 AM, Michael Paquier wrote:
On Fri, Jan 5, 2018 at 11:27 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:Which also makes it a smaller and clearer patch
Yes, this generates less diffs, reducing the likelihood of bugs. What
do you think about the v3 attached?I agree that this is a cleaner solution.
+1. And the changes looks good to me.
Cool. Thanks for the feedback.
--
Michael
On Thu, Jan 11, 2018 at 09:47:42PM +0900, Michael Paquier wrote:
Cool. Thanks for the feedback.
The last patch still applies, but no committer has been interested, so I
am moving my patch to the next CF.
--
Michael
On Thu, Jan 11, 2018 at 9:47 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Thu, Jan 11, 2018 at 07:10:50PM +0900, Masahiko Sawada wrote:
On Sun, Jan 7, 2018 at 1:35 AM, David Steele <david@pgmasters.net> wrote:
On 1/6/18 3:48 AM, Michael Paquier wrote:
On Fri, Jan 5, 2018 at 11:27 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:Which also makes it a smaller and clearer patch
Yes, this generates less diffs, reducing the likelihood of bugs. What
do you think about the v3 attached?I agree that this is a cleaner solution.
+1. And the changes looks good to me.
The patch basically looks good to me. Here are some small comments.
<para>
The backup history file is not created in the database cluster backed up.
</para>
The above should be deleted in pg_basebackup.sgml.
* During recovery, since we don't use the end-of-backup WAL
* record and don't write the backup history file, the
This comment needs to be updated in xlog.c.
Regards,
--
Fujii Masao
On Fri, Feb 02, 2018 at 12:47:26AM +0900, Fujii Masao wrote:
The patch basically looks good to me. Here are some small comments.
<para>
The backup history file is not created in the database cluster backed up.
</para>The above should be deleted in pg_basebackup.sgml.
* During recovery, since we don't use the end-of-backup WAL
* record and don't write the backup history file, theThis comment needs to be updated in xlog.c.
Thanks Fujii-san for the review. Indeed those portions need a refresh..
--
Michael
Attachments:
standby-backup-history-v4.patchtext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 487c7ff750..5dcee53dd2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18657,9 +18657,9 @@ postgres=# select pg_start_backup('label_goes_here');
</para>
<para>
- When executed on a primary, the function also creates a backup history file
- in the write-ahead log
- archive area. The history file includes the label given to
+ The function also creates a backup history file in the write-ahead log
+ archive area when archiving is enabled.
+ The history file includes the label given to
<function>pg_start_backup</function>, the starting and ending write-ahead log locations for
the backup, and the starting and ending times of the backup. The return
value is the backup's ending write-ahead log location (which again
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index e8921b1bb4..1c37cecf72 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -78,11 +78,6 @@ PostgreSQL documentation
Note that there are some limitations in an online backup from the standby:
<itemizedlist>
- <listitem>
- <para>
- The backup history file is not created in the database cluster backed up.
- </para>
- </listitem>
<listitem>
<para>
If you are using <literal>-X none</literal>, there is no guarantee that all
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e42b828edf..f2ec7441f3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10389,10 +10389,10 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
/*
* During recovery, since we don't use the end-of-backup WAL
- * record and don't write the backup history file, the
- * starting WAL location doesn't need to be unique. This means
- * that two base backups started at the same time might use
- * the same checkpoint as starting locations.
+ * record, the starting WAL location doesn't need to be unique.
+ * This means that two base backups started at the same time
+ * might use the same checkpoint as starting locations, and
+ * write a backup history file with the same name.
*/
gotUniqueStartpoint = true;
}
@@ -10730,11 +10730,7 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
XLogRecPtr startpoint;
XLogRecPtr stoppoint;
TimeLineID stoptli;
- pg_time_t stamp_time;
- char strfbuf[128];
- char histfilepath[MAXPGPATH];
char startxlogfilename[MAXFNAMELEN];
- char stopxlogfilename[MAXFNAMELEN];
char lastxlogfilename[MAXFNAMELEN];
char histfilename[MAXFNAMELEN];
char backupfrom[20];
@@ -10940,12 +10936,6 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* location. Note that it can be greater than the exact backup end
* location if the minimum recovery point is updated after the backup of
* pg_control. This is harmless for current uses.
- *
- * XXX currently a backup history file is for informational and debug
- * purposes only. It's not essential for an online backup. Furthermore,
- * even if it's created, it will not be archived during recovery because
- * an archiver is not invoked. So it doesn't seem worthwhile to write a
- * backup history file during recovery.
*/
if (backup_started_in_recovery)
{
@@ -10990,9 +10980,21 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
* valid as soon as archiver moves out the current segment file.
*/
RequestXLogSwitch(false);
+ }
- XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size);
- XLogFileName(stopxlogfilename, stoptli, _logSegNo, wal_segment_size);
+ /*
+ * Write the backup history file if archiving is enabled.
+ *
+ * XXX currently a backup history file is for informational and debug
+ * purposes only. It's not essential for an online backup.
+ */
+ if ((!backup_started_in_recovery && XLogArchivingActive()) ||
+ (backup_started_in_recovery && XLogArchivingAlways()))
+ {
+ char strfbuf[128];
+ char histfilepath[MAXPGPATH];
+ char stopxlogfilename[MAXFNAMELEN];
+ pg_time_t stamp_time;
/* Use the log timezone here, not the session timezone */
stamp_time = (pg_time_t) time(NULL);
@@ -11000,9 +11002,8 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
"%Y-%m-%d %H:%M:%S %Z",
pg_localtime(&stamp_time, log_timezone));
- /*
- * Write the backup history file
- */
+ XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size);
+ XLogFileName(stopxlogfilename, stoptli, _logSegNo, wal_segment_size);
XLByteToSeg(startpoint, _logSegNo, wal_segment_size);
BackupHistoryFilePath(histfilepath, stoptli, _logSegNo,
startpoint, wal_segment_size);
On Fri, Feb 2, 2018 at 2:06 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Fri, Feb 02, 2018 at 12:47:26AM +0900, Fujii Masao wrote:
The patch basically looks good to me. Here are some small comments.
<para>
The backup history file is not created in the database cluster backed up.
</para>The above should be deleted in pg_basebackup.sgml.
* During recovery, since we don't use the end-of-backup WAL
* record and don't write the backup history file, theThis comment needs to be updated in xlog.c.
Thanks Fujii-san for the review. Indeed those portions need a refresh..
Thanks for updating the patch!
+ * write a backup history file with the same name.
So more than one backup history files with the same name
but the diffferent content can be created and archived.
Isn't this problematic because the backup history file that
users want to use later might be overwritten unexpectedly?
Regards,
--
Fujii Masao
Hi!
On 2018-03-02 02:29:13 +0900, Fujii Masao wrote:
On Fri, Feb 2, 2018 at 2:06 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:On Fri, Feb 02, 2018 at 12:47:26AM +0900, Fujii Masao wrote:
The patch basically looks good to me. Here are some small comments.
<para>
The backup history file is not created in the database cluster backed up.
</para>The above should be deleted in pg_basebackup.sgml.
* During recovery, since we don't use the end-of-backup WAL
* record and don't write the backup history file, theThis comment needs to be updated in xlog.c.
Thanks Fujii-san for the review. Indeed those portions need a refresh..
Thanks for updating the patch!
You'd claimed this patch as a committer. You're still planning to commit
it?
+ * write a backup history file with the same name.
So more than one backup history files with the same name
but the diffferent content can be created and archived.
Isn't this problematic because the backup history file that
users want to use later might be overwritten unexpectedly?
Michael?
Greetings,
Andres Freund
On Fri, Mar 02, 2018 at 02:29:13AM +0900, Fujii Masao wrote:
+ * write a backup history file with the same name.
So more than one backup history files with the same name
but the diffferent content can be created and archived.
Isn't this problematic because the backup history file that
users want to use later might be overwritten unexpectedly?
Yeah, that's the intention behind the patch. Would that actually happen
in practice though? We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.
--
Michael
On Thu, Mar 01, 2018 at 07:26:09PM -0800, Andres Freund wrote:
On 2018-03-02 02:29:13 +0900, Fujii Masao wrote:
+ * write a backup history file with the same name.
So more than one backup history files with the same name
but the diffferent content can be created and archived.
Isn't this problematic because the backup history file that
users want to use later might be overwritten unexpectedly?Michael?
Just answered to that question.
--
Michael
On 3/1/18 11:07 PM, Michael Paquier wrote:
On Fri, Mar 02, 2018 at 02:29:13AM +0900, Fujii Masao wrote:
+ * write a backup history file with the same name.
So more than one backup history files with the same name
but the diffferent content can be created and archived.
Isn't this problematic because the backup history file that
users want to use later might be overwritten unexpectedly?Yeah, that's the intention behind the patch. Would that actually happen
in practice though? We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.
+1. Given that the history files are not used during restore and are
present primarily for debugging purposes, I can't see worrying too much
about this unlikely (if possible) race condition.
--
-David
david@pgmasters.net
On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Mar 02, 2018 at 02:29:13AM +0900, Fujii Masao wrote:
+ * write a backup history file with the same name.
So more than one backup history files with the same name
but the diffferent content can be created and archived.
Isn't this problematic because the backup history file that
users want to use later might be overwritten unexpectedly?Yeah, that's the intention behind the patch. Would that actually happen
in practice though?
Yes, I think. During recovery, all the pg_basebackup would use the same
backup starting point and create the backup history file with the same name
until the next restartpoint runs. So unless the restartpoint interval is short,
ISTM that such a problematic case can happen even in practice. No?
We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.
What about including the backup history file in the base backup instead of
creating it in the standby's pg_wal and archiving it? As the good side effect
of this approach, we can use the backup history file for debugging purpose
even when WAL archiving is not enabled.
Regards,
--
Fujii Masao
On Fri, Mar 2, 2018 at 12:26 PM, Andres Freund <andres@anarazel.de> wrote:
Hi!
On 2018-03-02 02:29:13 +0900, Fujii Masao wrote:
On Fri, Feb 2, 2018 at 2:06 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:On Fri, Feb 02, 2018 at 12:47:26AM +0900, Fujii Masao wrote:
The patch basically looks good to me. Here are some small comments.
<para>
The backup history file is not created in the database cluster backed up.
</para>The above should be deleted in pg_basebackup.sgml.
* During recovery, since we don't use the end-of-backup WAL
* record and don't write the backup history file, theThis comment needs to be updated in xlog.c.
Thanks Fujii-san for the review. Indeed those portions need a refresh..
Thanks for updating the patch!
You'd claimed this patch as a committer. You're still planning to commit
it?
Yes. Thanks for the ping!
Regards,
--
Fujii Masao
Hi,
On 3/2/18 1:03 PM, Fujii Masao wrote:
On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier <michael@paquier.xyz> wrote:
We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.What about including the backup history file in the base backup instead of
creating it in the standby's pg_wal and archiving it? As the good side effect
of this approach, we can use the backup history file for debugging purpose
even when WAL archiving is not enabled.
I don't think this is a good idea, even if it does solve the race
condition. First, it would be different from the way primary-style
backups generate this file. Second, these files would not be excluded
from future restore / backup cycles so they could build up after a while
and cause confusion. I guess we could teach PG to delete them or
pg_basebackup to ignore them, but that doesn't seem worth the effort.
Regards,
--
-David
david@pgmasters.net
On Fri, Mar 02, 2018 at 03:41:57PM -0500, David Steele wrote:
On 3/2/18 1:03 PM, Fujii Masao wrote:
On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier <michael@paquier.xyz> wrote:
We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.
Sorry for my late reply, I was thinking more about the whole thing.
What about including the backup history file in the base backup instead of
creating it in the standby's pg_wal and archiving it? As the good side effect
of this approach, we can use the backup history file for debugging purpose
even when WAL archiving is not enabled.
That's not going to be much helpful for tar'ed base backups as the
backup history file would be at the end of the stream :( For a debug
file, that's not really helpful.
I don't think this is a good idea, even if it does solve the race
condition. First, it would be different from the way primary-style
backups generate this file. Second, these files would not be excluded
from future restore / backup cycles so they could build up after a while
and cause confusion. I guess we could teach PG to delete them or
pg_basebackup to ignore them, but that doesn't seem worth the effort.
Something I did not consider though is that this causes archive commands
to choke on those identical file names, so any archive command which is
not able to handle that would cause WAL to bloat on the standby. For
this reason I would rather drop the current approach taken. Remember
that the docs tell to use a plain "cp" for example, so this would cause
standbys to go silently down.
Something that I would find way more portable though is the addition of
the backup history file in the output of pg_stop_backup(), which would
map as well with what Fujii-san's idea to add this file to the backup
data itself, and do the same for backups taken on a primary. However
this would mean moving the 'c' marker marking the end of the tar stream
within do_pg_stop_backup(), and I am in no way a fan of that: the
handling of the tar stream should be out of reach of the start and stop
phases.
Another idea that I have here as well would be to change a bit the
history backup file name so as the backup name is added to it, but that
does not fly high as pg_basebackup uses its own name with spaces
(Yeah!), or even better the PID of the backend taking the backup.
The renaming is more appealing, still not perfect. So my mood of the
day would be to just drop the patch entirely.
--
Michael
On 3/5/18 1:06 AM, Michael Paquier wrote:
On Fri, Mar 02, 2018 at 03:41:57PM -0500, David Steele wrote:
On 3/2/18 1:03 PM, Fujii Masao wrote:
On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier <michael@paquier.xyz> wrote:
We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.Sorry for my late reply, I was thinking more about the whole thing.
What about including the backup history file in the base backup instead of
creating it in the standby's pg_wal and archiving it? As the good side effect
of this approach, we can use the backup history file for debugging purpose
even when WAL archiving is not enabled.That's not going to be much helpful for tar'ed base backups as the
backup history file would be at the end of the stream :( For a debug
file, that's not really helpful.I don't think this is a good idea, even if it does solve the race
condition. First, it would be different from the way primary-style
backups generate this file. Second, these files would not be excluded
from future restore / backup cycles so they could build up after a while
and cause confusion. I guess we could teach PG to delete them or
pg_basebackup to ignore them, but that doesn't seem worth the effort.Something I did not consider though is that this causes archive commands
to choke on those identical file names, so any archive command which is
not able to handle that would cause WAL to bloat on the standby. For
this reason I would rather drop the current approach taken. Remember
that the docs tell to use a plain "cp" for example, so this would cause
standbys to go silently down.
That's a good point. The pgBackRest archive command would definitely
not like this.
Something that I would find way more portable though is the addition of
the backup history file in the output of pg_stop_backup(), which would
map as well with what Fujii-san's idea to add this file to the backup
data itself, and do the same for backups taken on a primary. However
this would mean moving the 'c' marker marking the end of the tar stream
within do_pg_stop_backup(), and I am in no way a fan of that: the
handling of the tar stream should be out of reach of the start and stop
phases.
If the file is stored with the backup instead of the archive then I
don't think it adds much. Why not just add stop time and stop LSN to
backup_label and get rid of the history file entirely.
I've been working closely with backup for nearly five years now and I've
need had need to look at a .backup file. Other than writing code to
handle it in archiving, I've barely given it a thought.
Another idea that I have here as well would be to change a bit the
history backup file name so as the backup name is added to it, but that
does not fly high as pg_basebackup uses its own name with spaces
(Yeah!), or even better the PID of the backend taking the backup.
This could work, but seems complicated for little benefit. It would
also require some archive commands to be updated to understand the new
format.
The renaming is more appealing, still not perfect. So my mood of the
day would be to just drop the patch entirely.
I think that's the best idea for now. It doesn't seem worth using
cycles in the last CF coming up with a new approach.
Regards,
--
-David
david@pgmasters.net
On Mon, Mar 5, 2018 at 11:01 PM, David Steele <david@pgmasters.net> wrote:
On 3/5/18 1:06 AM, Michael Paquier wrote:
On Fri, Mar 02, 2018 at 03:41:57PM -0500, David Steele wrote:
On 3/2/18 1:03 PM, Fujii Masao wrote:
On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier <michael@paquier.xyz> wrote:
We would talk about two backups running
simultaneously on a standby, which would overlap with each other to
generate a file aimed only at being helpful for debugging purposes, and
we provide no information now for backups taken from standbys. We could
of course make that logic a bit smarter by checking if there is an
extsing file with the same name and create a new file with a different
name. But is that worth the complication? That's where I am not
convinced, and that's the reason why this patch is doing things this
way.Sorry for my late reply, I was thinking more about the whole thing.
What about including the backup history file in the base backup instead of
creating it in the standby's pg_wal and archiving it? As the good side effect
of this approach, we can use the backup history file for debugging purpose
even when WAL archiving is not enabled.That's not going to be much helpful for tar'ed base backups as the
backup history file would be at the end of the stream :( For a debug
file, that's not really helpful.I don't think this is a good idea, even if it does solve the race
condition. First, it would be different from the way primary-style
backups generate this file. Second, these files would not be excluded
from future restore / backup cycles so they could build up after a while
and cause confusion. I guess we could teach PG to delete them or
pg_basebackup to ignore them, but that doesn't seem worth the effort.Something I did not consider though is that this causes archive commands
to choke on those identical file names, so any archive command which is
not able to handle that would cause WAL to bloat on the standby. For
this reason I would rather drop the current approach taken. Remember
that the docs tell to use a plain "cp" for example, so this would cause
standbys to go silently down.That's a good point. The pgBackRest archive command would definitely
not like this.Something that I would find way more portable though is the addition of
the backup history file in the output of pg_stop_backup(), which would
map as well with what Fujii-san's idea to add this file to the backup
data itself, and do the same for backups taken on a primary. However
this would mean moving the 'c' marker marking the end of the tar stream
within do_pg_stop_backup(), and I am in no way a fan of that: the
handling of the tar stream should be out of reach of the start and stop
phases.If the file is stored with the backup instead of the archive then I
don't think it adds much. Why not just add stop time and stop LSN to
backup_label and get rid of the history file entirely.I've been working closely with backup for nearly five years now and I've
need had need to look at a .backup file. Other than writing code to
handle it in archiving, I've barely given it a thought.Another idea that I have here as well would be to change a bit the
history backup file name so as the backup name is added to it, but that
does not fly high as pg_basebackup uses its own name with spaces
(Yeah!), or even better the PID of the backend taking the backup.This could work, but seems complicated for little benefit. It would
also require some archive commands to be updated to understand the new
format.The renaming is more appealing, still not perfect. So my mood of the
day would be to just drop the patch entirely.I think that's the best idea for now. It doesn't seem worth using
cycles in the last CF coming up with a new approach.
I have no objection to mark the patch "returned with feedback".
Regards,
--
Fujii Masao