Online base backup from the hot-standby
I will provide a patch which can execute pg_start/stop_backup,
addressing the above comments and conditions, in the next stage.
Then please review.
Done.
* Procedure
1. Call pg_start_backup('x') on the standby.
2. Take a backup of the data dir.
3. Call pg_stop_backup() on the standby.
4. Copy the control file on the standby to the backup.
5. Check with pg_controldata that the copied control file was taken while the server was still in hot standby.
-> If the standby is promoted between steps 3 and 4, the backup cannot be recovered, because:
-> "Minimum recovery ending location" in pg_control equals 0/0.
-> The backup-end record is not written.
* Not handled yet
* full_page_writes = off
-> If the primary has "full_page_writes = off", archive recovery may not work
correctly. Therefore the standby may need to check from the WAL whether
"full_page_writes = off" was in effect.
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
2011/8/5 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
I will provide a patch which can execute pg_start/stop_backup,
addressing the above comments and conditions, in the next stage.
Then please review.
Done.
great !
* Procedure
1. Call pg_start_backup('x') on the standby.
2. Take a backup of the data dir.
3. Call pg_stop_backup() on the standby.
4. Copy the control file on the standby to the backup.
5. Check with pg_controldata that the copied control file was taken while the server was still in hot standby.
-> If the standby is promoted between steps 3 and 4, the backup cannot be recovered, because:
-> "Minimum recovery ending location" in pg_control equals 0/0.
-> The backup-end record is not written.
* Not handled yet
* full_page_writes = off
-> If the primary has "full_page_writes = off", archive recovery may not work
correctly. Therefore the standby may need to check from the WAL whether
"full_page_writes = off" was in effect.
Shouldn't having a standby force full_page_writes = on in all cases
(bypassing the configuration)?
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
* Not handled yet
* full_page_writes = off
-> If the primary has "full_page_writes = off", archive recovery may not work
correctly. Therefore the standby may need to check from the WAL whether
"full_page_writes = off" was in effect.
Shouldn't having a standby force full_page_writes = on in all cases
(bypassing the configuration)?
What do you mean?
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
2011/8/15 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
* Not handled yet
* full_page_writes = off
-> If the primary has "full_page_writes = off", archive recovery may not work
correctly. Therefore the standby may need to check from the WAL whether
"full_page_writes = off" was in effect.
Shouldn't having a standby force full_page_writes = on in all cases
(bypassing the configuration)?
What do you mean?
Yeah. full_page_writes is a WAL generation parameter. Standbys don't
generate WAL. I think you just have to insist that the master has it
on.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Not handled yet
* full_page_writes = off
-> If the primary has "full_page_writes = off", archive recovery may not work
correctly. Therefore the standby may need to check from the WAL whether
"full_page_writes = off" was in effect.
Shouldn't having a standby force full_page_writes = on in all cases
(bypassing the configuration)?
What do you mean?
Thanks.
This raises the following two problems.
* pg_start_backup() would have to set full_page_writes to 'on' on the master that
is actually writing the WAL, not on the standby.
* The standby is not necessarily connected to the master that is actually writing
the WAL.
(Ex. Standby2 in Cascade Replication: Master - Standby1 - Standby2)
I'm not sure how to resolve these problems.
Regards.
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
On 11-08-16 02:09 AM, Jun Ishiduka wrote:
Thanks.
This raises the following two problems.
* pg_start_backup() would have to set full_page_writes to 'on' on the master that
is actually writing the WAL, not on the standby.
Is there any way to tell from the WAL segments if they contain the full
page data? If so could you verify this on the second slave when it is
brought up? Or can you track this on the first slave and produce an
error in either pg_start_backup or pg_stop_backup()?
I see in xlog.h XLR_BKP_REMOVABLE, the comment above it says that this
flag is used to indicate that the archiver can compress the full page
blocks to non-full page blocks. I am not familiar with where in the code
this actually happens but will this cause issues if the first standby is
processing WAL files from the archive?
Is there any way to tell from the WAL segments if they contain the full
page data? If so could you verify this on the second slave when it is
brought up? Or can you track this on the first slave and produce an
error in either pg_start_backup or pg_stop_backup()?
Sure.
I will write a patch that determines from the WAL segments whether they
contain full-page data.
I see in xlog.h XLR_BKP_REMOVABLE, the comment above it says that this
flag is used to indicate that the archiver can compress the full page
blocks to non-full page blocks. I am not familiar with where in the code
this actually happens but will this cause issues if the first standby is
processing WAL files from the archive?
I checked the flag in xlog.c; it seems to be set only in
XLogInsert(). I will consider whether it can be used.
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
2011/8/17 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
I see in xlog.h XLR_BKP_REMOVABLE, the comment above it says that this
flag is used to indicate that the archiver can compress the full page
blocks to non-full page blocks. I am not familiar with where in the code
this actually happens but will this cause issues if the first standby is
processing WAL files from the archive?
I checked the flag in xlog.c; it seems to be set only in
XLogInsert(). I will consider whether it can be used.
That flag cannot be used to check whether full-page writes were
skipped or not,
because it is set only on records carrying full-page data, not on records without it.
The straightforward approach to address the problem you raised is to log
the change of full_page_writes on the master. Since such a WAL record is also
replicated to the standby, the standby can know whether full_page_writes is
enabled or not in the master, from the WAL record. If it's disabled,
pg_start_backup() in the standby should emit an error and refuse standby-only
backup. If the WAL record indicating that full_page_writes was disabled
on the master arrives during standby-only backup, the standby should cancel
the backup.
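To illustrate the bookkeeping this would need on the standby, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: the class name StandbyBackupState, its methods, and the idea of a dedicated "FPW changed" record are assumptions made only for this example.

```python
from dataclasses import dataclass


class BackupError(Exception):
    """Raised when a standby-only backup must be refused or cancelled."""


@dataclass
class StandbyBackupState:
    # Last full_page_writes value of the master, as learned from replayed WAL.
    master_fpw: bool = True
    backup_in_progress: bool = False
    backup_cancelled: bool = False

    def replay_fpw_change(self, enabled: bool) -> None:
        """Replay a (hypothetical) WAL record announcing an FPW change."""
        self.master_fpw = enabled
        if self.backup_in_progress and not enabled:
            # FPW went off while a backup was running: mark it unusable.
            # Turning FPW back on later does not clear this flag.
            self.backup_cancelled = True

    def start_backup(self) -> None:
        if not self.master_fpw:
            raise BackupError("full_page_writes is off on the master")
        self.backup_in_progress = True
        self.backup_cancelled = False

    def stop_backup(self) -> None:
        if not self.backup_in_progress:
            raise BackupError("no backup in progress")
        self.backup_in_progress = False
        if self.backup_cancelled:
            raise BackupError("full_page_writes was disabled during the backup")
```

Because the cancellation flag is sticky, an off-then-on toggle in the middle of a backup still makes stop_backup() fail, even though FPW looks enabled at both ends.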
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Aug 17, 2011 at 6:19 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
2011/8/17 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
I see in xlog.h XLR_BKP_REMOVABLE, the comment above it says that this
flag is used to indicate that the archiver can compress the full page
blocks to non-full page blocks. I am not familiar with where in the code
this actually happens but will this cause issues if the first standby is
processing WAL files from the archive?
I checked the flag in xlog.c; it seems to be set only in
XLogInsert(). I will consider whether it can be used.
That flag cannot be used to check whether full-page writes were
skipped or not,
because it is set only on records carrying full-page data, not on records without it.
The straightforward approach to address the problem you raised is to log
the change of full_page_writes on the master. Since such a WAL record is also
replicated to the standby, the standby can know whether full_page_writes is
enabled or not in the master, from the WAL record. If it's disabled,
pg_start_backup() in the standby should emit an error and refuse standby-only
backup. If the WAL record indicating that full_page_writes was disabled
on the master arrives during standby-only backup, the standby should cancel
the backup.
Seems like something we could add to XLOG_PARAMETER_CHANGE fairly easily.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Aug 17, 2011 at 9:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Aug 17, 2011 at 6:19 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
The straightforward approach to address the problem you raised is to log
the change of full_page_writes on the master. Since such a WAL record is also
replicated to the standby, the standby can know whether full_page_writes is
enabled or not in the master, from the WAL record. If it's disabled,
pg_start_backup() in the standby should emit an error and refuse standby-only
backup. If the WAL record indicating that full_page_writes was disabled
on the master arrives during standby-only backup, the standby should cancel
the backup.
Seems like something we could add to XLOG_PARAMETER_CHANGE fairly easily.
I'm afraid it's not so easy. Since fpw can be changed by SIGHUP, it's not
easy to ensure that the change of fpw is logged before the behavior
actually changes. Probably we need to make the backend that first detects
the change of fpw log it.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Aug 17, 2011 at 9:53 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Aug 17, 2011 at 9:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Aug 17, 2011 at 6:19 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
The straightforward approach to address the problem you raised is to log
the change of full_page_writes on the master. Since such a WAL record is also
replicated to the standby, the standby can know whether full_page_writes is
enabled or not in the master, from the WAL record. If it's disabled,
pg_start_backup() in the standby should emit an error and refuse standby-only
backup. If the WAL record indicating that full_page_writes was disabled
on the master arrives during standby-only backup, the standby should cancel
the backup.
Seems like something we could add to XLOG_PARAMETER_CHANGE fairly easily.
I'm afraid it's not so easy. Since fpw can be changed by SIGHUP, it's not
easy to ensure that the change of fpw is logged before the behavior
actually changes. Probably we need to make the backend that first detects
the change of fpw log it.
Ugh, you're right. But then you might have problems if the state
changes again before all backends have picked up the previous change.
What I've thought about before is making one backend (say, bgwriter)
store its latest value in shared memory, protected by some lock that
would already be held at the time the value is needed. Everyone else
uses the shared memory copy instead of relying on their local value.
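A rough sketch of that idea in Python, using threads in place of backends. This is only an illustration: the class SharedFullPageWrites, its method names, and the change_log list (standing in for WAL records) are all made up here, and the sketch uses its own lock for simplicity rather than reusing a lock that is already held.

```python
import threading


class SharedFullPageWrites:
    """One authoritative copy of full_page_writes, shared by all workers."""

    def __init__(self, initial: bool) -> None:
        self._lock = threading.Lock()
        self._value = initial
        # Stand-in for WAL records describing each change (assumption).
        self.change_log = []

    def publish(self, new_value: bool) -> None:
        """Called only by the single designated process after a reload."""
        with self._lock:
            if new_value != self._value:
                # Log first, then flip the value, both under the lock, so
                # no reader can act on a value that has not been logged.
                self.change_log.append(new_value)
                self._value = new_value

    def read(self) -> bool:
        """Called by everyone else instead of consulting a local copy."""
        with self._lock:
            return self._value
```

Since only one writer publishes and every reader takes the same lock, two quick changes in a row are logged in order and no worker can keep running with a stale local setting.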
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Aug 18, 2011 at 12:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Ugh, you're right. But then you might have problems if the state
changes again before all backends have picked up the previous change.
Right.
What I've thought about before is making one backend (say, bgwriter)
store its latest value in shared memory, protected by some lock that
would already be held at the time the value is needed. Everyone else
uses the shared memory copy instead of relying on their local value.
Sounds reasonable.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
2011/8/5 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
* Procedure
1. Call pg_start_backup('x') on the standby.
2. Take a backup of the data dir.
3. Call pg_stop_backup() on the standby.
4. Copy the control file on the standby to the backup.
5. Check with pg_controldata that the copied control file was taken while the server was still in hot standby.
-> If the standby is promoted between steps 3 and 4, the backup cannot be recovered, because:
-> "Minimum recovery ending location" in pg_control equals 0/0.
-> The backup-end record is not written.
What if we do #4 before #3? Would the backup get corrupted? My guess is
that the backup is still valid even if we copy pg_control before executing
pg_stop_backup(). That would make #5 unnecessary, because if the standby
promotion happens before pg_stop_backup(), pg_stop_backup() can
detect that status change and cancel the backup.
#5 looks fragile. If we can get rid of it, the procedure becomes more
robust, I think.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
* Procedure
1. Call pg_start_backup('x') on the standby.
2. Take a backup of the data dir.
3. Call pg_stop_backup() on the standby.
4. Copy the control file on the standby to the backup.
5. Check with pg_controldata that the copied control file was taken while the server was still in hot standby.
-> If the standby is promoted between steps 3 and 4, the backup cannot be recovered, because:
-> "Minimum recovery ending location" in pg_control equals 0/0.
-> The backup-end record is not written.
What if we do #4 before #3? Would the backup get corrupted? My guess is
that the backup is still valid even if we copy pg_control before executing
pg_stop_backup(). That would make #5 unnecessary, because if the standby
promotion happens before pg_stop_backup(), pg_stop_backup() can
detect that status change and cancel the backup.
#5 looks fragile. If we can get rid of it, the procedure becomes more
robust, I think.
Sure, you're right.
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
Hi, I created a patch in response to the comments.
* Procedure
1. Call pg_start_backup('x') on the hot standby.
2. Take a backup of the data dir.
3. Copy the control file on the hot standby to the backup.
4. Call pg_stop_backup() on the hot standby.
* Behavior
(take backup)
When pg_start_backup() is executed on the hot standby, it performs a restartpoint,
writes the string "FROM: slave" into backup_label, and changes the backup mode,
but it does not force full_page_writes to "on".
When pg_stop_backup() is executed on the hot standby, it renames backup_label
and changes the backup mode, but it neither writes a backup-end record and a history
file nor waits for WAL archiving to complete.
pg_stop_backup() returns the current MinRecoveryPoint as its result.
When pg_stop_backup() is executed on a server that has been promoted, an error
message is output, based on what is read from backup_label.
(recovery)
When recovering from a backup taken on the hot standby, the MinRecoveryPoint in
the control file copied in step 3 of the procedure above is used instead of the
backup-end record.
When recovery starts for the first time, BackupEndPoint in the control file is set
to the same value as MinRecoveryPoint. This is for remembering the value of
MinRecoveryPoint during recovery.
The HINT message ("If this has ...") is always output when recovering from a
backup taken on the hot standby.
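As a rough illustration of the recovery-side check described above, here is a small Python sketch (not PostgreSQL code). The function names are made up for this example, and it assumes an LSN printed as "hi/lo" in hexadecimal can be compared as a single 64-bit number.

```python
def parse_lsn(text: str) -> int:
    """Turn an LSN such as '0/3000020' into one comparable integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def reached_backup_end(replayed_lsn: str, min_recovery_point: str) -> bool:
    """True once replay has passed the MinRecoveryPoint saved in the backup.

    min_recovery_point is the value from the pg_control file copied into
    the backup; it plays the role of the missing backup-end record.
    """
    end = parse_lsn(min_recovery_point)
    if end == 0:
        # 0/0 means the control file was copied after promotion, so there
        # is no usable backup end location.
        raise ValueError("control file has no minimum recovery point")
    return parse_lsn(replayed_lsn) >= end
```

For example, reached_backup_end("0/2FFFFFF", "0/3000020") is false, and it becomes true once replay reaches "0/3000020" or beyond.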
* Problem
The full_page_writes problem.
This raises the following two problems.
* pg_start_backup() would have to set full_page_writes to 'on' on the master that
is actually writing the WAL, not on the standby.
* The standby is not necessarily connected to the master that is actually writing
the WAL.
(Ex. Standby2 in Cascade Replication: Master - Standby1 - Standby2)
I'm not sure how to resolve these problems.
Status: Considering
(Latest: http://archives.postgresql.org/pgsql-hackers/2011-08/msg00880.php)
Regards.
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
Attachments:
standby_online_backup_06.patch (application/octet-stream; +5308 -111)
Updated the patch.
Changes:
* full_page_writes must be set to 'on' by the user (documented)
* read "FROM: XX" from backup_label (in xlog.c)
* check the server status when pg_stop_backup is executed (in xlog.c)
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------
Attachments:
standby_online_backup_07.patch (application/octet-stream; +261 -111)
2011/9/13 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
Updated the patch.
Changes:
* full_page_writes must be set to 'on' by the user (documented)
* read "FROM: XX" from backup_label (in xlog.c)
* check the server status when pg_stop_backup is executed (in xlog.c)
Thanks for updating the patch.
Before reviewing the patch, to encourage people to comment on and
review it, let me explain what this patch provides:
This patch provides the capability to take a base backup during recovery,
i.e., from the standby server. This is a very useful feature for offloading the
expense of periodic backups from the master. The backup procedure is
similar to that during normal running, but slightly different:
1. Execute pg_start_backup on the standby. To execute a query on the
standby, hot standby must be enabled.
2. Perform a file system backup on the standby.
3. Copy the pg_control file from the cluster directory on the standby to
the backup as follows:
cp $PGDATA/global/pg_control /mnt/server/backupdir/global
4. Execute pg_stop_backup on the standby.
The backup taken by the above procedure can be used for archive
recovery or for a standby server.
If the standby is promoted during a backup, pg_stop_backup() detects
the change of the server status and fails. The data backed up before the
promotion is invalid and not available for recovery.
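The promotion check can be pictured with a toy model in Python. This is only a sketch, not the patch itself: the class ToyStandby and its methods are hypothetical, and the "master" origin value is an assumption (only "FROM: slave" appears in the patch description).

```python
class BackupError(Exception):
    """Raised when the backup cannot be completed."""


class ToyStandby:
    def __init__(self) -> None:
        self.in_recovery = True
        self.backup_label = None

    def start_backup(self, label: str) -> None:
        # Remember where the backup was started, like "FROM: slave".
        origin = "slave" if self.in_recovery else "master"
        self.backup_label = {"LABEL": label, "FROM": origin}

    def promote(self) -> None:
        self.in_recovery = False

    def stop_backup(self) -> str:
        if self.backup_label is None:
            raise BackupError("no backup in progress")
        label, self.backup_label = self.backup_label, None
        if label["FROM"] == "slave" and not self.in_recovery:
            # Started on a standby, but the server is no longer one.
            raise BackupError("server was promoted during the backup")
        return label["LABEL"]
```

If promote() is called between start_backup() and stop_backup(), the latter raises BackupError, which matches the rule that data backed up before the promotion must not be used.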
Taking a backup from the standby by using pg_basebackup is still not
possible. But we can relax that restriction after applying this patch.
To take a base backup during recovery safely, several parameters
must be set properly. Hot standby must be enabled on the standby, i.e.,
wal_level and hot_standby must be set appropriately on the master and the standby,
respectively. FPW (full page writes) is required for a base backup,
so full_page_writes must be enabled on the master.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Sep 21, 2011 at 04:50, Fujii Masao <masao.fujii@gmail.com> wrote:
2011/9/13 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
Updated the patch.
Changes:
* full_page_writes must be set to 'on' by the user (documented)
* read "FROM: XX" from backup_label (in xlog.c)
* check the server status when pg_stop_backup is executed (in xlog.c)
Thanks for updating the patch.
Before reviewing the patch, to encourage people to comment on and
review it, let me explain what this patch provides:
This patch provides the capability to take a base backup during recovery,
i.e., from the standby server. This is a very useful feature for offloading the
expense of periodic backups from the master. The backup procedure is
similar to that during normal running, but slightly different:
1. Execute pg_start_backup on the standby. To execute a query on the
standby, hot standby must be enabled.
2. Perform a file system backup on the standby.
3. Copy the pg_control file from the cluster directory on the standby to
the backup as follows:
cp $PGDATA/global/pg_control /mnt/server/backupdir/global
But this is done as part of step 2 already. I assume what this really
means is that the pg_control file must be the last file backed up?
(Since there are certainly a lot of other ways to do the backup than just
cp to a mounted directory..)
4. Execute pg_stop_backup on the standby.
The backup taken by the above procedure is available for an archive
recovery or standby server.
If the standby is promoted during a backup, pg_stop_backup() detects
the change of the server status and fails. The data backed up before the
promotion is invalid and not available for recovery.
Taking a backup from the standby by using pg_basebackup is still not
possible. But we can relax that restriction after applying this patch.
I think that this is going to be very important, particularly given
the requirements on pt 3 above. (But yes, it certainly doesn't have to
be done as part of this patch, but it really should be the plan to
have this included in the same version)
To take a base backup during recovery safely, several parameters
must be set properly. Hot standby must be enabled on the standby, i.e.,
wal_level and hot_standby must be set appropriately on the master and the standby,
respectively. FPW (full page writes) is required for a base backup,
so full_page_writes must be enabled on the master.
Presumably pg_start_backup() will check this. And we'll somehow track
this before pg_stop_backup() as well? (for evil things such as
the user changing FPW from on to off and then back to on again during
a backup, which will make it look correct both during start and stop,
but incorrect in the middle - pg_stop_backup needs to fail in that
case as well)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Sep 21, 2011 at 04:50, Fujii Masao <masao.fujii@gmail.com> wrote:
3. Copy the pg_control file from the cluster directory on the standby to
the backup as follows:
cp $PGDATA/global/pg_control /mnt/server/backupdir/global
But this is done as part of step 2 already. I assume what this really
means is that the pg_control file must be the last file backed up?
Yes.
When we perform an archive recovery from a backup taken during
normal processing, we get the backup end location from the backup-end
WAL record which was written by pg_stop_backup(). But since no WAL
writing is allowed during recovery, pg_stop_backup() on the standby
cannot write a backup-end WAL record. So, in his patch, instead of
a backup-end WAL record, the startup process uses the minimum
recovery point recorded in the pg_control that was included in the
backup as the backup end location. BTW, the backup end location is
used to check whether recovery has reached a consistent state
(i.e., end-of-backup).
To use the minimum recovery point in pg_control as the backup end
location safely, pg_control must be backed up last. Otherwise, a data
page with an LSN newer than the minimum recovery point
might be included in the backup.
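One way to enforce the "pg_control last" ordering in a backup script is sketched below in Python. It is only an illustration of the ordering: the function names are made up, and a real backup would also handle exclusions (for example lock and WAL files), empty directories, and error cases, which are omitted here.

```python
import os
import shutil

# Path of the control file relative to the data directory.
CONTROL_FILE = os.path.join("global", "pg_control")


def _copy_one(src_dir: str, dst_dir: str, rel_path: str) -> None:
    """Copy one file, creating parent directories in the target as needed."""
    target = os.path.join(dst_dir, rel_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copy2(os.path.join(src_dir, rel_path), target)


def copy_data_dir(src_dir: str, dst_dir: str) -> list:
    """Copy every file, deferring pg_control until all others are done.

    Returns the relative paths in the order they were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            rel_path = os.path.relpath(os.path.join(root, name), src_dir)
            if rel_path == CONTROL_FILE:
                continue  # skipped for now, copied at the very end
            _copy_one(src_dir, dst_dir, rel_path)
            copied.append(rel_path)
    # pg_control goes last, so its minimum recovery point is not older
    # than any data page already copied.
    _copy_one(src_dir, dst_dir, CONTROL_FILE)
    copied.append(CONTROL_FILE)
    return copied
```

Calling copy_data_dir(pgdata, backupdir) between pg_start_backup() and pg_stop_backup() would then cover steps 2 and 3 of the procedure in one pass.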
(Since there are certainly a lot of other ways to do the backup than just
cp to a mounted directory..)
Yes. The above command I described is just an example.
4. Execute pg_stop_backup on the standby.
The backup taken by the above procedure is available for an archive
recovery or standby server.
If the standby is promoted during a backup, pg_stop_backup() detects
the change of the server status and fails. The data backed up before the
promotion is invalid and not available for recovery.
Taking a backup from the standby by using pg_basebackup is still not
possible. But we can relax that restriction after applying this patch.
I think that this is going to be very important, particularly given
the requirements on pt 3 above. (But yes, it certainly doesn't have to
be done as part of this patch, but it really should be the plan to
have this included in the same version)
Agreed.
To take a base backup during recovery safely, several parameters
must be set properly. Hot standby must be enabled on the standby, i.e.,
wal_level and hot_standby must be set appropriately on the master and the standby,
respectively. FPW (full page writes) is required for a base backup,
so full_page_writes must be enabled on the master.
Presumably pg_start_backup() will check this. And we'll somehow track
this before pg_stop_backup() as well? (for evil things such as
the user changing FPW from on to off and then back to on again during
a backup, which will make it look correct both during start and stop,
but incorrect in the middle - pg_stop_backup needs to fail in that
case as well)
Right. As I suggested upthread, to address that problem, we need to log
the change of FPW on the master, and then we need to check whether
such a WAL record is replayed on the standby during the backup. If so,
pg_stop_backup() should emit an error.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center