Stronger safeguard for archive recovery not to miss data
Hello
The attached patch is intended to prevent a scenario in which
archive recovery encounters WAL generated with wal_level=minimal
and the server continues to work, which was discussed in the thread of [1].
The motivation is to prevent the user from ending up with a replica
that could be missing data, or with a server that misses data in targeted recovery mode.
We reached a consensus in the thread on how to fix this:
change the ereport level from WARNING to FATAL in CheckRequiredParameterValues().
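For reference, the change amounts to a one-word severity bump in the existing check. The following is an illustrative sketch modeled on the check in CheckRequiredParameterValues() (xlog.c), not the exact patch hunk:

```c
/* In CheckRequiredParameterValues(), during archive recovery: */
if (ArchiveRecoveryRequested && ControlFile->wal_level == WAL_LEVEL_MINIMAL)
    ereport(FATAL,      /* previously WARNING, which let recovery continue */
            (errmsg("WAL was generated with wal_level=minimal, data may be missing"),
             errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));
```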
In order to test this fix, what I did is
1 - get a base backup during wal_level is replica
2 - stop the server and change the wal_level from replica to minimal
3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
4 - stop the server and make the wal_level back to replica
5 - start the server again
6 - execute archive recoveries in both cases
(1) by editing the postgresql.conf and
touching recovery.signal in the base backup from the first step
(2) by making a replica with standby.signal
* During wal_level is replica, I enabled archive_mode in this test.
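As a rough outline, the steps above could be run as follows. This is a non-executable sketch that assumes a local installation; $PGDATA, $BACKUP, and the exact configuration edits are placeholders (note that starting with wal_level=minimal also requires archive_mode=off and max_wal_senders=0):

```shell
# 1. base backup while wal_level=replica (archive_mode=on)
pg_basebackup -D "$BACKUP" -h localhost -p 5432

# 2-3. switch to minimal and restart to emit XLOG_PARAMETER_CHANGE
pg_ctl -D "$PGDATA" stop
# edit postgresql.conf: wal_level = minimal, archive_mode = off, max_wal_senders = 0
pg_ctl -D "$PGDATA" start

# 4-5. switch back to replica and restart
pg_ctl -D "$PGDATA" stop
# edit postgresql.conf: wal_level = replica, archive_mode = on
pg_ctl -D "$PGDATA" start

# 6-(1). archive recovery from the base backup
touch "$BACKUP/recovery.signal"
# set restore_command (and hot_standby = off) in $BACKUP/postgresql.conf
pg_ctl -D "$BACKUP" start    # with the patch, this now fails with FATAL

# 6-(2). or standby mode instead
touch "$BACKUP/standby.signal"
```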
First of all, I confirmed the server started up without this patch.
After applying this safeguard patch, I confirmed that
the server can no longer start up in this scenario.
The log shows the following with the patch applied:
2020-11-26 06:49:46.003 UTC [19715] FATAL: WAL was generated with wal_level=minimal, data may be missing
2020-11-26 06:49:46.003 UTC [19715] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
Lastly, this should be backpatched.
Any comments ?
[1]: /messages/by-id/TYAPR01MB29901EBE5A3ACCE55BA99186FE320@TYAPR01MB2990.jpnprd01.prod.outlook.com
Best,
Takamichi Osumi
Attachments:
stronger_safeguard_for_archive_recovery.patch (application/octet-stream)
At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> wrote in
Hello
The attached patch is intended to prevent a scenario in which
archive recovery encounters WAL generated with wal_level=minimal
and the server continues to work, which was discussed in the thread of [1].
The motivation is to prevent the user from ending up with a replica
that could be missing data, or with a server that misses data in targeted recovery mode.
We reached a consensus in the thread on how to fix this:
change the ereport level from WARNING to FATAL in CheckRequiredParameterValues().
In order to test this fix, what I did is
1 - get a base backup during wal_level is replica
2 - stop the server and change the wal_level from replica to minimal
3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
4 - stop the server and make the wal_level back to replica
5 - start the server again
6 - execute archive recoveries in both cases
(1) by editing the postgresql.conf and
touching recovery.signal in the base backup from the first step
(2) by making a replica with standby.signal
* During wal_level is replica, I enabled archive_mode in this test.
First of all, I confirmed the server started up without this patch.
After applying this safeguard patch, I checked that
the server cannot start up any more in the scenario case.
I checked the log and got the result below with this patch.
2020-11-26 06:49:46.003 UTC [19715] FATAL: WAL was generated with wal_level=minimal, data may be missing
2020-11-26 06:49:46.003 UTC [19715] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
Lastly, this should be backpatched.
Any comments ?
Perhaps we need the TAP test that conducts the above steps.
[1]
/messages/by-id/TYAPR01MB29901EBE5A3ACCE55BA99186FE320@TYAPR01MB2990.jpnprd01.prod.outlook.com
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello
First of all, I confirmed the server started up without this patch.
It is possible only with manually configured hot_standby = off, right?
We have ERROR in hot standby mode just below in CheckRequiredParameterValues.
regards, Sergei
Hello, Horiguchi-San
On Thursday, Nov 26, 2020 4:29 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takamichi@fujitsu.com"
<osumi.takamichi@fujitsu.com> wrote in
In order to test this fix, what I did is
1 - get a base backup during wal_level is replica
2 - stop the server and change the wal_level from replica to minimal
3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
4 - stop the server and make the wal_level back to replica
5 - start the server again
6 - execute archive recoveries in both cases
(1) by editing the postgresql.conf and
touching recovery.signal in the base backup from the first step
(2) by making a replica with standby.signal
* During wal_level is replica, I enabled archive_mode in this test.
Perhaps we need the TAP test that conducts the above steps.
It really makes sense that it's better to show the procedure
for reproducing the flow.
Thanks for your advice.
Best,
Takamichi Osumi
Hello, Sergei
It is possible only with manually configured hot_standby = off, right?
We have ERROR in hot standby mode just below in
CheckRequiredParameterValues.
Yes, you are right. I turned off hot_standby in the test in the previous e-mail.
I'm sorry I omitted that piece of information.
Best,
Takamichi Osumi
Hi
On Thursday, November 26, 2020 4:29 PM
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takamichi@fujitsu.com"
<osumi.takamichi@fujitsu.com> wrote in
The attached patch is intended to prevent a scenario in which archive
recovery encounters WAL generated with wal_level=minimal and the server
continues to work, which was discussed in the thread of [1].
The motivation is to prevent the user from ending up with a
replica that could be missing data, or with a server that misses data in targeted recovery mode.
We reached a consensus in the thread on how to fix this:
change the ereport level from WARNING to FATAL in CheckRequiredParameterValues().
In order to test this fix, what I did is
1 - get a base backup during wal_level is replica
2 - stop the server and change the wal_level from replica to minimal
3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
4 - stop the server and make the wal_level back to replica
5 - start the server again
6 - execute archive recoveries in both cases
(1) by editing the postgresql.conf and
touching recovery.signal in the base backup from the first step
(2) by making a replica with standby.signal
* During wal_level is replica, I enabled archive_mode in this test.
First of all, I confirmed the server started up without this patch.
After applying this safeguard patch, I checked that the server cannot
start up any more in the scenario case.
I checked the log and got the result below with this patch.
2020-11-26 06:49:46.003 UTC [19715] FATAL: WAL was generated with wal_level=minimal, data may be missing
2020-11-26 06:49:46.003 UTC [19715] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
Lastly, this should be backpatched.
Any comments ?
Perhaps we need the TAP test that conducts the above steps.
I added the TAP tests to reproduce and share the result,
using the case of 6-(1) described above.
Here, I created a new file for it because the purposes of the other files in
src/test/recovery didn't match the purpose of my TAP tests perfectly.
If you are dubious about this idea, please have a look at the comments
in each file.
When the attached patch is applied,
my TAP tests are executed like the existing ones, as below:
t/018_wal_optimize.pl ................ ok
t/019_replslot_limit.pl .............. ok
t/020_archive_status.pl .............. ok
t/021_row_visibility.pl .............. ok
t/022_archive_recovery.pl ............ ok
All tests successful.
Also, I confirmed that there's no regression by make check-world.
Any comments ?
Best,
Takamichi Osumi
Attachments:
stronger_safeguard_for_archive_recovery_v02.patch (application/octet-stream)
On Tue, 2020-12-08 at 03:08 +0000, osumi.takamichi@fujitsu.com wrote:
On Thursday, November 26, 2020 4:29 PM
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takamichi@fujitsu.com"
<osumi.takamichi@fujitsu.com> wrote in
The attached patch is intended to prevent a scenario in which archive
recovery encounters WAL generated with wal_level=minimal and the server
continues to work, which was discussed in the thread of [1].
Perhaps we need the TAP test that conducts the above steps.
I added the TAP tests to reproduce and share the result,
using the case of 6-(1) described above.
Here, I created a new file for it because the purposes of other files in
src/recovery didn't match the purpose of my TAP tests perfectly.
If you are dubious about this idea, please have a look at the comments
in each file.
When the attached patch is applied,
my TAP tests are executed like other ones like below.
t/018_wal_optimize.pl ................ ok
t/019_replslot_limit.pl .............. ok
t/020_archive_status.pl .............. ok
t/021_row_visibility.pl .............. ok
t/022_archive_recovery.pl ............ ok
All tests successful.
Also, I confirmed that there's no regression by make check-world.
Any comments ?
The patch applies and passes regression tests, as well as the new TAP test.
I think this should be backpatched, since it fixes a bug.
I am not quite happy with the message:
FATAL: WAL was generated with wal_level=minimal, data may be missing
HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
This sounds too harmless to me and doesn't give the user a clue
what would be the best way to proceed.
Suggestion:
FATAL: WAL was generated with wal_level=minimal, cannot continue recovering
DETAIL: This happens if you temporarily set wal_level=minimal on the primary server.
HINT: Create a new standby from a new base backup after setting wal_level=replica.
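For illustration, the suggested wording maps onto the standard ereport auxiliary functions (errmsg/errdetail/errhint). A hedged sketch of how the call might look, not the actual patch text:

```c
ereport(FATAL,
        (errmsg("WAL was generated with wal_level=minimal, cannot continue recovering"),
         errdetail("This happens if you temporarily set wal_level=minimal on the primary server."),
         errhint("Create a new standby from a new base backup after setting wal_level=replica.")));
```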
Yours,
Laurenz Albe
Hi, Laurenz
On Friday, January 15, 2021 12:56 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2020-12-08 at 03:08 +0000, osumi.takamichi@fujitsu.com wrote:
On Thursday, November 26, 2020 4:29 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takamichi@fujitsu.com"
<osumi.takamichi@fujitsu.com> wrote in
The attached patch is intended to prevent a scenario in which archive
recovery encounters WAL generated with wal_level=minimal and the
server continues to work, which was discussed in the thread of [1].
Perhaps we need the TAP test that conducts the above steps.
I added the TAP tests to reproduce and share the result, using the
case of 6-(1) described above.
Here, I created a new file for it because the purposes of other files
in src/recovery didn't match the purpose of my TAP tests perfectly.
If you are dubious about this idea, please have a look at the comments
in each file.
When the attached patch is applied,
my TAP tests are executed like other ones like below.
t/018_wal_optimize.pl ................ ok
t/019_replslot_limit.pl .............. ok
t/020_archive_status.pl .............. ok
t/021_row_visibility.pl .............. ok
t/022_archive_recovery.pl ............ ok
All tests successful.
Also, I confirmed that there's no regression by make check-world.
Any comments ?
The patch applies and passes regression tests, as well as the new TAP test.
Thank you for checking.
I think this should be backpatched, since it fixes a bug.
Agreed.
I am not quite happy with the message:
FATAL: WAL was generated with wal_level=minimal, data may be missing
HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
This sounds too harmless to me and doesn't give the user a clue what would be the best way to proceed.
Suggestion:
FATAL: WAL was generated with wal_level=minimal, cannot continue recovering
Adopted.
DETAIL: This happens if you temporarily set wal_level=minimal on the primary server.
HINT: Create a new standby from a new base backup after setting wal_level=replica.
Thanks for your suggestion.
I noticed that this message should cover both archive recovery modes,
that is, targeted recovery mode and standby mode. So I combined your
suggestion above with this point of view; have a look at the updated patch.
I also extended the new TAP tests to show this perspective.
Best Regards,
Takamichi Osumi
Attachments:
stronger_safeguard_for_archive_recovery_v03.patch (application/octet-stream)
On Mon, 2021-01-18 at 07:34 +0000, osumi.takamichi@fujitsu.com wrote:
I noticed that this message should cover both archive recovery modes,
which means in recovery mode and standby mode. Then, I combined your
suggestion above with this point of view. Have a look at the updated patch.
I also enriched the new tap tests to show this perspective.
Looks good, thanks.
I'll mark this patch as "ready for committer".
Yours,
Laurenz Albe
On 2021/01/20 1:05, Laurenz Albe wrote:
On Mon, 2021-01-18 at 07:34 +0000, osumi.takamichi@fujitsu.com wrote:
I noticed that this message should cover both archive recovery modes,
which means in recovery mode and standby mode. Then, I combined your
suggestion above with this point of view. Have a look at the updated patch.
I also enriched the new tap tests to show this perspective.
Looks good, thanks.
I'll mark this patch as "ready for committer".
+ errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));
Isn't it impossible to do this in the normal archive recovery case? In that case,
since the server crashed and the database got corrupted, probably
we cannot take a new base backup.
Originally, even when users accidentally set wal_level to minimal, they could
start the server from the backup by disabling hot_standby and salvage the data.
But with the patch, we lose that option. Right? I was wondering whether
WARNING was used there intentionally for that case.
if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
ereport(ERROR,
(errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
With the patch, we never reach the above code?
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
On Wed, 2021-01-20 at 13:10 +0900, Fujii Masao wrote:
+ errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));
Isn't it impossible to do this in normal archive recovery case? In that case,
since the server crashed and the database got corrupted, probably
we cannot take a new base backup.
Originally even when users accidentally set wal_level to minimal, they could
start the server from the backup by disabling hot_standby and salvage the data.
But with the patch, we lost the way to do that. Right? I was wondering that
WARNING was used intentionally there for that case.
I would argue that if you set your "wal_level" to minimal, do some work,
reset it to "replica" and recover past that, two things could happen:
1. Since "archive_mode" was off, you are missing some WAL segments and
cannot recover past that point anyway.
2. You are missing some relevant WAL records, and your recovered
database is corrupted. This is very likely, because you probably
switched to "minimal" with the intention to generate less WAL.
Now I see your point that a corrupted database may be better than no
database at all, but wouldn't you agree that a warning in the log
(that nobody reads) is too little information?
Normally we don't take such a relaxed attitude towards data corruption.
Perhaps there could be a GUC "recovery_allow_data_corruption" to
override this check, but I'd say that a warning is too little.
if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
ereport(ERROR,
(errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
With the patch, we never reach the above code?
Right, that would have to go. I didn't notice that.
Yours,
Laurenz Albe
Hi
On Wed, Jan 20, 2021 9:03 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2021-01-20 at 13:10 +0900, Fujii Masao wrote:
+ errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));
Isn't it impossible to do this in normal archive recovery case? In
that case, since the server crashed and the database got corrupted,
probably we cannot take a new base backup.
Originally even when users accidentally set wal_level to minimal, they
could start the server from the backup by disabling hot_standby and salvage the data.
But with the patch, we lost the way to do that. Right? I was wondering
that WARNING was used intentionally there for that case.
OK. I understand that this WARNING is necessary to give the user a chance
to start up the server again and salvage their data
when hot_standby=off and the server goes through a period of wal_level=minimal
during archive recovery.
I would argue that if you set your "wal_level" to minimal, do some work, reset it
to "replica" and recover past that, two things could happen:
1. Since "archive_mode" was off, you are missing some WAL segments and
cannot recover past that point anyway.
2. You are missing some relevant WAL records, and your recovered
database is corrupted. This is very likely, because you probably
switched to "minimal" with the intention to generate less WAL.
Now I see your point that a corrupted database may be better than no database
at all, but wouldn't you agree that a warning in the log (that nobody reads) is
too little information?
Normally we don't take such a relaxed attitude towards data corruption.
Yeah, we thought we needed stronger protection for that reason.
Perhaps there could be a GUC "recovery_allow_data_corruption" to override this
check, but I'd say that a warning is too little.
if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
ereport(ERROR,
(errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
With the patch, we never reach the above code?
Right, that would have to go. I didn't notice that.
Adding a condition to check whether "recovery_allow_data_corruption" is 'on' around the end of
CheckRequiredParameterValues() sounds safer to me too, although
implementing a new GUC parameter is bigger than what I expected at first.
The default should be 'off' to protect users from ending up with a corrupted server.
Does everyone agree with this direction ?
Best Regards,
Takamichi Osumi
On Thu, 2021-01-21 at 11:49 +0000, osumi.takamichi@fujitsu.com wrote:
Adding a condition to check if "recovery_allow_data_corruption" is 'on' around the end of
CheckRequiredParameterValues() sounds safer for me too, although
implementing a new GUC parameter sounds bigger than what I expected at first.
The default of the value should be 'off' to protect users from getting the corrupted server.
Does everyone agree with this direction ?
I'd say that adding such a GUC is material for another patch, if we want it at all.
I think it is very unlikely that people will switch from "wal_level=replica" to
"minimal" and back very soon afterwards and also try to recover past such
a switch, which probably explains why nobody has complained about data corruption
generated that way. To get the server to start with "wal_level=minimal", you must
set "archive_mode=off" and "max_wal_senders=0", and few people will do that and
still expect recovery to work.
My vote is that we should not have a GUC for such an unlikely event, and that
stopping recovery is good enough.
Yours,
Laurenz Albe
Hi, Laurenz
On Thursday, January 21, 2021 9:51 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2021-01-21 at 11:49 +0000, osumi.takamichi@fujitsu.com wrote:
Adding a condition to check if "recovery_allow_data_corruption" is
'on' around the end of
CheckRequiredParameterValues() sounds safer for me too, although
implementing a new GUC parameter sounds bigger than what I expected at first.
The default of the value should be 'off' to protect users from getting the
corrupted server.
Does everyone agree with this direction ?
I'd say that adding such a GUC is material for another patch, if we want it at all.
OK. You mean a separate patch.
I think it is very unlikely that people will switch from "wal_level=replica" to
"minimal" and back very soon afterwards and also try to recover past such a
switch, which probably explains why nobody has complained about data
corruption generated that way. To get the server to start with
"wal_level=minimal", you must set "archive_mode=off" and
"max_wal_senders=0", and few people will do that and still expect recovery to
work.
Yeah, the possibility is low of course.
My vote is that we should not have a GUC for such an unlikely event, and that
stopping recovery is good enough.
OK. IIUC, my current patch for this fix doesn't need to be changed or withdrawn.
Thank you for your explanation.
Best Regards,
Takamichi Osumi
On Thu, 2021-01-21 at 13:09 +0000, osumi.takamichi@fujitsu.com wrote:
My vote is that we should not have a GUC for such an unlikely event, and that
stopping recovery is good enough.
OK. IIUC, my current patch for this fix doesn't need to be changed or withdrawn.
Thank you for your explanation.
Well, that's just my opinion.
Fujii Masao seemed to disagree with the patch, and his voice carries weight.
Yours,
Laurenz Albe
On Thu, 2021-01-21 at 15:30 +0100, I wrote:
On Thu, 2021-01-21 at 13:09 +0000, osumi.takamichi@fujitsu.com wrote:
My vote is that we should not have a GUC for such an unlikely event, and that
stopping recovery is good enough.
OK. IIUC, my current patch for this fix doesn't need to be changed or withdrawn.
Thank you for your explanation.
Well, that's just my opinion.
Fujii Masao seemed to disagree with the patch, and his voice carries weight.
I think you should post another patch where the second, now superfluous,
error message is removed.
Yours,
Laurenz Albe
Hi
On Monday, January 25, 2021 5:13 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2021-01-21 at 15:30 +0100, I wrote:
On Thu, 2021-01-21 at 13:09 +0000, osumi.takamichi@fujitsu.com wrote:
My vote is that we should not have a GUC for such an unlikely
event, and that stopping recovery is good enough.
OK. IIUC, my current patch for this fix doesn't need to be changed or
withdrawn.
Thank you for your explanation.
Well, that's just my opinion.
Fujii Masao seemed to disagree with the patch, and his voice carries weight.
I think you should post another patch where the second, now superfluous, error
message is removed.
Updated. This patch showed no failures in the regression tests
and has been formatted with pgindent.
Best Regards,
Takamichi Osumi
Attachments:
stronger_safeguard_for_archive_recovery_v04.patch (application/octet-stream)
On Mon, 2021-01-25 at 08:19 +0000, osumi.takamichi@fujitsu.com wrote:
I think you should post another patch where the second, now superfluous, error
message is removed.
Updated. This patch showed no failure during regression tests
and has been aligned by pgindent.
Looks good to me.
I'll set it to "ready for committer" again.
Yours,
Laurenz Albe
On 1/25/21 3:55 AM, Laurenz Albe wrote:
On Mon, 2021-01-25 at 08:19 +0000, osumi.takamichi@fujitsu.com wrote:
I think you should post another patch where the second, now superfluous, error
message is removed.
Updated. This patch showed no failure during regression tests
and has been aligned by pgindent.
Looks good to me.
I'll set it to "ready for committer" again.
Fujii, does the new patch in [1] address your concerns?
Regards,
--
-David
david@pgmasters.net
[1]: /messages/by-id/OSBPR01MB48887EFFCA39FA9B1DBAFB0FEDBD0@OSBPR01MB4888.jpnprd01.prod.outlook.com
On 2021/03/25 23:21, David Steele wrote:
On 1/25/21 3:55 AM, Laurenz Albe wrote:
On Mon, 2021-01-25 at 08:19 +0000, osumi.takamichi@fujitsu.com wrote:
I think you should post another patch where the second, now superfluous, error
message is removed.
Updated. This patch showed no failure during regression tests
and has been aligned by pgindent.
Looks good to me.
I'll set it to "ready for committer" again.
Fujii, does the new patch in [1] address your concerns?
No. I'm still not sure this patch is a good idea... I understand
why this safeguard is necessary. OTOH I'm afraid it slightly increases
the risk that users get an unstartable database, i.e., lose the whole database.
But maybe I'm concerned about a rare case and my opinion is a minority one.
So I'd like to hear more opinions about this patch.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION