Is it correct to update db state in control file as "shutting down" during end-of-recovery checkpoint?
Hi,
While the database is performing end-of-recovery checkpoint, the
control file gets updated with db state as "shutting down" in
CreateCheckPoint (see the code snippet at [1]) and at the end it sets
it back to "shut down" for a brief moment and then finally to "in
production". If the end-of-recovery checkpoint takes a lot of time or
the db goes down during the end-of-recovery checkpoint for whatever
reasons, the control file ends up having the wrong db state.
Should we add a new db state something like
DB_IN_END_OF_RECOVERY_CHECKPOINT/"in end-of-recovery checkpoint" or
something else to represent the correct state?
Thoughts?
[1]:

void
CreateCheckPoint(int flags)
{
    /*
     * An end-of-recovery checkpoint is really a shutdown checkpoint, just
     * issued at a different time.
     */
    if (flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY))
        shutdown = true;
    else
        shutdown = false;

    if (shutdown)
    {
        LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
        ControlFile->state = DB_SHUTDOWNING;
        UpdateControlFile();
        LWLockRelease(ControlFileLock);
    }

    if (shutdown)
        ControlFile->state = DB_SHUTDOWNED;
Regards,
Bharath Rupireddy.
On 12/6/21, 4:34 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
While the database is performing end-of-recovery checkpoint, the
control file gets updated with db state as "shutting down" in
CreateCheckPoint (see the code snippet at [1]) and at the end it sets
it back to "shut down" for a brief moment and then finally to "in
production". If the end-of-recovery checkpoint takes a lot of time or
the db goes down during the end-of-recovery checkpoint for whatever
reasons, the control file ends up having the wrong db state.

Should we add a new db state something like
DB_IN_END_OF_RECOVERY_CHECKPOINT/"in end-of-recovery checkpoint" or
something else to represent the correct state?
This seems like a reasonable change to me. From a quick glance, it
looks like it should be a simple fix that wouldn't add too much
divergence between the shutdown and end-of-recovery checkpoint code
paths.
Nathan
At Mon, 6 Dec 2021 19:28:03 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in
On 12/6/21, 4:34 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
While the database is performing end-of-recovery checkpoint, the
control file gets updated with db state as "shutting down" in
CreateCheckPoint (see the code snippet at [1]) and at the end it sets
it back to "shut down" for a brief moment and then finally to "in
production". If the end-of-recovery checkpoint takes a lot of time or
the db goes down during the end-of-recovery checkpoint for whatever
reasons, the control file ends up having the wrong db state.

Should we add a new db state something like
DB_IN_END_OF_RECOVERY_CHECKPOINT/"in end-of-recovery checkpoint" or
something else to represent the correct state?

This seems like a reasonable change to me. From a quick glance, it
looks like it should be a simple fix that wouldn't add too much
divergence between the shutdown and end-of-recovery checkpoint code
paths.
Technically, an end-of-crash-recovery checkpoint is actually a kind of
shutdown checkpoint. In other words, a server that needs to run crash
recovery is internally shut down once and then enters normal operation
mode. So if the server crashed after the end-of-recovery
mode internally. So if the server crashed after the end-of-recovery
checkpoint finished and before it enters DB_IN_PRODUCTION state, the
server would start with a clean startup next time. We could treat
DB_IN_END_OF_RECOVERY_CHECKPOINT as a safe state to skip recovery, but I
don't think we need to preserve that behavior.
In other places, the server log and ps display specifically, we already
make a distinction between the end-of-recovery checkpoint and the shutdown
checkpoint.
Finally, I agree with Nathan that it should be simple enough.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Dec 7, 2021 at 12:58 AM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/6/21, 4:34 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
While the database is performing end-of-recovery checkpoint, the
control file gets updated with db state as "shutting down" in
CreateCheckPoint (see the code snippet at [1]) and at the end it sets
it back to "shut down" for a brief moment and then finally to "in
production". If the end-of-recovery checkpoint takes a lot of time or
the db goes down during the end-of-recovery checkpoint for whatever
reasons, the control file ends up having the wrong db state.

Should we add a new db state something like
DB_IN_END_OF_RECOVERY_CHECKPOINT/"in end-of-recovery checkpoint" or
something else to represent the correct state?

This seems like a reasonable change to me. From a quick glance, it
looks like it should be a simple fix that wouldn't add too much
divergence between the shutdown and end-of-recovery checkpoint code
paths.
Here's a patch that I've come up with. Please see if this looks okay
and let me know if we want to take it forward so that I can add a CF
entry.
The new status one would see is as follows:
bharath@bharathubuntu3:~/postgres/inst/bin$ ./pg_controldata -D data
pg_control version number: 1500
Catalog version number: 202111301
Database system identifier: 7038867865889221935
Database cluster state: in end-of-recovery checkpoint
pg_control last modified: Tue Dec 7 08:06:18 2021
Latest checkpoint location: 0/14D24A0
Latest checkpoint's REDO location: 0/14D24A0
Latest checkpoint's REDO WAL file: 000000010000000000000001
Regards,
Bharath Rupireddy.
Attachments:
v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch
On 12/7/21, 12:10 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Here's a patch that I've come up with. Please see if this looks okay
and let me know if we want to take it forward so that I can add a CF
entry.
Overall, the patch seems reasonable to me.
+ case DB_IN_END_OF_RECOVERY_CHECKPOINT:
+ ereport(LOG,
+ (errmsg("database system was interrupted while in end-of-recovery checkpoint at %s",
+ str_time(ControlFile->time))));
+ break;
I noticed that some (but not all) of the surrounding messages say
"last known up at" the control file time. I'm curious why you chose
not to use that phrasing in this case.
Nathan
On Wed, Dec 8, 2021 at 2:50 AM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/7/21, 12:10 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Here's a patch that I've come up with. Please see if this looks okay
and let me know if we want to take it forward so that I can add a CF
entry.

Overall, the patch seems reasonable to me.
Thanks for reviewing this.
+ case DB_IN_END_OF_RECOVERY_CHECKPOINT:
+ ereport(LOG,
+ (errmsg("database system was interrupted while in end-of-recovery checkpoint at %s",
+ str_time(ControlFile->time))));
+ break;

I noticed that some (but not all) of the surrounding messages say
"last known up at" the control file time. I'm curious why you chose
not to use that phrasing in this case.
If the state is DB_IN_END_OF_RECOVERY_CHECKPOINT, that means the db was
interrupted while in the end-of-recovery checkpoint, so I used phrasing
similar to the DB_IN_CRASH_RECOVERY and DB_IN_ARCHIVE_RECOVERY cases. I
would like to keep it as-is (in the v1 patch) unless anyone has other
thoughts here?
(errmsg("database system was interrupted while in recovery at %s",
(errmsg("database system was interrupted while in recovery at log time %s",
I added a CF entry here - https://commitfest.postgresql.org/36/3442/
Regards,
Bharath Rupireddy.
On 12/7/21, 5:21 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Dec 8, 2021 at 2:50 AM Bossart, Nathan <bossartn@amazon.com> wrote:
I noticed that some (but not all) of the surrounding messages say
"last known up at" the control file time. I'm curious why you chose
not to use that phrasing in this case.

If the state is DB_IN_END_OF_RECOVERY_CHECKPOINT, that means the db was
interrupted while in the end-of-recovery checkpoint, so I used phrasing
similar to the DB_IN_CRASH_RECOVERY and DB_IN_ARCHIVE_RECOVERY cases. I
would like to keep it as-is (in the v1 patch) unless anyone has other
thoughts here?
(errmsg("database system was interrupted while in recovery at %s",
(errmsg("database system was interrupted while in recovery at log time %s",
I think that's alright. The only other small suggestion I have would
be to say "during end-of-recovery checkpoint" instead of "while in
end-of-recovery checkpoint."
Another option we might want to consider is to just skip updating the
state entirely for end-of-recovery checkpoints. The state would
instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
don't know if it's crucial to have a dedicated control file state for
end-of-recovery checkpoints.
Nathan
On Wed, Dec 8, 2021 at 9:49 AM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/7/21, 5:21 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Dec 8, 2021 at 2:50 AM Bossart, Nathan <bossartn@amazon.com> wrote:
I noticed that some (but not all) of the surrounding messages say
"last known up at" the control file time. I'm curious why you chose
not to use that phrasing in this case.

If the state is DB_IN_END_OF_RECOVERY_CHECKPOINT, that means the db was
interrupted while in the end-of-recovery checkpoint, so I used phrasing
similar to the DB_IN_CRASH_RECOVERY and DB_IN_ARCHIVE_RECOVERY cases. I
would like to keep it as-is (in the v1 patch) unless anyone has other
thoughts here?
(errmsg("database system was interrupted while in recovery at %s",
(errmsg("database system was interrupted while in recovery at log time %s",

I think that's alright. The only other small suggestion I have would
be to say "during end-of-recovery checkpoint" instead of "while in
end-of-recovery checkpoint."
"while in" is being used by DB_IN_CRASH_RECOVERY and
DB_IN_ARCHIVE_RECOVERY messages. I don't think it's a good idea to
deviate from that and use "during".
Another option we might want to consider is to just skip updating the
state entirely for end-of-recovery checkpoints. The state would
instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
don't know if it's crucial to have a dedicated control file state for
end-of-recovery checkpoints.
Please note that end-of-recovery can take a while in production
systems (we have observed such things working with our customers) and
anything can happen during that period of time. The end-of-recovery
checkpoint is not something that gets finished momentarily. Therefore,
having a separate db state in the control file is useful.
Regards,
Bharath Rupireddy.
On 12/7/21, 8:42 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Dec 8, 2021 at 9:49 AM Bossart, Nathan <bossartn@amazon.com> wrote:
I think that's alright. The only other small suggestion I have would
be to say "during end-of-recovery checkpoint" instead of "while in
end-of-recovery checkpoint."

"while in" is being used by DB_IN_CRASH_RECOVERY and
DB_IN_ARCHIVE_RECOVERY messages. I don't think it's a good idea to
deviate from that and use "during".
Fair enough. I don't have a strong opinion about this.
Another option we might want to consider is to just skip updating the
state entirely for end-of-recovery checkpoints. The state would
instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
don't know if it's crucial to have a dedicated control file state for
end-of-recovery checkpoints.

Please note that end-of-recovery can take a while in production
systems (we have observed such things working with our customers) and
anything can happen during that period of time. The end-of-recovery
checkpoint is not something that gets finished momentarily. Therefore,
having a separate db state in the control file is useful.
Is there some useful distinction between the states for users? ISTM
that users will be waiting either way, and I don't know that an extra
control file state will help all that much. The main reason I bring
up this option is that the list of states is pretty short and appears
to be intended to indicate the high-level status of the server. Most
of the states are over 20 years old, and the newest one is over 10
years old, so I don't think new states can be added willy-nilly.
Of course, I could be off-base and others might agree that this new
state would be nice to have.
Nathan
On Wed, Dec 8, 2021 at 10:59 AM Bossart, Nathan <bossartn@amazon.com> wrote:
Another option we might want to consider is to just skip updating the
state entirely for end-of-recovery checkpoints. The state would
instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
don't know if it's crucial to have a dedicated control file state for
end-of-recovery checkpoints.

Please note that end-of-recovery can take a while in production
systems (we have observed such things working with our customers) and
anything can happen during that period of time. The end-of-recovery
checkpoint is not something that gets finished momentarily. Therefore,
having a separate db state in the control file is useful.

Is there some useful distinction between the states for users? ISTM
that users will be waiting either way, and I don't know that an extra
control file state will help all that much. The main reason I bring
up this option is that the list of states is pretty short and appears
to be intended to indicate the high-level status of the server. Most
of the states are over 20 years old, and the newest one is over 10
years old, so I don't think new states can be added willy-nilly.
Firstly, updating the control file with "DB_SHUTDOWNING" and
"DB_SHUTDOWNED" for end-of-recovery checkpoint is wrong. I don't think
having DB_IN_CRASH_RECOVERY for end-of-recovery checkpoint is a great
idea. We have a checkpoint (which most of the time takes a while) in
between the states DB_IN_CRASH_RECOVERY and DB_IN_PRODUCTION. The state
DB_IN_END_OF_RECOVERY_CHECKPOINT added by the v1 patch at [1] (in this
thread) helps users to understand and clearly distinguish what state
the db is in.
IMHO, the age of the code doesn't stop us from adding to, fixing, or improving it.
Of course, I could be off-base and others might agree that this new
state would be nice to have.
Let's see what others have to say about this.
[1]: /messages/by-id/CALj2ACVn5M8xgQ3RD=6rSTbbXRBdBWZ=TTOBOY_5+edMCkWjHA@mail.gmail.com
Regards,
Bharath Rupireddy.
At Wed, 8 Dec 2021 11:47:30 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
On Wed, Dec 8, 2021 at 10:59 AM Bossart, Nathan <bossartn@amazon.com> wrote:
Another option we might want to consider is to just skip updating the
state entirely for end-of-recovery checkpoints. The state would
instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
don't know if it's crucial to have a dedicated control file state for
end-of-recovery checkpoints.
FWIW I find it simple but sufficient, since I regard the
end-of-recovery checkpoint as a part of recovery. In that case the only
strange thing here is that the state transition passes through the
DB_SHUTDOWN(ING/ED) states.
On the other hand, when a server is going to shut down, the state stays
at DB_IN_PRODUCTION while clients are still connected, even though the
shutdown procedure has already started and no new clients can connect to
the server. There's no reason we need to be so particular about the
states around recovery-end.
Please note that end-of-recovery can take a while in production
systems (we have observed such things working with our customers) and
anything can happen during that period of time. The end-of-recovery
checkpoint is not something that gets finished momentarily. Therefore,
having a separate db state in the control file is useful.

Is there some useful distinction between the states for users? ISTM
that users will be waiting either way, and I don't know that an extra
control file state will help all that much. The main reason I bring
up this option is that the list of states is pretty short and appears
to be intended to indicate the high-level status of the server. Most
of the states are over 20 years old, and the newest one is over 10
years old, so I don't think new states can be added willy-nilly.

Firstly, updating the control file with "DB_SHUTDOWNING" and
"DB_SHUTDOWNED" for end-of-recovery checkpoint is wrong. I don't think
having DB_IN_CRASH_RECOVERY for end-of-recovery checkpoint is a great
idea. We have a checkpoint (which most of the time takes a while) in
between the states DB_IN_CRASH_RECOVERY and DB_IN_PRODUCTION. The state
DB_IN_END_OF_RECOVERY_CHECKPOINT added by the v1 patch at [1] (in this
thread) helps users to understand and clearly distinguish what state
the db is in.

IMHO, the age of the code doesn't stop us from adding to, fixing, or improving it.
Of course, I could be off-base and others might agree that this new
state would be nice to have.

Let's see what others have to say about this.
I see it as a bit too complex for the advantage. When the end-of-recovery
checkpoint takes so long, that state is shown in the server log, which
operators would look into before the control file.
[1] - /messages/by-id/CALj2ACVn5M8xgQ3RD=6rSTbbXRBdBWZ=TTOBOY_5+edMCkWjHA@mail.gmail.com
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Dec 8, 2021 at 1:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 8 Dec 2021 11:47:30 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
On Wed, Dec 8, 2021 at 10:59 AM Bossart, Nathan <bossartn@amazon.com> wrote:
Another option we might want to consider is to just skip updating the
state entirely for end-of-recovery checkpoints. The state would
instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
don't know if it's crucial to have a dedicated control file state for
end-of-recovery checkpoints.

FWIW I find it simple but sufficient, since I regard the
end-of-recovery checkpoint as a part of recovery. In that case the only
strange thing here is that the state transition passes through the
DB_SHUTDOWN(ING/ED) states.

On the other hand, when a server is going to shut down, the state stays
at DB_IN_PRODUCTION while clients are still connected, even though the
shutdown procedure has already started and no new clients can connect to
the server. There's no reason we need to be so particular about the
states around recovery-end.

I see it as a bit too complex for the advantage. When the end-of-recovery
checkpoint takes so long, that state is shown in the server log, which
operators would look into before the control file.
Thanks for your thoughts. I'm fine either way, hence I'm attaching two
patches herewith, and I will leave it to the committer's choice.
1) v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch --
adds new db state DB_IN_END_OF_RECOVERY_CHECKPOINT for control file.
2) v1-0001-Skip-control-file-db-state-updation-during-end-of.patch --
just skips setting db state to DB_SHUTDOWNING and DB_SHUTDOWNED in
case of end-of-recovery checkpoint so that the state will be
DB_IN_CRASH_RECOVERY which then changes to DB_IN_PRODUCTION.
Regards,
Bharath Rupireddy.
Attachments:
v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch
v1-0001-Skip-control-file-db-state-updation-during-end-of.patch
On 12/8/21, 3:29 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks for your thoughts. I'm fine either way, hence I'm attaching two
patches herewith, and I will leave it to the committer's choice.
1) v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch --
adds new db state DB_IN_END_OF_RECOVERY_CHECKPOINT for control file.
2) v1-0001-Skip-control-file-db-state-updation-during-end-of.patch --
just skips setting db state to DB_SHUTDOWNING and DB_SHUTDOWNED in
case of end-of-recovery checkpoint so that the state will be
DB_IN_CRASH_RECOVERY which then changes to DB_IN_PRODUCTION.
I've bumped this one to ready-for-committer. For the record, my
preference is the second patch (for the reasons discussed upthread).
Both patches might benefit from a small comment or two, too.
Nathan
On Wed, Dec 8, 2021 at 11:02 PM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/8/21, 3:29 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks for your thoughts. I'm fine either way, hence I'm attaching two
patches herewith, and I will leave it to the committer's choice.
1) v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch --
adds new db state DB_IN_END_OF_RECOVERY_CHECKPOINT for control file.
2) v1-0001-Skip-control-file-db-state-updation-during-end-of.patch --
just skips setting db state to DB_SHUTDOWNING and DB_SHUTDOWNED in
case of end-of-recovery checkpoint so that the state will be
DB_IN_CRASH_RECOVERY which then changes to DB_IN_PRODUCTION.

I've bumped this one to ready-for-committer. For the record, my
preference is the second patch (for the reasons discussed upthread).
Both patches might benefit from a small comment or two, too.
Thanks. I've added a comment to the patch
v2-0001-Skip-control-file-db-state-updation-during-end-of.patch. The
other patch remains the same, as the new state
DB_IN_END_OF_RECOVERY_CHECKPOINT introduced there says it all.
Regards,
Bharath Rupireddy.
Attachments:
v2-0001-Skip-control-file-db-state-updation-during-end-of.patch
v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch
On Thu, Dec 09, 2021 at 07:41:52AM +0530, Bharath Rupireddy wrote:
On Wed, Dec 8, 2021 at 11:02 PM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/8/21, 3:29 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks for your thoughts. I'm fine either way, hence I'm attaching two
patches herewith, and I will leave it to the committer's choice.
1) v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch --
adds new db state DB_IN_END_OF_RECOVERY_CHECKPOINT for control file.
2) v1-0001-Skip-control-file-db-state-updation-during-end-of.patch --
just skips setting db state to DB_SHUTDOWNING and DB_SHUTDOWNED in
case of end-of-recovery checkpoint so that the state will be
DB_IN_CRASH_RECOVERY which then changes to DB_IN_PRODUCTION.

I've bumped this one to ready-for-committer. For the record, my
preference is the second patch (for the reasons discussed upthread).
Both patches might benefit from a small comment or two, too.

Thanks. I've added a comment to the patch
v2-0001-Skip-control-file-db-state-updation-during-end-of.patch. The
other patch remains the same, as the new state
DB_IN_END_OF_RECOVERY_CHECKPOINT introduced there says it all.
AFAIU it is one patch or the other, but not both, isn't it?
This habit of putting two conflicting versions of patches on the same
thread causes http://cfbot.cputube.org/ to fail.
Now, I do think that the second patch, the one that just skips the update
of the state in the control file, is the way to go. The other patch adds too
much complexity for a small return.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
On Mon, Jan 10, 2022 at 10:58 AM Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
On Thu, Dec 09, 2021 at 07:41:52AM +0530, Bharath Rupireddy wrote:
On Wed, Dec 8, 2021 at 11:02 PM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/8/21, 3:29 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks for your thoughts. I'm fine either way, hence I'm attaching two
patches herewith, and I will leave it to the committer's choice.
1) v1-0001-Add-DB_IN_END_OF_RECOVERY_CHECKPOINT-state-for-co.patch --
adds new db state DB_IN_END_OF_RECOVERY_CHECKPOINT for control file.
2) v1-0001-Skip-control-file-db-state-updation-during-end-of.patch --
just skips setting db state to DB_SHUTDOWNING and DB_SHUTDOWNED in
case of end-of-recovery checkpoint so that the state will be
DB_IN_CRASH_RECOVERY which then changes to DB_IN_PRODUCTION.

I've bumped this one to ready-for-committer. For the record, my
preference is the second patch (for the reasons discussed upthread).
Both patches might benefit from a small comment or two, too.

Thanks. I've added a comment to the patch
v2-0001-Skip-control-file-db-state-updation-during-end-of.patch. The
other patch remains the same, as the new state
DB_IN_END_OF_RECOVERY_CHECKPOINT introduced there says it all.
Now, I do think that the second patch, the one that just skips the update
of the state in the control file, is the way to go. The other patch adds too
much complexity for a small return.
Thanks. Attaching the above patch.
Regards,
Bharath Rupireddy.
Attachments:
v2-0001-Skip-control-file-db-state-updation-during-end-of.patch
On Mon, Jan 10, 2022 at 11:04:05AM +0530, Bharath Rupireddy wrote:
On Mon, Jan 10, 2022 at 10:58 AM Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

Now, I do think that the second patch, the one that just skips the update
of the state in the control file, is the way to go. The other patch adds too
much complexity for a small return.

Thanks. Attaching the above patch.
I agree that the addition of DB_IN_END_OF_RECOVERY_CHECKPOINT is not
necessary, as the control file state will be reflected in a live server
once the instance is ready to write WAL after promotion, as much as
I agree that the state stored in the control file because of the
end-of-recovery record does not reflect reality.
Now, I also find the state of CreateCheckpoint() confusing once this
patch gets applied. The code and comments imply that an
end-of-recovery checkpoint is a shutdown checkpoint because they
perform the same actions, which is fine. Could it be less confusing
to remove the "shutdown" variable completely and replace those
checks with "flags"? What the patch is doing is one step in this
direction.
--
Michael
On 1/24/22, 9:16 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
Now, I also find the state of CreateCheckpoint() confusing once this
patch gets applied. The code and comments imply that an
end-of-recovery checkpoint is a shutdown checkpoint because they
perform the same actions, which is fine. Could it be less confusing
to remove the "shutdown" variable completely and replace those
checks with "flags"? What the patch is doing is one step in this
direction.
I looked into removing the "shutdown" variable in favor of using
"flags" everywhere, but the patch was quite messy and repetitive. I
think another way to make things less confusing is to replace
"shutdown" with an inverse variable called "online." The attached
patch does it this way.
Nathan
Attachments:
v3-0001-Skip-control-file-db-state-updation-during-end-of.patch
On Tue, Jan 25, 2022 at 07:20:05PM +0000, Bossart, Nathan wrote:
I looked into removing the "shutdown" variable in favor of using
"flags" everywhere, but the patch was quite messy and repetitive. I
think another way to make things less confusing is to replace
"shutdown" with an inverse variable called "online." The attached
patch does it this way.
Yeah, that sounds like a good compromise. At least, I find the whole
thing a bit easier to follow.
Heikki was planning to commit a large refactoring of xlog.c, so we'd
better wait for that to happen before concluding here.
--
Michael
On Wed, Jan 26, 2022 at 6:28 AM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jan 25, 2022 at 07:20:05PM +0000, Bossart, Nathan wrote:
I looked into removing the "shutdown" variable in favor of using
"flags" everywhere, but the patch was quite messy and repetitive. I
think another way to make things less confusing is to replace
"shutdown" with an inverse variable called "online." The attached
patch does it this way.

Yeah, that sounds like a good compromise. At least, I find the whole
thing a bit easier to follow.
v3 LGTM.
Heikki was planning to commit a large refactoring of xlog.c, so we'd
better wait for that to happen before concluding here.
Will that patch set fix the issue reported here?
Regards,
Bharath Rupireddy.