Background writer and checkpointer in crash recovery

Started by Thomas Munro over 5 years ago, 24 messages
#1Thomas Munro
thomas.munro@gmail.com
1 attachment(s)

(Forking this thread from the SLRU fsync one[1] to allow for a
separate CF entry; it's unrelated, except for being another case of
moving work off the recovery process's plate.)

Hello hackers,

Currently we don't run the bgwriter process during crash recovery.
I've CCed Simon and Heikki who established this in commit cdd46c76.
Based on that commit message, I think the bar to clear to change the
policy is to show that it's useful, and that it doesn't make crash
recovery less robust. See the other thread for some initial evidence
of usefulness from Jakub Wartak. I think it also just makes intuitive
sense that it's got to help bigger-than-buffer-pool crash recovery if
you can shift buffer eviction out of the recovery loop. As for
robustness, I suppose we could provide the option to turn it off just
in case that turns out to be useful in some scenarios, but I'm
wondering why we would expect something that we routinely run in
archive/streaming recovery to reduce robustness in only slightly
different circumstances.
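
As a rough sketch of what that shift looks like (simplified pseudocode
with invented helper names, not the actual redo code): without a
bgwriter, whenever replay needs a page that isn't cached and every
victim buffer is dirty, the startup process has to do the write itself
before it can continue.

    /* simplified crash-recovery redo loop; helper names are made up */
    while ((record = ReadNextWalRecord()) != NULL)
    {
        Buffer  buf = FindOrEvictBuffer(record_target_page(record));

        if (BufferNeededEviction(buf) && BufferIsDirty(buf))
            WriteBufferOut(buf);    /* startup process stalls on this I/O;
                                     * a bgwriter could have cleaned it already */

        ApplyRedoRecord(record, buf);
    }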

Here's an experiment-grade patch, comments welcome, though at this
stage it's primarily thoughts about the concept that I'm hoping to
solicit.

One question raised by Jakub that I don't have a great answer to right
now is whether you'd want different bgwriter settings in this scenario
for best results. I don't know, but I suspect that the answer is to
make bgwriter more adaptive rather than more configurable, and that's
an orthogonal project.

Once we had the checkpointer running, we could also consider making
the end-of-recovery checkpoint optional, or at least the wait for it
to complete. I haven't shown that in this patch, but it's just
different checkpoint request flags and possibly an end-of-recovery
record. What problems do you foresee with that? Why should we have
"fast" promotion but not "fast" crash recovery?

[1]: /messages/by-id/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com

Attachments:

0001-Run-checkpointer-and-bgworker-in-crash-recovery.patch (text/x-patch)
From 21d7d459a6076f85c743b739f1c4ba16451b7046 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Wed, 26 Aug 2020 16:34:33 +1200
Subject: [PATCH 1/2] Run checkpointer and bgworker in crash recovery.

Relieve the startup process of some writeback and checkpointing work
during crash recovery, making it more like replication recovery.  This
wasn't done back in commit cdd46c76 because it wasn't thought to be
useful at the time.  The theory of this patch is that there may be
workloads where we can profitably move a bunch of buffer eviction work
out of the recovery process.

In order to have a bgwriter running, you also need a checkpointer.
While you could give startup and bgwriter their own backend private
pending ops table, it's nicer to merge their requests in one place.

XXX This is just an experimental prototype patch and may contain all
kinds of bugs!
---
 src/backend/access/transam/xlog.c     | 33 ++++++++++-----------------
 src/backend/postmaster/bgwriter.c     |  3 ---
 src/backend/postmaster/checkpointer.c |  3 ---
 src/backend/postmaster/postmaster.c   | 17 ++++++--------
 src/backend/storage/sync/sync.c       | 30 +++---------------------
 src/include/storage/sync.h            |  1 -
 6 files changed, 22 insertions(+), 65 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 09c01ed4ae..d8080ed4f5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -870,9 +870,6 @@ bool		reachedConsistency = false;
 
 static bool InRedo = false;
 
-/* Have we launched bgwriter during recovery? */
-static bool bgwriterLaunched = false;
-
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -7123,25 +7120,14 @@ StartupXLOG(void)
 		/* Also ensure XLogReceiptTime has a sane value */
 		XLogReceiptTime = GetCurrentTimestamp();
 
+		PublishStartupProcessInformation();
+
 		/*
 		 * Let postmaster know we've started redo now, so that it can launch
-		 * checkpointer to perform restartpoints.  We don't bother during
-		 * crash recovery as restartpoints can only be performed during
-		 * archive recovery.  And we'd like to keep crash recovery simple, to
-		 * avoid introducing bugs that could affect you when recovering after
-		 * crash.
-		 *
-		 * After this point, we can no longer assume that we're the only
-		 * process in addition to postmaster!  Also, fsync requests are
-		 * subsequently to be handled by the checkpointer, not locally.
+		 * the archiver if necessary.
 		 */
-		if (ArchiveRecoveryRequested && IsUnderPostmaster)
-		{
-			PublishStartupProcessInformation();
-			EnableSyncRequestForwarding();
+		if (IsUnderPostmaster)
 			SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
-			bgwriterLaunched = true;
-		}
 
 		/*
 		 * Allow read-only connections immediately if we're consistent
@@ -7729,7 +7715,7 @@ StartupXLOG(void)
 		 * after we're fully out of recovery mode and already accepting
 		 * queries.
 		 */
-		if (bgwriterLaunched)
+		if (ArchiveRecoveryRequested && IsUnderPostmaster)
 		{
 			if (LocalPromoteIsTriggered)
 			{
@@ -7764,8 +7750,13 @@ StartupXLOG(void)
 								  CHECKPOINT_IMMEDIATE |
 								  CHECKPOINT_WAIT);
 		}
+		else if (IsUnderPostmaster)
+			RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
+							  CHECKPOINT_IMMEDIATE |
+							  CHECKPOINT_WAIT);
 		else
-			CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
+			CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY |
+							 CHECKPOINT_IMMEDIATE);
 	}
 
 	if (ArchiveRecoveryRequested)
@@ -11891,7 +11882,7 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
 		 * Request a restartpoint if we've replayed too much xlog since the
 		 * last one.
 		 */
-		if (bgwriterLaunched)
+		if (ArchiveRecoveryRequested && IsUnderPostmaster)
 		{
 			if (XLogCheckpointNeeded(readSegNo))
 			{
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 069e27e427..56de0bee18 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -12,9 +12,6 @@
  *
  * As of Postgres 9.2 the bgwriter no longer handles checkpoints.
  *
- * The bgwriter is started by the postmaster as soon as the startup subprocess
- * finishes, or as soon as recovery begins if we are doing archive recovery.
- * It remains alive until the postmaster commands it to terminate.
  * Normal termination is by SIGTERM, which instructs the bgwriter to exit(0).
  * Emergency termination is by SIGQUIT; like any backend, the bgwriter will
  * simply abort and exit on SIGQUIT.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 624a3238b8..883a0df198 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -10,9 +10,6 @@
  * fill WAL segments; the checkpointer itself doesn't watch for the
  * condition.)
  *
- * The checkpointer is started by the postmaster as soon as the startup
- * subprocess finishes, or as soon as recovery begins if we are doing archive
- * recovery.  It remains alive until the postmaster commands it to terminate.
  * Normal termination is by SIGUSR2, which instructs the checkpointer to
  * execute a shutdown checkpoint and then exit(0).  (All backends must be
  * stopped before SIGUSR2 is issued!)  Emergency termination is by SIGQUIT;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 42223c0f61..47cf1c11f7 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1400,6 +1400,12 @@ PostmasterMain(int argc, char *argv[])
 	 */
 	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STARTING);
 
+	/* Start bgwriter and checkpointer so they can help with recovery */
+	if (CheckpointerPID == 0)
+		CheckpointerPID = StartCheckpointer();
+	if (BgWriterPID == 0)
+		BgWriterPID = StartBackgroundWriter();
+
 	/*
 	 * We're ready to rock and roll...
 	 */
@@ -1761,7 +1767,7 @@ ServerLoop(void)
 		 * fails, we'll just try again later.  Likewise for the checkpointer.
 		 */
 		if (pmState == PM_RUN || pmState == PM_RECOVERY ||
-			pmState == PM_HOT_STANDBY)
+			pmState == PM_HOT_STANDBY || pmState == PM_STARTUP)
 		{
 			if (CheckpointerPID == 0)
 				CheckpointerPID = StartCheckpointer();
@@ -5208,15 +5214,6 @@ sigusr1_handler(SIGNAL_ARGS)
 		FatalError = false;
 		AbortStartTime = 0;
 
-		/*
-		 * Crank up the background tasks.  It doesn't matter if this fails,
-		 * we'll just try again later.
-		 */
-		Assert(CheckpointerPID == 0);
-		CheckpointerPID = StartCheckpointer();
-		Assert(BgWriterPID == 0);
-		BgWriterPID = StartBackgroundWriter();
-
 		/*
 		 * Start the archiver if we're responsible for (re-)archiving received
 		 * files.
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 3ded2cdd71..27c28b8562 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -107,10 +107,10 @@ InitSync(void)
 {
 	/*
 	 * Create pending-operations hashtable if we need it.  Currently, we need
-	 * it if we are standalone (not under a postmaster) or if we are a startup
-	 * or checkpointer auxiliary process.
+	 * it if we are standalone (not under a postmaster) or if we are a
+	 * checkpointer auxiliary process.
 	 */
-	if (!IsUnderPostmaster || AmStartupProcess() || AmCheckpointerProcess())
+	if (!IsUnderPostmaster || AmCheckpointerProcess())
 	{
 		HASHCTL		hash_ctl;
 
@@ -568,27 +568,3 @@ RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
 
 	return ret;
 }
-
-/*
- * In archive recovery, we rely on checkpointer to do fsyncs, but we will have
- * already created the pendingOps during initialization of the startup
- * process.  Calling this function drops the local pendingOps so that
- * subsequent requests will be forwarded to checkpointer.
- */
-void
-EnableSyncRequestForwarding(void)
-{
-	/* Perform any pending fsyncs we may have queued up, then drop table */
-	if (pendingOps)
-	{
-		ProcessSyncRequests();
-		hash_destroy(pendingOps);
-	}
-	pendingOps = NULL;
-
-	/*
-	 * We should not have any pending unlink requests, since mdunlink doesn't
-	 * queue unlink requests when isRedo.
-	 */
-	Assert(pendingUnlinks == NIL);
-}
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index e16ab8e711..9622854c71 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -55,7 +55,6 @@ extern void SyncPreCheckpoint(void);
 extern void SyncPostCheckpoint(void);
 extern void ProcessSyncRequests(void);
 extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
-extern void EnableSyncRequestForwarding(void);
 extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
 								bool retryOnError);
 
-- 
2.20.1

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#1)
Re: Background writer and checkpointer in crash recovery

Thomas Munro <thomas.munro@gmail.com> writes:

Once we had the checkpointer running, we could also consider making
the end-of-recovery checkpoint optional, or at least the wait for it
to complete. I haven't shown that in this patch, but it's just
different checkpoint request flags and possibly an end-of-recovery
record. What problems do you foresee with that? Why should we have
"fast" promotion but not "fast" crash recovery?

I think that the rationale for that had something to do with trying
to reduce the costs of repeated crashes. If you've had one crash,
you might well have another one in your near future, due to the same
problem recurring. Therefore, avoiding multiple replays of the same WAL
is attractive; and to do that we have to complete a checkpoint before
giving users a chance to crash us again.

I'm not sure what I think about your primary proposal here. On the
one hand, optimizing crash recovery ought to be pretty darn far down
our priority list; if it happens often enough for performance to be
a consideration then we have worse problems to fix. Also, at least
in theory, not running the bgwriter/checkpointer makes for fewer moving
parts and thus better odds of completing crash recovery successfully.
On the other hand, it could be argued that running without the
bgwriter/checkpointer actually makes recovery less reliable, since
that's a far less thoroughly-tested operating regime than when they're
active.

tl;dr: I think this should be much less about performance than about
whether we successfully recover or not. Maybe there's still a case for
changing in that framework, though.

regards, tom lane

#3Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: Background writer and checkpointer in crash recovery

On Sun, 30 Aug 2020 at 01:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Once we had the checkpointer running, we could also consider making
the end-of-recovery checkpoint optional, or at least the wait for it
to complete. I haven't shown that in this patch, but it's just
different checkpoint request flags and possibly an end-of-recovery
record. What problems do you foresee with that? Why should we have
"fast" promotion but not "fast" crash recovery?

I think that the rationale for that had something to do with trying
to reduce the costs of repeated crashes. If you've had one crash,
you might well have another one in your near future, due to the same
problem recurring. Therefore, avoiding multiple replays of the same WAL
is attractive; and to do that we have to complete a checkpoint before
giving users a chance to crash us again.

Agreed. That rationale is shown in comments and in the commit message.

"We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful."

Having said that, we did raise the checkpoint_timeout by a lot, so the
situation today might be quite different. A large checkpoint_timeout
could eventually overflow shared buffers, with the right workload.

We don't have any stats to show whether this patch is worthwhile or
not, so I suggest adding the attached instrumentation patch as well so
we can see on production systems whether checkpoint_timeout is too
high by comparison with pg_stat_bgwriter. The patch is written in the
style of log_checkpoints.

--
Simon Riggs http://www.EnterpriseDB.com/

Attachments:

crash_recovery_stats.v1.patch (application/octet-stream)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a1078a7cfc..736599d2fb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7443,6 +7443,29 @@ StartupXLOG(void)
 					(errmsg("redo done at %X/%X system usage: %s",
 							(uint32) (ReadRecPtr >> 32), (uint32) ReadRecPtr,
 							pg_rusage_show(&ru0))));
+			if (!InArchiveRecovery)
+			{
+				if (!INSTR_TIME_IS_ZERO(pgBufferUsage.blk_read_time))
+					ereport(LOG,
+						(errmsg("crash recovery complete: wrote %ld buffers (%.1f%%) (%0.3f ms); "
+								"dirtied %ld buffers; read %ld buffers (%0.3f ms)",
+								pgBufferUsage.shared_blks_written,
+								(double) pgBufferUsage.shared_blks_written * 100 / NBuffers,
+								INSTR_TIME_GET_MILLISEC(pgBufferUsage.blk_write_time),
+								pgBufferUsage.shared_blks_dirtied,
+								pgBufferUsage.shared_blks_read,
+								INSTR_TIME_GET_MILLISEC(pgBufferUsage.blk_read_time)
+								)));
+				else
+					ereport(LOG,
+						(errmsg("crash recovery complete: wrote %ld buffers (%.1f%%); "
+								"dirtied %ld buffers; read %ld buffers",
+								pgBufferUsage.shared_blks_written,
+								(double) pgBufferUsage.shared_blks_written * 100 / NBuffers,
+								pgBufferUsage.shared_blks_dirtied,
+								pgBufferUsage.shared_blks_read
+								)));
+			}
 			xtime = GetLatestXTime();
 			if (xtime)
 				ereport(LOG,
#4Thomas Munro
thomas.munro@gmail.com
In reply to: Simon Riggs (#3)
Re: Background writer and checkpointer in crash recovery

On Wed, Nov 11, 2020 at 9:57 PM Simon Riggs <simon@2ndquadrant.com> wrote:

Having said that, we did raise the checkpoint_timeout by a lot, so the
situation today might be quite different. A large checkpoint_timeout
could eventually overflow shared buffers, with the right workload.

FWIW Jakub Wartak did manage to show a 1.64x speedup while running
crash recovery of an insert-only workload (with a variant of this
patch that I shared in another thread), albeit with aggressive tuning:

/messages/by-id/VI1PR0701MB6960EEB838D53886D8A180E3F6520@VI1PR0701MB6960.eurprd07.prod.outlook.com

We don't have any stats to show whether this patch is worthwhile or
not, so I suggest adding the attached instrumentation patch as well so
we can see on production systems whether checkpoint_timeout is too
high by comparison with pg_stat_bgwriter. The patch is written in the
style of log_checkpoints.

Very useful. I've also been wondering how to get that sort of
information in hot standby.

#5Robert Haas
robertmhaas@gmail.com
In reply to: Thomas Munro (#1)
Re: Background writer and checkpointer in crash recovery

On Sat, Aug 29, 2020 at 8:13 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Currently we don't run the bgwriter process during crash recovery.
I've CCed Simon and Heikki who established this in commit cdd46c76.
Based on that commit message, I think the bar to clear to change the
policy is to show that it's useful, and that it doesn't make crash
recovery less robust. See the other thread for some initial evidence
of usefulness from Jakub Wartak. I think it also just makes intuitive
sense that it's got to help bigger-than-buffer-pool crash recovery if
you can shift buffer eviction out of the recovery loop. As for
robustness, I suppose we could provide the option to turn it off just
in case that turns out to be useful in some scenarios, but I'm
wondering why we would expect something that we routinely run in
archive/streaming recovery to reduce robustness in only slightly
different circumstances.

Here's an experiment-grade patch, comments welcome, though at this
stage it's primarily thoughts about the concept that I'm hoping to
solicit.

I think the way it works right now is stupid and the proposed change
is going in the right direction. We have ample evidence already that
handing off fsyncs to a background process is a good idea, and there's
no reason why that shouldn't be beneficial during crash recovery just
as it is at other times. But even if it somehow failed to improve
performance during recovery, there's another good reason to do this,
which is that it would make the code simpler. Having the pendingOps
stuff in the startup process in some recovery situations and in the
checkpointer in other recovery situations makes this harder to reason
about. As Tom said, the system state where bgwriter and checkpointer
are not running is an uncommon one, and is probably more likely to
have (or grow) bugs than the state where they are running.

The rat's-nest of logic introduced by the comment "Perform a
checkpoint to update all our recovery activity to disk." inside
StartupXLOG() could really do with some simplification. Right now we
have three cases: CreateEndOfRecoveryRecord(), RequestCheckpoint(),
and CreateCheckpoint(). Maybe with this change we could get it down to
just two, since RequestCheckpoint() already knows what to do about
!IsUnderPostmaster.
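
For example, something along these lines (just a sketch, not a
worked-out patch):

    if (ArchiveRecoveryRequested && IsUnderPostmaster &&
        LocalPromoteIsTriggered)
    {
        /* promotion: emit an end-of-recovery record instead of checkpointing now */
        promoted = true;
        CreateEndOfRecoveryRecord();
    }
    else
    {
        /*
         * Everything else, including single-user mode, since
         * RequestCheckpoint() performs the checkpoint itself when there is
         * no checkpointer process to ask.
         */
        RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
                          CHECKPOINT_IMMEDIATE |
                          CHECKPOINT_WAIT);
    }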

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#5)
3 attachment(s)
Re: Background writer and checkpointer in crash recovery

On Wed, Feb 3, 2021 at 11:11 AM Robert Haas <robertmhaas@gmail.com> wrote:

I think the way it works right now is stupid and the proposed change
is going in the right direction. We have ample evidence already that
handing off fsyncs to a background process is a good idea, and there's
no reason why that shouldn't be beneficial during crash recovery just
as it is at other times. But even if it somehow failed to improve
performance during recovery, there's another good reason to do this,
which is that it would make the code simpler. Having the pendingOps
stuff in the startup process in some recovery situations and in the
checkpointer in other recovery situations makes this harder to reason
about. As Tom said, the system state where bgwriter and checkpointer
are not running is an uncommon one, and is probably more likely to
have (or grow) bugs than the state where they are running.

Yeah, it's a good argument.

The rat's-nest of logic introduced by the comment "Perform a
checkpoint to update all our recovery activity to disk." inside
StartupXLOG() could really do with some simplification. Right now we
have three cases: CreateEndOfRecoveryRecord(), RequestCheckpoint(),
and CreateCheckpoint(). Maybe with this change we could get it down to
just two, since RequestCheckpoint() already knows what to do about
!IsUnderPostmaster.

True. Done in this version.

Here's a rebase of this patch + Simon's patch to report on stats.

I also have a sketch patch to provide a GUC that turns off the
end-of-recovery checkpoint as mentioned earlier, attached (sharing
mainly because this is one of the stack of patches that Jakub was
testing for his basebackup/PITR workloads and he might want to test some
more), but I'm not proposing that for PG14. That idea is tangled up
with the "relfile tombstone" stuff I wrote about elsewhere[1], but I
haven't finished studying the full arsenal of footguns in that area
(it's something like: if we don't wait for end-of-recovery checkpoint
before allowing connections, then we'd better start creating
tombstones in recovery unless the WAL level is high enough to avoid
data eating hazards with unlogged changes and a double crash).

[1]: https://commitfest.postgresql.org/33/3030/

Attachments:

v2-0001-Run-checkpointer-and-bgworker-in-crash-recovery.patch (text/x-patch)
From 7a82c9e622ea45c981d05dff08c1d4279ec3c73b Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Wed, 26 Aug 2020 16:34:33 +1200
Subject: [PATCH v2 1/3] Run checkpointer and bgworker in crash recovery.

Relieve the startup process of some writeback and checkpointing work
during crash recovery, making it more like replication recovery.  This
wasn't done back in commit cdd46c76 out of caution, but now it seems
more conservative to have the crash recovery environment work the same
as the more commonly exercised streaming replication environment.

In order to have a bgwriter running, you also need a checkpointer.
While you could give startup and bgwriter their own backend private
pending ops table, it's nicer to merge their requests in one place.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Simon Riggs <simon@2ndquadrant.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Tested-by:  Jakub Wartak <Jakub.Wartak@tomtom.com>
Discussion: https://postgr.es/m/CA%2BhUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q%40mail.gmail.com
---
 src/backend/access/transam/xlog.c     | 32 +++++++++------------------
 src/backend/postmaster/bgwriter.c     |  3 ---
 src/backend/postmaster/checkpointer.c |  3 ---
 src/backend/postmaster/postmaster.c   | 17 ++++++--------
 src/backend/storage/sync/sync.c       | 30 +++----------------------
 src/include/storage/sync.h            |  1 -
 6 files changed, 21 insertions(+), 65 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f4d1ce5dea..6e8e6cf1e4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -880,9 +880,6 @@ bool		reachedConsistency = false;
 
 static bool InRedo = false;
 
-/* Have we launched bgwriter during recovery? */
-static bool bgwriterLaunched = false;
-
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -7268,25 +7265,14 @@ StartupXLOG(void)
 		/* Also ensure XLogReceiptTime has a sane value */
 		XLogReceiptTime = GetCurrentTimestamp();
 
+		PublishStartupProcessInformation();
+
 		/*
 		 * Let postmaster know we've started redo now, so that it can launch
-		 * checkpointer to perform restartpoints.  We don't bother during
-		 * crash recovery as restartpoints can only be performed during
-		 * archive recovery.  And we'd like to keep crash recovery simple, to
-		 * avoid introducing bugs that could affect you when recovering after
-		 * crash.
-		 *
-		 * After this point, we can no longer assume that we're the only
-		 * process in addition to postmaster!  Also, fsync requests are
-		 * subsequently to be handled by the checkpointer, not locally.
+		 * the archiver if necessary.
 		 */
-		if (ArchiveRecoveryRequested && IsUnderPostmaster)
-		{
-			PublishStartupProcessInformation();
-			EnableSyncRequestForwarding();
+		if (IsUnderPostmaster)
 			SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
-			bgwriterLaunched = true;
-		}
 
 		/*
 		 * Allow read-only connections immediately if we're consistent
@@ -7879,7 +7865,7 @@ StartupXLOG(void)
 		 * after we're fully out of recovery mode and already accepting
 		 * queries.
 		 */
-		if (bgwriterLaunched)
+		if (ArchiveRecoveryRequested && IsUnderPostmaster)
 		{
 			if (LocalPromoteIsTriggered)
 			{
@@ -7915,7 +7901,11 @@ StartupXLOG(void)
 								  CHECKPOINT_WAIT);
 		}
 		else
-			CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
+		{
+			RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
+							  CHECKPOINT_IMMEDIATE |
+							  CHECKPOINT_WAIT);
+		}
 	}
 
 	if (ArchiveRecoveryRequested)
@@ -12123,7 +12113,7 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
 		 * Request a restartpoint if we've replayed too much xlog since the
 		 * last one.
 		 */
-		if (bgwriterLaunched)
+		if (ArchiveRecoveryRequested && IsUnderPostmaster)
 		{
 			if (XLogCheckpointNeeded(readSegNo))
 			{
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 715d5195bb..5584f4bc24 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -12,9 +12,6 @@
  *
  * As of Postgres 9.2 the bgwriter no longer handles checkpoints.
  *
- * The bgwriter is started by the postmaster as soon as the startup subprocess
- * finishes, or as soon as recovery begins if we are doing archive recovery.
- * It remains alive until the postmaster commands it to terminate.
  * Normal termination is by SIGTERM, which instructs the bgwriter to exit(0).
  * Emergency termination is by SIGQUIT; like any backend, the bgwriter will
  * simply abort and exit on SIGQUIT.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 5907a7befc..7b7f01e1fc 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -10,9 +10,6 @@
  * fill WAL segments; the checkpointer itself doesn't watch for the
  * condition.)
  *
- * The checkpointer is started by the postmaster as soon as the startup
- * subprocess finishes, or as soon as recovery begins if we are doing archive
- * recovery.  It remains alive until the postmaster commands it to terminate.
  * Normal termination is by SIGUSR2, which instructs the checkpointer to
  * execute a shutdown checkpoint and then exit(0).  (All backends must be
  * stopped before SIGUSR2 is issued!)  Emergency termination is by SIGQUIT;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index edab95a19e..4d5b4b9775 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1392,6 +1392,12 @@ PostmasterMain(int argc, char *argv[])
 	 */
 	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STARTING);
 
+	/* Start bgwriter and checkpointer so they can help with recovery */
+	if (CheckpointerPID == 0)
+		CheckpointerPID = StartCheckpointer();
+	if (BgWriterPID == 0)
+		BgWriterPID = StartBackgroundWriter();
+
 	/*
 	 * We're ready to rock and roll...
 	 */
@@ -1754,7 +1760,7 @@ ServerLoop(void)
 		 * fails, we'll just try again later.  Likewise for the checkpointer.
 		 */
 		if (pmState == PM_RUN || pmState == PM_RECOVERY ||
-			pmState == PM_HOT_STANDBY)
+			pmState == PM_HOT_STANDBY || pmState == PM_STARTUP)
 		{
 			if (CheckpointerPID == 0)
 				CheckpointerPID = StartCheckpointer();
@@ -5125,15 +5131,6 @@ sigusr1_handler(SIGNAL_ARGS)
 		FatalError = false;
 		AbortStartTime = 0;
 
-		/*
-		 * Crank up the background tasks.  It doesn't matter if this fails,
-		 * we'll just try again later.
-		 */
-		Assert(CheckpointerPID == 0);
-		CheckpointerPID = StartCheckpointer();
-		Assert(BgWriterPID == 0);
-		BgWriterPID = StartBackgroundWriter();
-
 		/*
 		 * Start the archiver if we're responsible for (re-)archiving received
 		 * files.
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 708215614d..ee18898ea2 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -129,10 +129,10 @@ InitSync(void)
 {
 	/*
 	 * Create pending-operations hashtable if we need it.  Currently, we need
-	 * it if we are standalone (not under a postmaster) or if we are a startup
-	 * or checkpointer auxiliary process.
+	 * it if we are standalone (not under a postmaster) or if we are a
+	 * checkpointer auxiliary process.
 	 */
-	if (!IsUnderPostmaster || AmStartupProcess() || AmCheckpointerProcess())
+	if (!IsUnderPostmaster || AmCheckpointerProcess())
 	{
 		HASHCTL		hash_ctl;
 
@@ -589,27 +589,3 @@ RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
 
 	return ret;
 }
-
-/*
- * In archive recovery, we rely on checkpointer to do fsyncs, but we will have
- * already created the pendingOps during initialization of the startup
- * process.  Calling this function drops the local pendingOps so that
- * subsequent requests will be forwarded to checkpointer.
- */
-void
-EnableSyncRequestForwarding(void)
-{
-	/* Perform any pending fsyncs we may have queued up, then drop table */
-	if (pendingOps)
-	{
-		ProcessSyncRequests();
-		hash_destroy(pendingOps);
-	}
-	pendingOps = NULL;
-
-	/*
-	 * We should not have any pending unlink requests, since mdunlink doesn't
-	 * queue unlink requests when isRedo.
-	 */
-	Assert(pendingUnlinks == NIL);
-}
diff --git a/src/include/storage/sync.h b/src/include/storage/sync.h
index fbdf34f762..6fd50cfa7b 100644
--- a/src/include/storage/sync.h
+++ b/src/include/storage/sync.h
@@ -60,7 +60,6 @@ extern void SyncPreCheckpoint(void);
 extern void SyncPostCheckpoint(void);
 extern void ProcessSyncRequests(void);
 extern void RememberSyncRequest(const FileTag *ftag, SyncRequestType type);
-extern void EnableSyncRequestForwarding(void);
 extern bool RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
 								bool retryOnError);
 
-- 
2.30.1

v2-0002-Log-buffer-stats-after-crash-recovery.patch (text/x-patch)
From 3568596c9e2a6331f60d8e25bff35546279a42e4 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 13 Mar 2021 00:57:38 +1300
Subject: [PATCH v2 2/3] Log buffer stats after crash recovery.

Author: Simon Riggs <simon@2ndquadrant.com>
Discussion: https://postgr.es/m/CANP8%2BjLqwRoKQEvAi%3DiRhSwgL1JWNxiUhx%2BRyG-D82CE_qrDew%40mail.gmail.com
---
 src/backend/access/transam/xlog.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6e8e6cf1e4..b10b1b614a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7574,6 +7574,29 @@ StartupXLOG(void)
 					(errmsg("redo done at %X/%X system usage: %s",
 							LSN_FORMAT_ARGS(ReadRecPtr),
 							pg_rusage_show(&ru0))));
+			if (!InArchiveRecovery)
+			{
+				if (!INSTR_TIME_IS_ZERO(pgBufferUsage.blk_read_time))
+					ereport(LOG,
+						(errmsg("crash recovery complete: wrote %ld buffers (%.1f%%) (%0.3f ms); "
+								"dirtied %ld buffers; read %ld buffers (%0.3f ms)",
+								pgBufferUsage.shared_blks_written,
+								(double) pgBufferUsage.shared_blks_written * 100 / NBuffers,
+								INSTR_TIME_GET_MILLISEC(pgBufferUsage.blk_write_time),
+								pgBufferUsage.shared_blks_dirtied,
+								pgBufferUsage.shared_blks_read,
+								INSTR_TIME_GET_MILLISEC(pgBufferUsage.blk_read_time)
+								)));
+				else
+					ereport(LOG,
+						(errmsg("crash recovery complete: wrote %ld buffers (%.1f%%); "
+								"dirtied %ld buffers; read %ld buffers",
+								pgBufferUsage.shared_blks_written,
+								(double) pgBufferUsage.shared_blks_written * 100 / NBuffers,
+								pgBufferUsage.shared_blks_dirtied,
+								pgBufferUsage.shared_blks_read
+								)));
+			}
 			xtime = GetLatestXTime();
 			if (xtime)
 				ereport(LOG,
-- 
2.30.1

v2-0003-Optionally-don-t-wait-for-end-of-recovery-checkpo.patch (text/x-patch)
From 6420ae1112818ed7e4f4d5ecd306ef10f8873556 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 28 Aug 2020 16:35:25 +1200
Subject: [PATCH v2 3/3] Optionally, don't wait for end-of-recovery checkpoint.

Add a GUC to specify whether you want to wait for the end-of-recovery
checkpoint to complete before continuing.  Default to the traditional
behavior, "on".

XXX This probably needs a better GUC name.  There may also be problems with
timelines; see the nearby fast promotion code.

XXX This is experimental code only!  There may be fundamental problems
with the concept.  Or not.
---
 src/backend/access/transam/xlog.c | 23 +++++++++++++++--------
 src/backend/utils/misc/guc.c      |  9 +++++++++
 src/include/access/xlog.h         |  1 +
 3 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b10b1b614a..4c001c466b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -111,6 +111,7 @@ int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
 int			wal_retrieve_retry_interval = 5000;
 int			max_slot_wal_keep_size_mb = -1;
 bool		track_wal_io_timing = false;
+bool		end_of_recovery_checkpoint_wait = true;
 
 #ifdef WAL_DEBUG
 bool		XLOG_DEBUG = false;
@@ -7874,6 +7875,18 @@ StartupXLOG(void)
 
 	if (InRecovery)
 	{
+		int		checkpoint_flags = CHECKPOINT_IMMEDIATE;
+
+		/*
+		 * Should we wait for the end-of-recovery checkpoint to complete?
+		 * Traditionally we do this, but it may be useful to allow connections
+		 * sooner.
+		 */
+		if (end_of_recovery_checkpoint_wait)
+			checkpoint_flags |= CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_WAIT;
+		else
+			checkpoint_flags |= CHECKPOINT_FORCE;
+
 		/*
 		 * Perform a checkpoint to update all our recovery activity to disk.
 		 *
@@ -7919,16 +7932,10 @@ StartupXLOG(void)
 			}
 
 			if (!promoted)
-				RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
-								  CHECKPOINT_IMMEDIATE |
-								  CHECKPOINT_WAIT);
+				RequestCheckpoint(checkpoint_flags);
 		}
 		else
-		{
-			RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
-							  CHECKPOINT_IMMEDIATE |
-							  CHECKPOINT_WAIT);
-		}
+			RequestCheckpoint(checkpoint_flags);
 	}
 
 	if (ArchiveRecoveryRequested)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 855076b1fd..e59194a3f8 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2058,6 +2058,15 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"end_of_recovery_checkpoint_wait", PGC_SIGHUP, WAL_CHECKPOINTS,
+			gettext_noop("Whether to wait for the end-of-recovery checkpoint to complete."),
+		},
+		&end_of_recovery_checkpoint_wait,
+		true,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 6d384d3ce6..77b4560f4b 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -132,6 +132,7 @@ extern char *PrimaryConnInfo;
 extern char *PrimarySlotName;
 extern bool wal_receiver_create_temp_slot;
 extern bool track_wal_io_timing;
+extern bool end_of_recovery_checkpoint_wait;
 
 /* indirectly set via GUC system */
 extern TransactionId recoveryTargetXid;
-- 
2.30.1

#7Aleksander Alekseev
aleksander@timescale.com
In reply to: Thomas Munro (#6)
Re: Background writer and checkpointer in crash recovery

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

v2-0001 and v2-0002 look fine, but I don't like much the idea of introducing a new GUC in v2-0003. It's for very specific needs, which most of the users, I believe, don't care about. I suggest dealing with v2-0001 and v2-0002 first and then maybe submit and discuss v2-0003 as a separate CF entry.

Tested on macOS against master `1ec7fca8`.

The new status of this patch is: Ready for Committer

#8Robert Haas
robertmhaas@gmail.com
In reply to: Aleksander Alekseev (#7)
Re: Background writer and checkpointer in crash recovery

On Fri, Jul 30, 2021 at 4:42 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:

v2-0001 and v2-0002 look fine, but I don't like much the idea of introducing a new GUC in v2-0003. It's for very specific needs, which most of the users, I believe, don't care about. I suggest dealing with v2-0001 and v2-0002 first and then maybe submit and discuss v2-0003 as a separate CF entry.

Hi!

Thanks for bumping this thread; I had forgotten all about this effort,
but having just spent a bunch of time struggling with the thicket of
cases in StartupXLOG(), I'm now feeling highly motivated to make some
more progress in simplifying things over there. I am still of the
opinion that 0001 is a good idea, and I don't have any suggestions for
how it could be improved, except perhaps that the call to
PublishStartupProcessInformation() could maybe have a one-line
comment. Thomas, are you planning to press forward with committing
this soon? If not, do you mind if I do?

Regarding Simon's 0002, I wonder why it's useful to print this
information out at the end of crash recovery but not at the end of
archive recovery. It seems to me that if the information is useful
enough to be worth printing, it's probably good to print it in both
cases. In fact, rather than adding a separate message for this
information, I think we should just change the existing "redo done at"
message to print the details Simon proposes rather than what it does
now. Currently, we get output like this:

2021-07-30 09:13:05.319 EDT [48380] LOG: redo starts at 0/23A6E18
2021-07-30 09:13:05.612 EDT [48380] LOG: redo done at 0/D0D9CE8
system usage: CPU: user: 0.13 s, system: 0.12 s, elapsed: 0.29 s

With Simon's patch, I get something like this:

2021-07-30 09:39:43.579 EDT [63702] LOG: redo starts at 0/14A2F48
2021-07-30 09:39:44.129 EDT [63702] LOG: redo done at 0/15F48230
system usage: CPU: user: 0.25 s, system: 0.25 s, elapsed: 0.55 s
2021-07-30 09:39:44.129 EDT [63702] LOG: crash recovery complete:
wrote 36517 buffers (222.9%); dirtied 52985 buffers; read 7 buffers

Now I really think that information on the number of buffers touched
and how long it took is way more useful than user and system time.
Knowing how much user and system time were spent doesn't really tell
you anything, but a count of buffers touched gives you some meaningful
idea of how much work recovery did, and whether I/O was slow. Elapsed
time you can figure out yourself from the timestamp. However, I don't
really like printing the percentage here; unlike the checkpoint case,
it can very easily be way more than a hundred percent, and I think
that will confuse people. It could be tens of thousands of percent,
really, or even more.

So my proposal is:

redo done at %X/%X: wrote %ld buffers (%0.3f ms); dirtied %ld buffers;
read %ld buffers (%0.3f ms)

Regarding 0003, I agree with Alexander's comment that a GUC doesn't
seem particularly appropriate, but I also think that the approach may
not be quite right. In the promotion case, we emit an end-of-recovery
record and then later in the code we trigger a checkpoint. In your
patch, there's no end-of-recovery checkpoint -- you just trigger a
checkpoint instead of waiting for it. I think it's probably better to
make those two cases work the same. The end-of-recovery record isn't
needed to change the TLI as it is in the promotion case, but (1) it
seems better to have fewer code paths and (2) it might be good for
debuggability.

--
Robert Haas
EDB: http://www.enterprisedb.com

#9Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#8)
Re: Background writer and checkpointer in crash recovery

Hi,

On 2021-07-30 10:16:44 -0400, Robert Haas wrote:

2021-07-30 09:39:43.579 EDT [63702] LOG: redo starts at 0/14A2F48
2021-07-30 09:39:44.129 EDT [63702] LOG: redo done at 0/15F48230
system usage: CPU: user: 0.25 s, system: 0.25 s, elapsed: 0.55 s
2021-07-30 09:39:44.129 EDT [63702] LOG: crash recovery complete:
wrote 36517 buffers (222.9%); dirtied 52985 buffers; read 7 buffers

Now I really think that information on the number of buffers touched
and how long it took is way more useful than user and system time.
Knowing how much user and system time were spent doesn't really tell
you anything, but a count of buffers touched gives you some meaningful
idea of how much work recovery did, and whether I/O was slow.

I don't agree with that? If (user+system) << wall then it is very likely
that recovery is IO bound. If system is a large percentage of wall, then
shared buffers is likely too small (or we're replacing the wrong
buffers) because you spend a lot of time copying data in/out of the
kernel page cache. If user is the majority, you're CPU bound.

Without user & system time it's much harder to figure that out - at
least for me.
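
For illustration, applying that reading to the log line quoted above:

    wait time = elapsed - (user + system) = 0.55 - (0.25 + 0.25) = 0.05 s

so only about 9% of the wall time is unaccounted for and that run was
essentially CPU bound; with the same CPU figures but, say, 5 s of
elapsed time, roughly 90% of the time would have been spent waiting,
most likely on I/O.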

In your patch, there's no end-of-recovery checkpoint -- you just
trigger a checkpoint instead of waiting for it. I think it's probably
better to make those two cases work the same. The end-of-recovery
record isn't needed to change the TLI as it is in the promotion case,
but (1) it seems better to have fewer code paths and (2) it might be
good for debuggability.

+1

In addition, the end-of-recovery record also good for e.g. hot standby,
logical decoding, etc, because it's a point where no write transactions
can be in progress...

Greetings,

Andres Freund

#10Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#8)
Re: Background writer and checkpointer in crash recovery

On Sat, Jul 31, 2021 at 2:16 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jul 30, 2021 at 4:42 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:

v2-0001 and v2-0002 look fine, but I don't like much the idea of introducing a new GUC in v2-0003. It's for very specific needs, which most of the users, I believe, don't care about. I suggest dealing with v2-0001 and v2-0002 first and then maybe submit and discuss v2-0003 as a separate CF entry.

Thanks.

Thanks for bumping this thread; I had forgotten all about this effort,
but having just spent a bunch of time struggling with the thicket of
cases in StartupXLOG(), I'm now feeling highly motivated to make some
more progress in simplifying things over there. I am still of the
opinion that 0001 is a good idea, and I don't have any suggestions for
how it could be improved,

That's good news, and thanks. Yes, clearly there is much more that
can be simplified here.

except perhaps that the call to
PublishStartupProcessInformation() could maybe have a one-line
comment.

Done. BTW that is temporary, as I'm planning to remove that machinery soon[1].

Thomas, are you planning to press forward with committing
this soon? If not, do you mind if I do?

I pushed 0001. Let me think about 0002, and flesh out 0003 a bit more.

[1]: /messages/by-id/CA+hUKGLYRyDaneEwz5Uya_OgFLMx5BgJfkQSD=q9HmwsfRRb-w@mail.gmail.com

#11Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#9)
Re: Background writer and checkpointer in crash recovery

On Fri, Jul 30, 2021 at 4:00 PM Andres Freund <andres@anarazel.de> wrote:

I don't agree with that? If (user+system) << wall then it is very likely
that recovery is IO bound. If system is a large percentage of wall, then
shared buffers is likely too small (or we're replacing the wrong
buffers) because you spend a lot of time copying data in/out of the
kernel page cache. If user is the majority, you're CPU bound.

Without user & system time it's much harder to figure that out - at
least for me.

Oh, that's an interesting point. At least now I'll know why I am
supposed to care about that log line the next time I see it. I guess
we could include both things, though the line might get a little long.
Or maybe there's some other subset that would make sense.

--
Robert Haas
EDB: http://www.enterprisedb.com

#12Robert Haas
robertmhaas@gmail.com
In reply to: Thomas Munro (#10)
1 attachment(s)
Re: Background writer and checkpointer in crash recovery

On Mon, Aug 2, 2021 at 1:37 AM Thomas Munro <thomas.munro@gmail.com> wrote:

I pushed 0001.

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

0001-Further-simplify-a-bit-of-logic-in-StartupXLOG.patch (application/octet-stream)
From 53b48371693fed72fb9625f1af16786f92700e7b Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 2 Aug 2021 09:12:56 -0400
Subject: [PATCH] Further simplify a bit of logic in StartupXLOG().

Commit 7ff23c6d277d1d90478a51f0dd81414d343f3850 left us with two
identical cases. Collapse them.
---
 src/backend/access/transam/xlog.c | 34 +++++++++++++------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f84c0bb01e..8b39a2fdaa 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7890,28 +7890,22 @@ StartupXLOG(void)
 		 * after we're fully out of recovery mode and already accepting
 		 * queries.
 		 */
-		if (ArchiveRecoveryRequested && IsUnderPostmaster)
+		if (ArchiveRecoveryRequested && IsUnderPostmaster &&
+			LocalPromoteIsTriggered)
 		{
-			if (LocalPromoteIsTriggered)
-			{
-				promoted = true;
+			promoted = true;
 
-				/*
-				 * Insert a special WAL record to mark the end of recovery,
-				 * since we aren't doing a checkpoint. That means that the
-				 * checkpointer process may likely be in the middle of a
-				 * time-smoothed restartpoint and could continue to be for
-				 * minutes after this. That sounds strange, but the effect is
-				 * roughly the same and it would be stranger to try to come
-				 * out of the restartpoint and then checkpoint. We request a
-				 * checkpoint later anyway, just for safety.
-				 */
-				CreateEndOfRecoveryRecord();
-			}
-			else
-				RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
-								  CHECKPOINT_IMMEDIATE |
-								  CHECKPOINT_WAIT);
+			/*
+			 * Insert a special WAL record to mark the end of recovery, since
+			 * we aren't doing a checkpoint. That means that the checkpointer
+			 * process may likely be in the middle of a time-smoothed
+			 * restartpoint and could continue to be for minutes after this.
+			 * That sounds strange, but the effect is roughly the same and it
+			 * would be stranger to try to come out of the restartpoint and
+			 * then checkpoint. We request a checkpoint later anyway, just for
+			 * safety.
+			 */
+			CreateEndOfRecoveryRecord();
 		}
 		else
 		{
-- 
2.24.3 (Apple Git-128)

#13Amul Sul
sulamul@gmail.com
In reply to: Robert Haas (#12)
Re: Background writer and checkpointer in crash recovery

On Mon, Aug 2, 2021 at 6:47 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Aug 2, 2021 at 1:37 AM Thomas Munro <thomas.munro@gmail.com> wrote:

I pushed 0001.

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

+1, also, can we just get rid of the "promoted" flag? The only
inconvenience is we might need to check three flags instead of one to
perform the checkpoint at the end.

#14Jakub Wartak
Jakub.Wartak@tomtom.com
In reply to: Robert Haas (#11)
RE: Background writer and checkpointer in crash recovery

On Fri, Jul 30, 2021 at 4:00 PM Andres Freund <andres@anarazel.de> wrote:

I don't agree with that? If (user+system) << wall then it is very
likely that recovery is IO bound. If system is a large percentage of
wall, then shared buffers is likely too small (or we're replacing the
wrong
buffers) because you spend a lot of time copying data in/out of the
kernel page cache. If user is the majority, you're CPU bound.

Without user & system time it's much harder to figure that out - at
least for me.

Oh, that's an interesting point. At least now I'll know why I am supposed to care
about that log line the next time I see it. I guess we could include both things,
though the line might get a little long.
Or maybe there's some other subset that would make sense.

Hi Robert,

The email threads from [1] can serve as an indication that having a complete view of
resource usage (user+system+elapsed) is advantageous in different situations and
pretty platform-generic. Also, as Andres and Simon pointed out earlier, the performance
insight into crash recovery/replication is next to nothing, judging just by
looking at the currently emitted log messages. The more there is, the better, I think.

BTW, now that there's this big push for refactoring StartupXLOG(), what
frustrating^H^H^H^H^H could be done better - at least from an end-user point of view -
is that there is a lack of near-real-time cyclic messages (every 1 min?) about current status,
performance and maybe even an ETA (simplistic case; assuming it is linear).
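
A naive sketch of that linear estimate (invented variable names, just
to illustrate the idea):

    /* assume replay speed stays constant from redo start to end of WAL */
    double  elapsed = seconds_since_redo_started();   /* hypothetical helper */
    uint64  done    = replay_lsn - redo_start_lsn;
    uint64  total   = end_of_wal_lsn - redo_start_lsn;
    double  eta     = (done > 0) ? elapsed * (double) (total - done) / done : 0;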

[1]: https://commitfest.postgresql.org/30/2799/

#15Robert Haas
robertmhaas@gmail.com
In reply to: Amul Sul (#13)
Re: Background writer and checkpointer in crash recovery

On Mon, Aug 2, 2021 at 9:40 AM Amul Sul <sulamul@gmail.com> wrote:

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

+1, also, can we just get rid off "promoted" flag? The only
inconvenience is we might need to check three flags instead of one to
perform the checkpoint at the end.

I'm not sure that's really a win, because if we use the same
conditional in both places then it might not be clear to somebody that
they're supposed to match.

I do think we ought to consider renaming the flag, though, because
LocalPromoteIsTriggered is already tracking whether we were promoted,
and what this flag is tracking is not quite that thing. Its real
purpose is to track whether we should request a non-immediate
checkpoint at the end of recovery, and I think we ought to name it
some way that reflects that, e.g. perform_checkpoint_later.

--
Robert Haas
EDB: http://www.enterprisedb.com

#16Robert Haas
robertmhaas@gmail.com
In reply to: Jakub Wartak (#14)
Re: Background writer and checkpointer in crash recovery

On Mon, Aug 2, 2021 at 9:51 AM Jakub Wartak <Jakub.Wartak@tomtom.com> wrote:

BTW, if you now there's this big push for refactoring StartupXLOG() then what
frustrating^H^H^H^H^H could be done better - at least from end-user point of view -
is that there is lack of near real time cyclic messages (every 1min?) about current status,
performance and maybe even ETA (simplistic case; assuming it is linear).

I agree. See /messages/by-id/CA+TgmoaHQrgDFOBwgY16XCoMtXxsrVGFB2jNCvb7-ubuEe1MGg@mail.gmail.com
and subsequent discussion.

--
Robert Haas
EDB: http://www.enterprisedb.com

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#12)
Re: Background writer and checkpointer in crash recovery

On Tue, Aug 3, 2021 at 1:17 AM Robert Haas <robertmhaas@gmail.com> wrote:

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

LGTM.

#18Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#17)
Re: Background writer and checkpointer in crash recovery

On Tue, Aug 3, 2021 at 9:52 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Aug 3, 2021 at 1:17 AM Robert Haas <robertmhaas@gmail.com> wrote:

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

LGTM.

And pushed.

#19Justin Pryzby
pryzby@telsasoft.com
In reply to: Thomas Munro (#18)
Re: Background writer and checkpointer in crash recovery

On Tue, Aug 03, 2021 at 02:19:22PM +1200, Thomas Munro wrote:

On Tue, Aug 3, 2021 at 9:52 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Aug 3, 2021 at 1:17 AM Robert Haas <robertmhaas@gmail.com> wrote:

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

LGTM.

And pushed.

Gripe: this made the "ps" display worse than before.

7ff23c6d2 Run checkpointer and bgwriter in crash recovery.

A couple years ago, I complained that during the end-of-recovery
checkpoint, the "ps" display still said "recovering NNNNN", which made
it look like it was stuck on a particular WAL file.

That led to commit df9274adf, which updated the startup process's "ps"
to say "end-of-recovery checkpoint".

But since the startup process no longer does the checkpoint, it still
says:

postgres 19738 11433 5 19:33 ? 00:00:01 postgres: startup recovering 000000010000000C000000FB
postgres 19739 11433 3 19:33 ? 00:00:00 postgres: checkpointer performing end-of-recovery checkpoint

That looks inconsistent. It'd be better if the startup process's "ps"
were cleared.

--
Justin

#20Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#19)
Re: Background writer and checkpointer in crash recovery

On Sun, Sep 11, 2022 at 07:54:43PM -0500, Justin Pryzby wrote:

On Tue, Aug 03, 2021 at 02:19:22PM +1200, Thomas Munro wrote:

On Tue, Aug 3, 2021 at 9:52 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Aug 3, 2021 at 1:17 AM Robert Haas <robertmhaas@gmail.com> wrote:

That's great. I just realized that this leaves us with identical
RequestCheckpoint() calls in two nearby places. Is there any reason
not to further simplify as in the attached?

LGTM.

And pushed.

Gripe: this made the "ps" display worse than before.

7ff23c6d2 Run checkpointer and bgwriter in crash recovery.

A couple years ago, I complained that during the end-of-recovery
checkpoint, the "ps" display still said "recovering NNNNN", which made
it look like it was stuck on a particular WAL file.

That led to commit df9274adf, which updated the startup process's "ps"
to say "end-of-recovery checkpoint".

But since the startup process no longer does the checkpoint, it still
says:

postgres 19738 11433 5 19:33 ? 00:00:01 postgres: startup recovering 000000010000000C000000FB
postgres 19739 11433 3 19:33 ? 00:00:00 postgres: checkpointer performing end-of-recovery checkpoint

That looks inconsistent. It'd be better if the startup process's "ps"
were cleared.

Like this, maybe.

It's similar to what I suggested to consider backpatching here:
/messages/by-id/20201214032224.GA30237@telsasoft.com

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7a710e6490d..4bf8614d45f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5302,6 +5302,7 @@ StartupXLOG(void)
 	EndOfLogTLI = endOfRecoveryInfo->endOfLogTLI;
 	abortedRecPtr = endOfRecoveryInfo->abortedRecPtr;
 	missingContrecPtr = endOfRecoveryInfo->missingContrecPtr;
+	set_ps_display("");

 	/*
 	 * When recovering from a backup (we are in recovery, and archive recovery

#21Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#20)
Re: Background writer and checkpointer in crash recovery

On Mon, Sep 12, 2022 at 09:13:11PM -0500, Justin Pryzby wrote:

Like this, maybe.

It's similar to what I suggested to consider backpatching here:
/messages/by-id/20201214032224.GA30237@telsasoft.com

At the same time, df9274adf was done because the end-of-recovery
checkpoint, potentially performed by the startup process if the checkpointer
was not up at the end of recovery, could take a long time. None of the actions
taken in the startup process when finishing recovery are that
costly, quite the contrary, as far as I know.

I am not opposed to clearing the ps display when we are at this state
of the game for the startup process, but rather than clearing it we
could switch it to something more useful that shows what's
happening. Say a simple "performing end-of-recovery actions", for
example.
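
Concretely, in the hunk Justin posted above, that would just be:

    set_ps_display("performing end-of-recovery actions");

in place of the empty string.
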
--
Michael

#22Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#21)
Re: Background writer and checkpointer in crash recovery

On Tue, Sep 13, 2022 at 01:32:11PM +0900, Michael Paquier wrote:

On Mon, Sep 12, 2022 at 09:13:11PM -0500, Justin Pryzby wrote:

Like this, maybe.

It's similar to what I suggested to consider backpatching here:
/messages/by-id/20201214032224.GA30237@telsasoft.com

At the same time, df9274adf has been done because the end-of-recovery
checkpoint done potentially by the startup process if the checkpointer
was not up at the end of recovery would take long. Any of the actions
taken in the startup process when finishing recovery are not that
costly, on the contrary, as far as I know.

Your interpretation may be a bit different from mine.

The central problem for me was that the startup process said "recovering
NNN" after it was done recovering NNN.

To fix that, df9274adf took the approach of overwriting the existing PS
with "end-of-recovery checkpoint", which also adds some extra value...

I am not opposed to clearing the ps display when we are at this state
of the game for the startup process, but rather than clearing it we
could switch provide something more useful that shows what's
happening. Say a simple "performing end-of-recovery actions", for
example..

...but the minimal solution (which 2 years ago I suggested could've been
backpatched) is to *clear* the PS (in master branch at the time, that
would've been before also changing it to say stuff about the
checkpoint).

If we'd done that, I probably wouldn't be griping about it now, so my
preference is to *clear* the display as soon as the WAL processing is
done; further overwriting it with additional stuff can be left as a
future enhancement (like "syncing FS" that I brought up last year, and
maybe everything else that calls ereport_startup_progress()).

#23Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#22)
Re: Background writer and checkpointer in crash recovery

I don't know that this warrants an Open Item, but I think some fix
ought to be applied to v15, whether that happens this week or next
month.

#24Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#23)
Re: Background writer and checkpointer in crash recovery

On Thu, Sep 15, 2022 at 10:19:01PM -0500, Justin Pryzby wrote:

I don't know that this warrants an Opened Item, but I think some fix
ought to be applied to v15, whether that happens this week or next
month.

With RC1 getting close, I have looked at that again and applied a
patch that resets the ps display after recovery is done, bringing us
back to the same state as PG <= 14 in the startup process.
--
Michael