PATCH: regular logging of checkpoint progress

Started by Tomas Vondra · over 14 years ago · 46 messages · pgsql-hackers
#1Tomas Vondra
tomas.vondra@2ndquadrant.com

Hello,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple - it adds a new integer GUC
"checkpoint_update_limit" and every time checkpoint writes this number of
buffers, it does two things:

(a) logs a "checkpoint status" message into the server log, with info
about total number of buffers to write, number of already written buffers,
current and average write speed and estimate of remaining time

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)

I believe this will make checkpoint tuning easier, especially with large
shared buffers and when there's other write activity (so that it's
difficult to see the checkpoint I/O).

The default value (0) means this continuous logging is disabled.

Tomas

Attachments:

checkpoint-stats-update.diff (text/plain, +92 -0)
#2Josh Berkus
josh@agliodbs.com
In reply to: Tomas Vondra (#1)
Re: PATCH: regular logging of checkpoint progress

Tomas,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple - it adds a new integer GUC
"checkpoint_update_limit" and every time checkpoint writes this number of
buffers, it does two things:

I'd rather not have a new GUC if we can avoid it. What about just
making this some reasonable value (like 1000 buffers) if log_checkpoints
= on?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#3Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Josh Berkus (#2)
Re: PATCH: regular logging of checkpoint progress

On 26 August 2011, 0:18, Josh Berkus wrote:

Tomas,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple - it adds a new integer GUC
"checkpoint_update_limit" and every time checkpoint writes this number of
buffers, it does two things:

I'd rather not have a new GUC if we can avoid it. What about just
making this some reasonable value (like 1000 buffers) if log_checkpoints
= on?

I was thinking about that too, but I think no value can fit all cases
because the systems may have very different I/O subsystems.

With one 7.2k drive I usually get about 25MB/s on average, with big arrays
/ good controllers / fast drives you can write much faster. So either the
value will be too low (and the log will be infested with those messages)
or too high (so it won't be updated very often).

And it depends on the number of shared buffers too. I thought it could be
something like 5% of shared buffers but it really does not solve the
issues. So I think there should be a new GUC.

OTOH, it's probably a good idea to log this only when log_checkpoints
is enabled, as those who tune checkpoints will enable log_checkpoints in
the first place.

Tomas

#4Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#3)
Re: PATCH: regular logging of checkpoint progress

On 26 August 2011, 0:39, Tomas Vondra wrote:

On 26 August 2011, 0:18, Josh Berkus wrote:

Tomas,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple - it adds a new integer GUC
"checkpoint_update_limit" and every time checkpoint writes this number of
buffers, it does two things:

I'd rather not have a new GUC if we can avoid it. What about just
making this some reasonable value (like 1000 buffers) if log_checkpoints
= on?

I was thinking about that too, but I think no value can fit all cases
because the systems may have very different I/O subsystems.

With one 7.2k drive I usually get about 25MB/s on average, with big arrays
/ good controllers / fast drives you can write much faster. So either the
value will be too low (and the log will be infested with those messages)
or too high (so it won't be updated very often).

Hmmm, maybe we could use time instead of the number of buffers? Something
like "every 5 seconds, log the checkpoint progress" instead of "every time
1000 buffers are written ..."

That should work on systems regardless of I/O performance.

But although using time instead of number of buffers seems like a good
idea, I don't think it eliminates the need for a new GUC.

(a) Even with a time limit, I find it useful to be able to set the limit
differently.

(b) In some cases it may be useful to enable just basic (current
behaviour) checkpoint logging using log_checkpoints, so using it for this
new patch may not be a good idea. Although this could be fixed by allowing
three values no/basic/detailed instead of just true/false.

Tomas

#5Greg Smith
gsmith@gregsmith.com
In reply to: Tomas Vondra (#1)
Re: PATCH: regular logging of checkpoint progress

On 08/25/2011 04:57 PM, Tomas Vondra wrote:

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)

The idea behind only updating the stats in one chunk, at the end, is
that it makes one specific thing easier to do. Let's say you're running
a monitoring system that is grabbing snapshots of pg_stat_bgwriter
periodically. If you want to figure out how much work a checkpoint did,
you only need two points of data to compute that right now. Whenever
you see either of the checkpoint count numbers increase, you just
subtract off the previous sample; now you've got a delta for how many
buffers that checkpoint wrote out. You can derive the information about
the buffer counts involved that appears in the logs quite easily this
way. The intent was to make that possible to do, so that people can
figure this out without needing to parse the log data.

Spreading out the updates defeats that idea. It also makes it possible
to see the buffer writes more in real-time, as they happen. You can
make a case for both approaches having their use cases; the above is
just summarizing the logic behind why it's done the way it is right
now. I don't think many people are actually doing things with this to
the level where their tool will care. The most popular consumer of
pg_stat_bgwriter data I see is Munin graphing changes, and I don't think
it will care either way.

Giving people the option of doing it the other way is a reasonable idea,
but I'm not sure there's enough use case there to justify adding a GUC
just for that. My next goal here is to eliminate checkpoint_segments,
not to add yet another tunable that extremely few users would ever touch.

As for throwing more log data out, I'm not sure what new analysis you're
thinking of that it allows. I/O gets increasingly spiky as you zoom in
on it; averaging over a shorter period can easily end up providing less
insight about trends. If anything, I spend more time summarizing the
data that's already there, rather than wanting to break them down. It's
already providing way too much detail for most people. Customers tell
me they don't care to see checkpoint stats unless they're across a day
or more of sampling, so even the current "once every ~5 minutes" is way
more info than they want. I have all this log parsing code and things
that look at pg_stat_bgwriter to collect that data and produce higher
level reports. And lots of it would break if any of this patch is added
and people turn it on. I imagine other log/stat parsing programs might
suffer issues too. That's your other hurdle for change here: the new
analysis techniques have to be useful enough to justify that some
downstream tool disruption is inevitable.

If you have an idea for how to use this extra data for something useful,
let's talk about what that is and see if it's possible to build it in
instead. This problem is harder than it looks, mainly because the way
the OS caches writes here makes it impossible to derive hard numbers from
what the background writer is doing. When the database writes things
out, and when they actually get written to disk, they are not the same
event. The actual write often happens during the sync phase, and not
being able to track that beast is where I see the most problems. The
write phase, the easier part to instrument in the database, is pretty
boring. That's why the last extra logging I added here focused on adding
visibility to the sync activity instead.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#6Magnus Hagander
magnus@hagander.net
In reply to: Tomas Vondra (#1)
Re: PATCH: regular logging of checkpoint progress

On Thu, Aug 25, 2011 at 22:57, Tomas Vondra <tv@fuzzy.cz> wrote:

Hello,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple - it adds a new integer GUC
"checkpoint_update_limit" and every time checkpoint writes this number of
buffers, it does two things:

(a) logs a "checkpoint status" message into the server log, with info
about total number of buffers to write, number of already written buffers,
current and average write speed and estimate of remaining time

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)

I believe this will make checkpoint tuning easier, especially with large
shared buffers and when there's other write activity (so that it's
difficult to see the checkpoint I/O).

The default value (0) means this continuous logging is disabled.

This seems like the wrong thing to write to the log. It's really only
useful *during* the checkpoint run, isn't it? If so, I think it should
go in a pg_stat view. In theory, this could be the "progress view" or
"progress field" talked about around Greg's previous patch - or it
could just be modifying the commandstring in pg_stat_activity. Either
way, it should be updated in shared memory in that case (like current
query is), and not sent with a message to the collector.

IMHO, of course ;)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Greg Smith (#5)
Re: PATCH: regular logging of checkpoint progress

On 26 August 2011, 9:35, Greg Smith wrote:

On 08/25/2011 04:57 PM, Tomas Vondra wrote:

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)

As for throwing more log data out, I'm not sure what new analysis you're
thinking of that it allows. I/O gets increasingly spiky as you zoom in
on it; averaging over a shorter period can easily end up providing less
insight about trends. If anything, I spend more time summarizing the
data that's already there, rather than wanting to break them down. It's
already providing way too much detail for most people. Customers tell
me they don't care to see checkpoint stats unless they're across a day
or more of sampling, so even the current "once every ~5 minutes" is way
more info than they want. I have all this log parsing code and things
that look at pg_stat_bgwriter to collect that data and produce higher
level reports. And lots of it would break if any of this patch is added
and people turn it on. I imagine other log/stat parsing programs might
suffer issues too. That's your other hurdle for change here: the new
analysis techniques have to be useful enough to justify that some
downstream tool disruption is inevitable.

I was aware that by continuously updating pg_stat_bgwriter, the data won't
be synchronized (i.e. the buffers_checkpoint counters will change while
the number of requested/timed checkpoints remain the same).

But does that really break the tools that process the data? When you're
working with summarized data, the result should be more or less the same
as the difference will be smoothed out by averaging etc. You can always
see just one "in progress" checkpoint, so if you get 24 checkpoints a day,
the difference will be 1/24 of a checkpoint. Yes, it's a difference.

A really crazy workaround would be to change checkpoints_requested /
checkpoints_timed to double, and use that to indicate current progress of
the checkpoint. So for example 10.54 would mean 10 checkpoints completed,
one checkpoint in progress, already written 54% of blocks. But yes, that's
a bit crazy.

If you have an idea for how to use this extra data for something useful,
let's talk about what that is and see if it's possible to build it in
instead. This problem is harder than it looks, mainly because the way
the OS caches writes here makes trying to derive hard numbers from what
the background writer is doing impossible. When the database writes
things out, and when they actually get written to disk, they are not the
same event. The actual write is often during the sync phase, and not
being able to tracking that beast is where I see the most problems at.
The write phase, the easier part to instrument in the database, that is
pretty boring. That's why the last extra logging I added here focused
on adding visibility to the sync activity instead.

Hmmm, let me explain what led me to this patch - right now I'm doing a
comparison of filesystems with various block sizes (both fs and db
blocks). I've realized that the db block size significantly influences
frequency of checkpoints and amount of data to write, so I'm collecting
data from pg_stat_bgwriter too. The benchmark goes like this:

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

I can't infer this from other data (e.g. iostat), because that doesn't
tell me which I/O is caused by the checkpoint.

Yes, this does not consider sync timing, but in my case that's not a big
issue (the page cache is rather small so the data are actually forced to
the drive soon).

I could probably live with keeping the current pg_stat_bgwriter logic
(i.e. updating just once) and writing checkpoint status just to the log. I
don't think that should break any current tools that parse logs, because
the message is completely different (prefixed with 'checkpoint status') so
any reasonably written tool should be OK.

Tomas

#8Greg Smith
gsmith@gregsmith.com
In reply to: Magnus Hagander (#6)
Re: PATCH: regular logging of checkpoint progress

On 08/26/2011 03:54 AM, Magnus Hagander wrote:

In theory, this could be the "progress view" or
"progress field" talked about around Greg's previous patch - or it
could just be modifying the commandstring in pg_stat_activity.

Right. The whole progress indicator idea is hard to do for queries in
general. But there's enough of these other progress indicator ideas
around now that it may be worth putting a standard way to handle them in
here. It sounds like that would be sufficient to address the area Tomas
is trying to instrument better. I badly want a progress indicator on
CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the
build process it's on. That's turned into a major headache recently.

If we run with the idea of just allowing backends to publish a progress
text string, I think this one maps into a similar space as the
autovacuum one. Publishing how many seconds the operation has been
running for may be reasonable too. Whether the overhead of the timing
calls necessary to compute that will be high or not depends on the
refresh rate of the progress info. My suggestion before was to name
these as key=value pairs for easy parsing; here's three examples now:

autovacuum: pgbench_accounts h=182701 m=301515 d=321345 s=62.231
(cache hits, cache misses, dirty writes, seconds)

background writer: checkpoint b=511 t=3072 s=5.321
(buffers written, total, seconds)

create index concurrently: pgbench_accounts p=1 b=62 t=6213 s=81.232
(phase, blocks processed, total block estimate, seconds)

I think that the idea of making this easily human readable is
optimistic, because it will make all these strings big enough to start
mattering. Given that, we almost have to assume the only consumers of
this data will be able to interpret it using the documentation. I'd be
happy with just the minimal data set in each case, not including any
statistics you can easily derive from the values given (like the MB/s
readings). Adding that figure in particular to more of the log messages
would be nice though.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#9Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#6)
Re: PATCH: regular logging of checkpoint progress

On 26 August 2011, 9:54, Magnus Hagander wrote:

This seems like the wrong thing to write to the log. It's really only
useful *during* the checkpoint run, isn't it? If so, I think it should
go in a pg_stat view. In theory, this could be the "progress view" or
"progress field" talked about around Gregs previous patch - or it
could just be modifying the commandstring in pg_stat_activity. Either
way, it should be updated in shared memory in that case (like current
query is), and not sent with a message to the collector.

I personally find it handy for example when I need to find out why
performance degraded at a certain point in the past. Now I can see there
was a checkpoint (thanks to log_checkpoints=on), but I don't know any
details about its progress.

I already collect info from the pg_stat_bgwriter, that's why I thought I
could update it more often. The log file is a natural destination for such
information, IMHO. I see that as an extension to the current checkpoint
messages that are written to the log.

The commandstring approach is probably fine for interactive work, but not
very handy when you need to analyze something that already happened.
Unless you collected the data, of course.

Tomas

#10Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Greg Smith (#8)
Re: PATCH: regular logging of checkpoint progress

On 26 August 2011, 19:17, Greg Smith wrote:

On 08/26/2011 03:54 AM, Magnus Hagander wrote:

In theory, this could be the "progress view" or
"progress field" talked about around Greg's previous patch - or it
could just be modifying the commandstring in pg_stat_activity.

Right. The whole progress indicator idea is hard to do for queries in
general. But there's enough of these other progress indicator ideas
around now that it may be worth putting a standard way to handle them in
here. It sounds like that would be sufficient to address the area Tomas
is trying to instrument better. I badly want a progress indicator on
CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the
build process it's on. That's turned into a major headache recently.

If we run with the idea of just allowing backends to publish a progress
text string, I think this one maps into a similar space as the
autovacuum one. Publishing how many seconds the operation has been
running for may be reasonable too. Whether the overhead of the timing
calls necessary to compute that will be high or not depends on the
refresh rate of the progress info. My suggestion before was to name
these as key=value pairs for easy parsing; here's three examples now:

autovacuum: pgbench_accounts h=182701 m=301515 d=321345 s=62.231
(cache hits, cache misses, dirty writes, seconds)

background writer: checkpoint b=511 t=3072 s=5.321
(buffers written, total, seconds)

create index concurrently: pgbench_accounts p=1 b=62 t=6213 s=81.232
(phase, blocks processed, total block estimate, seconds)

I think that the idea of making this easily human readable is
optimistic, because it will make all these strings big enough to start
mattering. Given that, we almost have to assume the only consumers of
this data will be able to interpret it using the documentation. I'd be
happy with just the minimal data set in each case, not including any
statistics you can easily derive from the values given (like the MB/s
readings). Adding that figure in particular to more of the log messages
would be nice though.

I'm a bit confused - are you talking about updating process title or about
writing the info to log? The process title is probably fine for watching
the progress interactively, but it really does not solve what I need.

I need to be able to infer the progress for past events, so I'd have to
sample the 'ps ax' output regularly.

Tomas

#11Noah Misch
noah@leadboat.com
In reply to: Tomas Vondra (#7)
Re: PATCH: regular logging of checkpoint progress

On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:

Hmmm, let me explain what led me to this patch - right now I'm doing a
comparison of filesystems with various block sizes (both fs and db
blocks). I've realized that the db block size significantly influences
frequency of checkpoints and amount of data to write, so I'm collecting
data from pg_stat_bgwriter too. The benchmark goes like this

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

--
Noah Misch http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#12Greg Smith
gsmith@gregsmith.com
In reply to: Noah Misch (#11)
Re: PATCH: regular logging of checkpoint progress

On 08/27/2011 12:01 AM, Noah Misch wrote:

On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

That's what I do in pgbench-tools, and it helps a lot. It makes it
easier to identify when the checkpoint kicks in if you know it's
approximately the same time after each test run begins, given similar
testing parameters. That said, it's hard to eliminate all of the edge
conditions here.

For example, imagine that you're consuming WAL files such that you hit
checkpoint_segments every 4 minutes. In a 10 minute test run, a
checkpoint will start at 4:00 and finish at around 6:00 (with
checkpoint_completion_target=0.5). The next will start at 8:00 and
should finish at around 10:00--right at the end of when the test ends.
Given the variation that sync timing and rounding issues in the write
phase adds to things, you can expect that some test runs will include
stats from 2 checkpoints, while others will end the test just before the
second one finishes. It does throw the numbers off a bit.

To avoid this when it pops up, I normally aim to push up to where there
are >=4 checkpoints per test run, just so whether I get n or n-1 of them
doesn't impact results as much. But that normally takes doubling the
length of the test to 20 minutes. As it will often take me days of test
time to plow through exploring just a couple of parameters, I'm
sympathetic to Tomas trying to improve accuracy here without having to
run for quite so long. Few people have this problem to worry about,
though; it's a common issue with benchmarking but not in many other
contexts.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#13Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Noah Misch (#11)
Re: PATCH: regular logging of checkpoint progress

On 27 August 2011, 6:01, Noah Misch wrote:

On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

I already do that, but it really does not solve the issue. It just aligns
the first expected 'timed' checkpoint, it does not solve the problem with
in-progress checkpoints unless the runs behave exactly the same (and
that's the boring case).

Tomas

#14Noah Misch
noah@leadboat.com
In reply to: Tomas Vondra (#13)
Re: PATCH: regular logging of checkpoint progress

On Sat, Aug 27, 2011 at 03:57:16PM +0200, Tomas Vondra wrote:

On 27 August 2011, 6:01, Noah Misch wrote:

On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

I already do that, but it really does not solve the issue. It just aligns
the first expected 'timed' checkpoint, it does not solve the problem with
in-progress checkpoints unless the runs behave exactly the same (and
that's the boring case).

To clarify, run that command _after_ the 10-minute pgbench run. It
blocks until completion of both the in-progress checkpoint, if any,
and the requested checkpoint.

--
Noah Misch http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#15Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Noah Misch (#14)
Re: PATCH: regular logging of checkpoint progress

On 29 August 2011, 7:47, Noah Misch wrote:

On Sat, Aug 27, 2011 at 03:57:16PM +0200, Tomas Vondra wrote:

On 27 August 2011, 6:01, Noah Misch wrote:

Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

I already do that, but it really does not solve the issue. It just aligns
the first expected 'timed' checkpoint, it does not solve the problem with
in-progress checkpoints unless the runs behave exactly the same (and
that's the boring case).

To clarify, run that command _after_ the 10-minute pgbench run. It
blocks until completion of both the in-progress checkpoint, if any,
and the requested checkpoint.

I don't think this is going to help when there is a checkpoint in
progress, as the new CHECKPOINT will wait till the already running one
completes. It might force it to finish faster, but I still can't tell how
much data was written while pgbench was running.

Tomas

#16Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#1)
Re: PATCH: regular logging of checkpoint progress

I've prepared a significantly simplified version of the patch. The two
main changes are

(a) it does not update the pg_stat_bgwriter anymore, it just prints an
info to the server log

(b) a new GUC is not required, it's driven by the log_checkpoints

This version will log at least 10 'checkpoint status' lines per checkpoint
(at 10%, 20%, 30%, ...), plus an extra line whenever 5 seconds have elapsed
since the last one. The time is checked only once per 128 buffers, not for
each buffer.

So if the checkpoint is very slow, you'll get a message every 5 seconds;
if it's fast, you'll get 10 messages.

Tomas

Attachments:

checkpoint-stats-update-v2.diff (text/plain, +68 -0)
#17Magnus Hagander
magnus@hagander.net
In reply to: Tomas Vondra (#16)
Re: PATCH: regular logging of checkpoint progress

On Thu, Sep 1, 2011 at 21:59, Tomas Vondra <tv@fuzzy.cz> wrote:

I've prepared a significantly simplified version of the patch. The two
main changes are

(a) it does not update the pg_stat_bgwriter anymore, it just prints an
info to the server log
(b) a new GUC is not required, it's driven by the log_checkpoints

The comment still refers to the checkpoint_update_limit.

This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
30%, ...) and whenever 5 seconds since the last log elapses. The time is
not checked for each buffer but for 128 buffers.

So if the checkpoint is very slow, you'll get a message every 5 seconds,
if it's fast you'll get 10 messages.

I would personally find this very annoying. If I read it correctly,
anybody with a database that has no problem at all but has
log_checkpoints on suddenly gets at least 10 times as many messages? I
generally try to advise my clients *not* to log excessively, because
then they will end up not bothering to read the logs...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#18Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#17)
Re: PATCH: regular logging of checkpoint progress

On 2 September 2011, 9:47, Magnus Hagander wrote:

On Thu, Sep 1, 2011 at 21:59, Tomas Vondra <tv@fuzzy.cz> wrote:

I've prepared a significantly simplified version of the patch. The two
main changes are

(a) it does not update the pg_stat_bgwriter anymore, it just prints an
info to the server log
(b) a new GUC is not required, it's driven by the log_checkpoints

The comment still refers to the checkpoint_update_limit.

OK, I'll fix that.

This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
30%, ...) and whenever 5 seconds since the last log elapses. The time is
not checked for each buffer but for 128 buffers.

So if the checkpoint is very slow, you'll get a message every 5 seconds,
if it's fast you'll get 10 messages.

I would personally find this very annoying. If I read it correctly,
anybody with a database that has no problem at all but has
log_checkpoints on suddenly gets at least 10 times as many messages? I
generally try to advise my clients *not* to log excessively, because
then they will end up not bothering to read the logs...

What about logging it with a lower level, e.g. NOTICE instead of the
current LOG? If that's not a solution then a new GUC is needed I guess.

Tomas

#19Magnus Hagander
magnus@hagander.net
In reply to: Tomas Vondra (#18)
Re: PATCH: regular logging of checkpoint progress

On Fri, Sep 2, 2011 at 11:01, Tomas Vondra <tv@fuzzy.cz> wrote:

On 2 September 2011, 9:47, Magnus Hagander wrote:

On Thu, Sep 1, 2011 at 21:59, Tomas Vondra <tv@fuzzy.cz> wrote:

I've prepared a significantly simplified version of the patch. The two
main changes are

(a) it does not update the pg_stat_bgwriter anymore, it just prints an
info to the server log
(b) a new GUC is not required, it's driven by the log_checkpoints

The comment still refers to the checkpoint_update_limit.

OK, I'll fix that.

This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
30%, ...) and whenever 5 seconds since the last log elapses. The time is
not checked for each buffer but for 128 buffers.

So if the checkpoint is very slow, you'll get a message every 5 seconds,
if it's fast you'll get 10 messages.

I would personally find this very annoying. If I read it correctly,
anybody with a database that has no problem at all but has
log_checkpoints on suddenly gets at least 10 times as many messages? I
generally try to advise my clients *not* to log excessively, because
then they will end up not bothering to read the logs...

What about logging it with a lower level, e.g. NOTICE instead of the
current LOG? If that's not a solution then a new GUC is needed I guess.

I guess if it's at a DEBUG level it won't annoy anybody who doesn't
need it. Not sure if NOTICE is low enough..

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#20Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#19)
Re: PATCH: regular logging of checkpoint progress

On 2 September 2011, 12:45, Magnus Hagander wrote:

On Fri, Sep 2, 2011 at 11:01, Tomas Vondra <tv@fuzzy.cz> wrote:

What about logging it with a lower level, e.g. NOTICE instead of the
current LOG? If that's not a solution then a new GUC is needed I guess.

I guess if it's at a DEBUG level it won't annoy anybody who doesn't
need it. Not sure if NOTICE is low enough..

I've changed the level to NOTICE. I guess I could put it at INFO, but
the debug levels seem too low to me. Is there a rule of thumb about
where to put messages?

Tomas

Attachments:

checkpoint-stats-update-v3.diff (text/plain, +68 -0)
#21Peter Eisentraut
peter_e@gmx.net
In reply to: Tomas Vondra (#18)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#21)
#23Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Eisentraut (#21)
#24Magnus Hagander
magnus@hagander.net
In reply to: Tomas Vondra (#23)
#25Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#22)
#26Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#24)
#27Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tomas Vondra (#23)
#28Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Tomas Vondra (#25)
#29Magnus Hagander
magnus@hagander.net
In reply to: Jaime Casanova (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#16)
#31Peter Eisentraut
peter_e@gmx.net
In reply to: Tomas Vondra (#23)
#32Peter Eisentraut
peter_e@gmx.net
In reply to: Tomas Vondra (#26)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#31)
#34Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Robert Haas (#30)
#35Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Eisentraut (#32)
#36Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#33)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#36)
#38Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#37)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#38)
#40Greg Smith
gsmith@gregsmith.com
In reply to: Tomas Vondra (#25)
#41Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#39)
#42Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Greg Smith (#40)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#42)
#44Greg Smith
gsmith@gregsmith.com
In reply to: Tomas Vondra (#42)
#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#44)
#46Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#45)