Displaying accumulated autovacuum cost

Started by Greg Smithover 14 years ago44 messageshackers
Jump to latest
#1Greg Smith
gsmith@gregsmith.com

Attached is a patch that tracks and displays the accumulated cost when
autovacuum is running. Code by Noah Misch and myself. I hope this idea
will bring a formal process to vacuum tuning, which is currently too
hard to do. I was about to add "without..." to that, but I then
realized it needs no disclaimer; it's just too hard, period. Vacuum
issues are enemy #1 at all the terabyte scale customer sites I've been
fighting with lately.

The patch updates the command string just before the workers sleep to
show how much work they've done so far. And at the end, it adds a few
new lines to the information written to the logs, when the autovacuum is
notable enough to be logged at all. The overhead it adds is at most a
few integer operations per buffer processed and a slower title string
update once per sleep. It's trivial compared to both the vacuum itself,
and to the instrumentation's value to sites with vacuum issues.

To demonstrate the patch in action, here's a test case using a 6.4GB
pgbench_accounts table:

$ createdb pgbench
$ pgbench -i -s 500 pgbench
$ psql -d pgbench -c "select pg_relation_size('pgbench_accounts');"
pg_relation_size
------------------
6714761216
$ psql -d pgbench -c "select relname,relpages from pg_class where
relname='pgbench_accounts';"
relname | relpages
------------------+----------
pgbench_accounts | 819673
$psql -d pgbench -c "delete from pgbench_accounts where aid<20000000"

You can see the new information in the command string with ps and grep:

$ while [ 1 ] ; do (ps -eaf | grep "[a]utovacuum worker" && sleep 60) ; done
gsmith 2687 17718 0 15:44 ? 00:00:00 postgres: autovacuum
worker process h=19 m=14196 d=14185
...
gsmith 2687 17718 0 15:44 ? 00:00:09 postgres: autovacuum
worker process h=182701 m=301515 d=321345
...
gsmith 2687 17718 1 15:44 ? 00:00:23 postgres: autovacuum
worker process h=740359 m=679987 d=617559
...

That's accumulated hit/miss/dirty counts, the raw numbers. When the
autovacuum is finished, those totals appear as a new line in the log entry:

LOG: automatic vacuum of table "pgbench.public.pgbench_accounts": index
scans: 1
pages: 0 removed, 819673 remain
tuples: 19999999 removed, 30000022 remain
buffer usage: 809537 hits, 749340 misses, 686660 dirtied
system usage: CPU 5.70s/19.73u sec elapsed 2211.60 sec

To check if this makes sense, we need the relevant parameters at the
time, which were the defaults (I only tweaked some basic config bits
here, including shared_buffers=400MB so a bit more was cached):

vacuum_cost_page_hit = 1 # 0-10000 credits
vacuum_cost_page_miss = 10 # 0-10000 credits
vacuum_cost_page_dirty = 20 # 0-10000 credits
vacuum_cost_limit = 200 # 1-10000 credits
autovacuum_vacuum_cost_delay = 20ms

Every 20ms equals 50 times/second. That means the cost accumulation
should be 200 * 50 = 10000 / second, or 600K/minute. That's how fast
the cost should be increasing here. Given a runtime of 2211.60 seconds,
that's a total estimated cost of 2209.15 * 10000 = 22,091,500. Now we
check that against the totals printed at the end of the vacuum:

1 * 809537 hits=809,537
10 * 749340 misses=7,493,400
20 * 686607 dirtied=13,732,140

And that gives a directly computed total of 22,035,077. Close enough to
show this is working as expected. And how I computed all that should
give you an idea how you might use these numbers to extract other useful
statistics, if you'd like to tune the balance of various cost_page_*
parameters as one example. I have no idea how anyone could ever set
those relative to one another without this data, it would take epic
guessing skills.

What else can do you do with this data?

-Figure out if the VACUUM is still making progress when it appears stuck
-Estimate how long it will take to finish, based on current progress and
whatever total cost was logged last time VACUUM ran against this relation.
-Compute approximate hit rate on the read side. OS caching issues and
the ring buffer are obviously a problem with that, this isn't too valuable.
-Can see the cost split when multiple vacuums are running. This problem
is why sites can't just use "total time to vacuum" as a useful proxy to
estimate how long one will take to run.
-Easy to track the read/write ratio
-Directly measure the write rate

That last one is I think the part people are most perplexed by right
now, and this makes it trivial. How do you turn all these cost figures
into real-world read/write rates? It's been hard to do.

Now, you can take a bunch of samples of the data at 1 minute intervals,
like my little "ps | grep" example above does. The delta in the
"dirty=" column is how much was written per minute, in units of 8K
(usually) buffers. Multiply that by 8192/(60*1024*1024), and you get
MB/s out of there. I collected that data for a cleanup run of the
pgbench_accounts damage done above, CSV file with all the statistics is
attached.

I also collected OS level stats from Linux about the actual read/write
rate of the process, converted into "Write Mbps" (those are actually in
MB/s, sloppy capitalization is via OpenOffice "autocorrect"). Those
numbers are close enough to make me confident the dirty buffer totals
tracked here do turn into useful MB/s values. Sample of the most
interesting part:

Cost Delta Dirty Mbps Write Mbps
589,890 2.56 2.73
591,151 2.57 2.73
589,035 2.56 2.72
593,775 3.14 0.20
599,420 2.05 0.00
598,503 2.05 0.00
599,421 2.05 0.00
574,046 0.60 0.01
574,779 0.64 0.67
609,140 2.56 2.68
612,397 2.57 2.69
611,744 2.57 2.69
610,008 2.56 2.68

This shows the expected 600K/minute cost accumulation. And using the
dirty= numbers to compute MB/s of write speed closely matches the total
write speed of this process. Some of the difference might be I/O to
other things besides the main table here, some of it is just because OS
write caching will influence the write rate. In the spots where the OS
value and what's derived from the dirty rate diverge most, it appears to
be because vacuum is filling Linux's write cache. Actual writes
accumulated against the process them block for a while. It's a small
difference most of the time.

I'd be willing to accept a "Dirty MB/s" figure computed this way as
accurate enough for most purposes. And this patch lets you get that
data, currently unavailable without poking into the OS statistics (if at
all), just by doing a little log file and/or command string scraping.
Total at the end or real-time monitoring, based on how much work you
want to put into it. For a busy site where one or more autovacuum
processes are running most of the time, being able to monitor the vacuum
portion of the I/O this way will be a huge improvement over the current
state of things. I already have a stack of tools built on top of this
data I'm working on, and they're making it much easier to come up with
an iterative tuning process for autovacuum.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

Attachments:

pgbench-vacuum-stats.csvtext/csv; name=pgbench-vacuum-stats.csvDownload
vacuum_stats_v2.patchtext/x-patch; name=vacuum_stats_v2.patchDownload+66-14
In reply to: Greg Smith (#1)
Re: Displaying accumulated autovacuum cost

Em 17-08-2011 18:04, Greg Smith escreveu:

Attached is a patch that tracks and displays the accumulated cost when
autovacuum is running. Code by Noah Misch and myself. I hope this idea
will bring a formal process to vacuum tuning, which is currently too
hard to do. I was about to add "without..." to that, but I then realized
it needs no disclaimer; it's just too hard, period. Vacuum issues are
enemy #1 at all the terabyte scale customer sites I've been fighting
with lately.

Interesting patch. I drafted a similar idea but didn't have a chance to
publish it. It is a complement to the idea about autovacuum tuning [1]http://archives.postgresql.org/pgsql-hackers/2011-06/msg00678.php. Hope I
will have time to post something for the next CF. And, of course, I will
review this patch.

The patch updates the command string just before the workers sleep to
show how much work they've done so far. And at the end, it adds a few
new lines to the information written to the logs, when the autovacuum is
notable enough to be logged at all. The overhead it adds is at most a
few integer operations per buffer processed and a slower title string
update once per sleep. It's trivial compared to both the vacuum itself,
and to the instrumentation's value to sites with vacuum issues.

I don't like exposing this information only on title processes. It would be
difficult for client apps (for example, PGAdmin) to track this kind of
information and it is restricted to local access. I'm not objecting to display
this information in process title; I'm just saying that that information
should be exposed in functions (say pg_stat_get_vacuum_[hit|miss|dirty]) too.
I'm not sure about adding this information to incremental counters but that
would be useful to trace a vacuum work pattern.

[1]: http://archives.postgresql.org/pgsql-hackers/2011-06/msg00678.php

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#3Greg Smith
gsmith@gregsmith.com
In reply to: Euler Taveira de Oliveira (#2)
Re: Displaying accumulated autovacuum cost

On 08/17/2011 07:42 PM, Euler Taveira de Oliveira wrote:

I don't like exposing this information only on title processes. It
would be difficult for client apps (for example, PGAdmin) to track
this kind of information and it is restricted to local access. I'm not
objecting to display this information in process title; I'm just
saying that that information should be exposed in functions (say
pg_stat_get_vacuum_[hit|miss|dirty]) too.

I tend to build the simplest possible thing that is useful enough to
work. The data is getting stored and shown now, where it wasn't
before. If it's possible to expose that in additional ways later too,
great. The big step up for this information is to go from
"unobtainable" to "obtainable". I'd prefer not to add a quest for
"easily obtainable" to the requirements until that big jump is made, for
fear it will cause nothing to get delivered.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#4Magnus Hagander
magnus@hagander.net
In reply to: Greg Smith (#3)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 03:23, Greg Smith <greg@2ndquadrant.com> wrote:

On 08/17/2011 07:42 PM, Euler Taveira de Oliveira wrote:

I don't like exposing this information only on title processes. It would
be difficult for client apps (for example, PGAdmin) to track this kind of
information and it is restricted to local access. I'm not objecting to
display this information in process title; I'm just saying that that
information should be exposed in  functions (say
pg_stat_get_vacuum_[hit|miss|dirty]) too.

I tend to build the simplest possible thing that is useful enough to work.
 The data is getting stored and shown now, where it wasn't before.  If it's
possible to expose that in additional ways later too, great.  The big step
up for this information is to go from "unobtainable" to "obtainable".  I'd
prefer not to add a quest for "easily obtainable" to the requirements until
that big jump is made, for fear it will cause nothing to get delivered.

By only putting it in the ps display, you exclude all the users who
don't have an easy way to look at that information. The big group
there is Windows, but it's not necessarily easy on all other platforms
as well, afaik. And possibliy even more importantly, it makes it
impossible to view it from tools like pgadmin. I think it's definitely
worthwhile to add support to view it through the stats collector as
well from the beginnig. The question there is if it's enough to just
show it in the current_query (kind of like it's done in the ps
output), or if we want a completely separate view with this info.

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#5Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#4)
Re: Displaying accumulated autovacuum cost

Magnus Hagander <magnus@hagander.net> writes:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

I was about to say that I would like to see it for normal queries too,
but I guess we already have it:

=> explain (analyze, buffers, costs off)
select * from pg_attribute a join pg_class c on a.attrelid = c.oid;
QUERY PLAN
--------------------------------------------------------------------------------
Hash Join (actual time=0.569..4.255 rows=2158 loops=1)
Hash Cond: (a.attrelid = c.oid)
Buffers: shared hit=48
-> Seq Scan on pg_attribute a (actual time=0.008..0.462 rows=2158 loops=1)
Buffers: shared hit=40
-> Hash (actual time=0.541..0.541 rows=282 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 54kB
Buffers: shared hit=8
-> Seq Scan on pg_class c (actual time=0.010..0.269 rows=282 loops=1)
Buffers: shared hit=8
Total runtime: 4.551 ms
(11 rows)

Also, from where I sit the ps title update for normal queries is about
useless, as I see loads of IDLE postgresql backends in top that are
consuming 20% and more CPU time. The refresh rate is way to low to be
useful, and having the title it updated more frequently would probably
consume enough CPU that it would defeat its purpose (going from
instrumenting to slowing down enough that you can see what's happening
is not where I'd want to go).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In reply to: Magnus Hagander (#4)
Re: Displaying accumulated autovacuum cost

Em 18-08-2011 03:39, Magnus Hagander escreveu:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

Yes, it would. AFAICS, the patch will display that message in process titles.
However, analyze code also uses the vacuum_delay_point(). How do you handle it?

It would be another patch... autovacuum has an option to display summarized
information but vacuum don't. Isn't it time to be symmetrical here?

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#4)
Re: Displaying accumulated autovacuum cost

Magnus Hagander <magnus@hagander.net> writes:

On Thu, Aug 18, 2011 at 03:23, Greg Smith <greg@2ndquadrant.com> wrote:

On 08/17/2011 07:42 PM, Euler Taveira de Oliveira wrote:

I don't like exposing this information only on title processes.

I tend to build the simplest possible thing that is useful enough to work.

By only putting it in the ps display, you exclude all the users who
don't have an easy way to look at that information. The big group
there is Windows, but it's not necessarily easy on all other platforms
as well, afaik.

Yeah. Also, process title updates are friggin expensive on some
platforms --- so much so, that we have a GUC setting to disable them.
So I don't think we should use that technique at all. Put the info
into the stats collector instead (not "also").

regards, tom lane

#8Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#3)
Re: Displaying accumulated autovacuum cost

On Wed, Aug 17, 2011 at 9:23 PM, Greg Smith <greg@2ndquadrant.com> wrote:

On 08/17/2011 07:42 PM, Euler Taveira de Oliveira wrote:

I don't like exposing this information only on title processes. It would
be difficult for client apps (for example, PGAdmin) to track this kind of
information and it is restricted to local access. I'm not objecting to
display this information in process title; I'm just saying that that
information should be exposed in  functions (say
pg_stat_get_vacuum_[hit|miss|dirty]) too.

I tend to build the simplest possible thing that is useful enough to work.
 The data is getting stored and shown now, where it wasn't before.  If it's
possible to expose that in additional ways later too, great.  The big step
up for this information is to go from "unobtainable" to "obtainable".  I'd
prefer not to add a quest for "easily obtainable" to the requirements until
that big jump is made, for fear it will cause nothing to get delivered.

Perhaps a reasonable way to break up the patch would be:

- Part 1: Gather the information and display it in the
log_autovacuum_min_duration output.
- Part 2: Add the ability to see the information incrementally (via
some mechanism yet to be agreed upon).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Magnus Hagander (#4)
Re: Displaying accumulated autovacuum cost

On tor, 2011-08-18 at 08:39 +0200, Magnus Hagander wrote:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

Last year we were discussing some details on progress reporting, and
some people suggested that instead of printing a single percentage, we
should let each type of activity print out whatever metrics it has that
would allow an experienced DBA to track the progress. Effectively, this
is what this patch is trying to do.

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

#10Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#9)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 10:54 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On tor, 2011-08-18 at 08:39 +0200, Magnus Hagander wrote:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

Last year we were discussing some details on progress reporting, and
some people suggested that instead of printing a single percentage, we
should let each type of activity print out whatever metrics it has that
would allow an experienced DBA to track the progress.  Effectively, this
is what this patch is trying to do.

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

That might be a good way to go. I don't think we want something like
pg_stat_all_tables for this, because it seems that Greg's use case is
to be able to see how a *particular* autovacuum process is doing
*while it's running*, not to look at aggregate statistics over time.
Putting it in pg_stat_activity would be good for that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#10)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 17:13, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Aug 18, 2011 at 10:54 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On tor, 2011-08-18 at 08:39 +0200, Magnus Hagander wrote:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

Last year we were discussing some details on progress reporting, and
some people suggested that instead of printing a single percentage, we
should let each type of activity print out whatever metrics it has that
would allow an experienced DBA to track the progress.  Effectively, this
is what this patch is trying to do.

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

That might be a good way to go.  I don't think we want something like
pg_stat_all_tables for this, because it seems that Greg's use case is
to be able to see how a *particular* autovacuum process is doing
*while it's running*, not to look at aggregate statistics over time.
Putting it in pg_stat_activity would be good for that.

It's also good to have it broken down into multiple columns, and not
just a freetext column in the view - if tools should be able to parse
it as well.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#12Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#11)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 11:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Thu, Aug 18, 2011 at 17:13, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Aug 18, 2011 at 10:54 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On tor, 2011-08-18 at 08:39 +0200, Magnus Hagander wrote:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

Last year we were discussing some details on progress reporting, and
some people suggested that instead of printing a single percentage, we
should let each type of activity print out whatever metrics it has that
would allow an experienced DBA to track the progress.  Effectively, this
is what this patch is trying to do.

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

That might be a good way to go.  I don't think we want something like
pg_stat_all_tables for this, because it seems that Greg's use case is
to be able to see how a *particular* autovacuum process is doing
*while it's running*, not to look at aggregate statistics over time.
Putting it in pg_stat_activity would be good for that.

It's also good to have it broken down into multiple columns, and not
just a freetext column in the view - if tools should be able to parse
it as well.

True. We could have a separate system view that only shows the status
of currently-running vacuum proceses. That wouldn't bother me either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#12)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 17:23, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Aug 18, 2011 at 11:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Thu, Aug 18, 2011 at 17:13, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Aug 18, 2011 at 10:54 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On tor, 2011-08-18 at 08:39 +0200, Magnus Hagander wrote:

Also, unrelated to that, wouldn't this information be interesting for
non-autovacuum queries as well?

Last year we were discussing some details on progress reporting, and
some people suggested that instead of printing a single percentage, we
should let each type of activity print out whatever metrics it has that
would allow an experienced DBA to track the progress.  Effectively, this
is what this patch is trying to do.

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

That might be a good way to go.  I don't think we want something like
pg_stat_all_tables for this, because it seems that Greg's use case is
to be able to see how a *particular* autovacuum process is doing
*while it's running*, not to look at aggregate statistics over time.
Putting it in pg_stat_activity would be good for that.

It's also good to have it broken down into multiple columns, and not
just a freetext column in the view - if tools should be able to parse
it as well.

True.  We could have a separate system view that only shows the status
of currently-running vacuum proceses.  That wouldn't bother me either.

That's what I'd like to have. We could also have aggregate counters on
the table/database level of course.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#14Greg Smith
gsmith@gregsmith.com
In reply to: Robert Haas (#8)
Re: Displaying accumulated autovacuum cost

On 08/18/2011 10:12 AM, Robert Haas wrote:

Perhaps a reasonable way to break up the patch would be:
- Part 1: Gather the information and display it in the
log_autovacuum_min_duration output.
- Part 2: Add the ability to see the information incrementally (via
some mechanism yet to be agreed upon).

My reaction to all the suggestions for redesign is just that: pull out
the part that does the incremental updates altogether, improve the part
that dumps the info into the logs, and resubmit without any incremental
progress for now. This is much more valuable to me if the first commit
that hits is something I can backport trivially. I'm seeing enough
production servers running into this problem right now on earlier
versions to be worried about that, and the log dump at the end would be
a huge help even if that was all they got. I'm going to add directly
computing the write MB/s figure from the dirty data written too, since
that ends up being the thing that I keep deriving by hand anyway.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#15Greg Smith
gsmith@gregsmith.com
In reply to: Peter Eisentraut (#9)
Re: Displaying accumulated autovacuum cost

On 08/18/2011 10:54 AM, Peter Eisentraut wrote:

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

Adding a field here (I'd go for the simpler "progress") and updating it
regularly would be a reasonable way to go here. This data doesn't
really need to go into the traditional statistics infrastructure to be
useful. I didn't start there because I was already getting pushback on
overloading the stats collector with constantly updated metrics last
time I did something in this area. I wasn't going to try and argue why
it was worth it in this case, just like I'm not going to argue about the
complaint over the command string overhead being too high--just going to
not do that instead. If the bikeshed I built doesn't look fancy enough
to hold the bike I put in there, I'm not going to build a better one
right now--I'll just put a cheaper bike in there instead.

I was hoping to eventually take the useful summary bits at the end, the
totals, and save those into statistics somewhere each time a VACUUM of
either sort finishes. It would fit with the information shown in
pg_stat_tables, but that's obviously getting too wide. Breaking out a
pg_stat_autovacuum view that contains all the relevant bits currently
shown in that view, plus these 3 new fields, would be a reasonable start.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#16Magnus Hagander
magnus@hagander.net
In reply to: Greg Smith (#15)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 17:54, Greg Smith <greg@2ndquadrant.com> wrote:

On 08/18/2011 10:54 AM, Peter Eisentraut wrote:

So how about adding a column to pg_stat_activity, progress_metrics or
something like that, and add that information there.

Adding a field here (I'd go for the simpler "progress") and updating it
regularly would be a reasonable way to go here.  This data doesn't really
need to go into the traditional statistics infrastructure to be useful.  I
didn't start there because I was already getting pushback on overloading the
stats collector with constantly updated metrics last time I did something in
this area.  I wasn't going to try and argue why it was worth it in this
case, just like I'm not going to argue about the complaint over the command
string overhead being too high--just going to not do that instead.  If the
bikeshed I built doesn't look fancy enough to hold the bike I put in there,
I'm not going to build a better one right now--I'll just put a cheaper bike
in there instead.

The "current values per-backend" thing can go in shared memory. The
reason the per table ones can't is obviously that they go away when
the backend disconnects..

I was hoping to eventually take the useful summary bits at the end, the
totals, and save those into statistics somewhere each time a VACUUM of
either sort finishes.  It would fit with the information shown in
pg_stat_tables, but that's obviously getting too wide.  Breaking out a
pg_stat_autovacuum view that contains all the relevant bits currently shown
in that view, plus these 3 new fields, would be a reasonable start.

That depends on what you mea nby too wide. If it's intended to be
consumed with "SELECT *" or not...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#17Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#14)
Re: Displaying accumulated autovacuum cost

On Thu, Aug 18, 2011 at 11:41 AM, Greg Smith <greg@2ndquadrant.com> wrote:

On 08/18/2011 10:12 AM, Robert Haas wrote:

Perhaps a reasonable way to break up the patch would be:
- Part 1: Gather the information and display it in the
log_autovacuum_min_duration output.
- Part 2: Add the ability to see the information incrementally (via
some mechanism yet to be agreed upon).

My reaction to all the suggestions for redesign is just that: pull out the
part that does the incremental updates altogether, improve the part that
dumps the info into the logs, and resubmit without any incremental progress
for now.  This is much more valuable to me if the first commit that hits is
something I can backport trivially.  I'm seeing enough production servers
running into this problem right now on earlier versions to be worried about
that, and the log dump at the end would be a huge help even if that was all
they got.  I'm going to add directly computing the write MB/s figure from
the dirty data written too, since that ends up being the thing that I keep
deriving by hand anyway.

By the way, since I forgot to say it earlier, I think it's great that
you are working on some of this instrumentation stuff, so +1 for the
basic concept here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In reply to: Greg Smith (#15)
Re: Displaying accumulated autovacuum cost

Em 18-08-2011 12:54, Greg Smith escreveu:

I was hoping to eventually take the useful summary bits at the end, the
totals, and save those into statistics somewhere each time a VACUUM of
either sort finishes. It would fit with the information shown in
pg_stat_tables, but that's obviously getting too wide. Breaking out a
pg_stat_autovacuum view that contains all the relevant bits currently
shown in that view, plus these 3 new fields, would be a reasonable start.

IMHO the useful summary bits belongs to log. If you want to add it to stats
collector go for it. But if you go to the latter road, it is recommended to
move some fields (time-related fields) from pg_stat_*_tables to this new view
(pg_stat_maintenance?). I don't know how generic you want to go but have in
mind I would like to cover automatic and manual maintenance commands.

Besides that another view will cover the maintenance activity. This new view
could contain at least datname, schemaname, relname, command_start,
command_schedule, operation, progress (?), procpid, and current_command. The
name has to be generic to cover all maintenance commands (perhaps
pg_maintenance_activity).

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#19Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Greg Smith (#14)
Re: Displaying accumulated autovacuum cost

On Aug 18, 2011, at 10:41 AM, Greg Smith wrote:

that was all they got. I'm going to add directly computing the write MB/s figure from the dirty data written too, since that ends up being the thing that I keep deriving by hand anyway.

I know folks have talked about progress, but I haven't seen anything specific... could you add info about what table/index vacuum is working on, and how far along it is? I realize that's not very close to an actual % completion, but it's far better than what we have right now.

FWIW, the number I end up caring about isn't so much write traffic as read. Thanks to a good amount of battery-backed write cache (and possibly some iSCSI misconfiguration), our writes are generally much cheaper than our reads.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#20Greg Smith
gsmith@gregsmith.com
In reply to: Jim Nasby (#19)
Re: Displaying accumulated autovacuum cost

On 08/22/2011 05:54 PM, Jim Nasby wrote:

I know folks have talked about progress, but I haven't seen anything
specific... could you add info about what table/index vacuum is
working on, and how far along it is? I realize that's not very close
to an actual % completion, but it's far better than what we have right
now.

Due to complaints about the mechanism the first version used to inform
the user of the progress, I'm yanking that from the next patch
altogether. The goal for now is to get a good report into the logs, and
then maybe that gets extended later with a progress report. (All of the
proposed alternate mechanisms are way more complicated than anything I
have time to do right now)

FWIW, the number I end up caring about isn't so much write traffic as read. Thanks to a good amount of battery-backed write cache (and possibly some iSCSI misconfiguration), our writes are generally much cheaper than our reads.

VACUUM can't really know its true read rate from what's inside the
database. I can add a summary of the accumulated read amounts into the
logs, in more useful figures than what is provided so far, which is
better than nothing. But those will be kind of deceptive, which is one
reason I wasn't so focused on them yet. If the relation is largely in
the OS cache, but not the PostgreSQL one, the summary can show a read
rate even when that isn't actually doing any reads at all. That was
exactly the case in the sample data I posted. VACUUM thought it was
reading anywhere from 2.5 to 6MB/s. But at the OS level, it was
actually reading zero bytes, since the whole thing was in cache already.

What you actually want is a count of the accumulated read counters at
the OS level. I've recently figured out how to track those, too, but
that code is something that lives outside the database. If this is
something useful to you, I think you're about to sign up to be my next
beta tester for that program.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#21Greg Smith
gsmith@gregsmith.com
In reply to: Greg Smith (#20)
#22Greg Smith
gsmith@gregsmith.com
In reply to: Greg Smith (#21)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#22)
#24Greg Smith
gsmith@gregsmith.com
In reply to: Robert Haas (#23)
#25Shigeru Hanada
shigeru.hanada@gmail.com
In reply to: Greg Smith (#22)
#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Shigeru Hanada (#25)
#27Greg Smith
gsmith@gregsmith.com
In reply to: Shigeru Hanada (#25)
#28Greg Smith
gsmith@gregsmith.com
In reply to: Alvaro Herrera (#26)
#29Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Greg Smith (#28)
#30Greg Smith
gsmith@gregsmith.com
In reply to: Greg Smith (#28)
#31Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Greg Smith (#30)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#31)
#33Greg Smith
gsmith@gregsmith.com
In reply to: Alvaro Herrera (#31)
#34Greg Smith
gsmith@gregsmith.com
In reply to: Robert Haas (#32)
#35Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#32)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#35)
#37Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#36)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#37)
#39Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#38)
#40Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Fujii Masao (#39)
#41Greg Smith
gsmith@gregsmith.com
In reply to: Dimitri Fontaine (#40)
#42Greg Smith
gsmith@gregsmith.com
In reply to: Robert Haas (#36)
#43Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Greg Smith (#41)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#42)