pg_stat_get_last_vacuum_time(): why non-FULL?

Started by CR Lenderabout 13 years ago14 messagesgeneral
Jump to latest
#1CR Lender
crlender@gmail.com

According to the manual (9.1), pg_stat_get_last_vacuum_time() returns

timestamptz | Time of the last non-FULL vacuum initiated by the
| user on this table

Why are full vacuums excluded from this statistic? It looks like there's
no way to get the date of the last manual vacuum, if only full vacuums
are performed.

regards,
crl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: CR Lender (#1)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

CR Lender <crlender@gmail.com> wrote:

According to the manual (9.1), pg_stat_get_last_vacuum_time() returns

     timestamptz | Time of the last non-FULL vacuum initiated by the
                 | user on this table

Why are full vacuums excluded from this statistic? It looks like there's
no way to get the date of the last manual vacuum, if only full vacuums
are performed.

Because FULL is a bit of a misnomer -- there are important things a
non-FULL vacuum does which a FULL vacuum does not.  In general, a
VACUUM FULL should be followed by a non-FULL vacuum to keep the
database in good shape.  Also, a VACUUM FULL is an extreme form of
maintenance which should rarely be needed; if you find that you
need to run VACUUM FULL, something is probably being done wrong
which should be fixed so that you don't need to continue to do such
extreme maintenance.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3CR Lender
crlender@gmail.com
In reply to: Kevin Grittner (#2)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On 2013-03-26 19:28, Kevin Grittner wrote:

Why are full vacuums excluded from this statistic? It looks like there's
no way to get the date of the last manual vacuum, if only full vacuums
are performed.

Because FULL is a bit of a misnomer -- there are important things a
non-FULL vacuum does which a FULL vacuum does not. In general, a
VACUUM FULL should be followed by a non-FULL vacuum to keep the
database in good shape.

Thank you, that's very helpful. I wasn't aware of that.

Also, a VACUUM FULL is an extreme form of
maintenance which should rarely be needed; if you find that you
need to run VACUUM FULL, something is probably being done wrong
which should be fixed so that you don't need to continue to do such
extreme maintenance.

In this case I was only trying to make sense of an existing database
(8.3). The statistics in pg_stats were way off for some tables, so I
wanted to see if (auto)vacuum and (auto)analyze were being run.
pg_stat_all_tables() showed last_autoanalyze at >400 days for some of
the larger tables. There used to be a weekly cron job with VACUUM FULL
ANALYZE, and I was trying to find out if that cron job was still active.

Thanks,
crl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Martín Marqués
martin.marques@gmail.com
In reply to: CR Lender (#3)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

2013/3/27 CR Lender <crlender@gmail.com>:

Also, a VACUUM FULL is an extreme form of
maintenance which should rarely be needed; if you find that you
need to run VACUUM FULL, something is probably being done wrong
which should be fixed so that you don't need to continue to do such
extreme maintenance.

In this case I was only trying to make sense of an existing database
(8.3). The statistics in pg_stats were way off for some tables, so I
wanted to see if (auto)vacuum and (auto)analyze were being run.
pg_stat_all_tables() showed last_autoanalyze at >400 days for some of
the larger tables. There used to be a weekly cron job with VACUUM FULL
ANALYZE, and I was trying to find out if that cron job was still active.

What's your autovacuum configuration? autovacuum_vacuum_threshold?
autovacuum_analyze_threshold? autovacuum_vacuum_scale_factor?
autovacuum_analyze_scale_factor?

Related to your 400+ days not vacuumed tables, are you sure those
tables have data changes (INSERT/UPDATE/DELETE)? I have some static
tables with over a year of no vacuum (and autovacuum field never ran
on that relation).

What does n_dead_tup show?

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5CR Lender
crlender@gmail.com
In reply to: Martín Marqués (#4)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On 2013-03-28 13:11, Martín Marqués wrote:

2013/3/27 CR Lender <crlender@gmail.com>:

In this case I was only trying to make sense of an existing database
(8.3). The statistics in pg_stats were way off for some tables, so I
wanted to see if (auto)vacuum and (auto)analyze were being run.
pg_stat_all_tables() showed last_autoanalyze at >400 days for some of
the larger tables. There used to be a weekly cron job with VACUUM FULL
ANALYZE, and I was trying to find out if that cron job was still active.

What's your autovacuum configuration? autovacuum_vacuum_threshold?
autovacuum_analyze_threshold? autovacuum_vacuum_scale_factor?
autovacuum_analyze_scale_factor?

autovacuum | on
autovacuum_analyze_scale_factor | 0.1
autovacuum_analyze_threshold | 50
autovacuum_freeze_max_age | 200000000
autovacuum_max_workers | 3
autovacuum_naptime | 1min
autovacuum_vacuum_cost_delay | 20ms
autovacuum_vacuum_cost_limit | -1
autovacuum_vacuum_scale_factor | 0.2
autovacuum_vacuum_threshold | 50

The database is running on PostgreSQL 8.3.6.

I don't maintain this server, and my knowledge about the autovacuum
feature and its settings are sketchy. The values above could be the
defaults for 8.3.6, or they may have been adjusted by the admin.

Related to your 400+ days not vacuumed tables, are you sure those
tables have data changes (INSERT/UPDATE/DELETE)? I have some static
tables with over a year of no vacuum (and autovacuum field never ran
on that relation).

Yes. Autovacuum and autoanalyze are active, and tables with frequent
DELETEs and UPDATEs appear to be vacuumed and analyzed at least once a
day. Other tables with relatively frequent INSERTs, but irregular
UPDATEs and rare DELETEs go without vacuum/analyze for long periods of
time. Static tables never get analyzed or vacuumed (as expected).

What does n_dead_tup show?

Here are the statistics for three exemplary tables:

relname: | r____ | oe____ | mv____
| | |
n_tup_ins | 35335 | 179507 | 9562
n_tup_upd | 46727 | 824898 | 0
n_tup_del | 0 | 9709 | 3567
n_tup_hot_upd | 2016 | 793169 | 0
n_live_tup | 206086 | 1132164 | 57964
n_dead_tup | 35583 | 46932 | 5436
last_autovacuum | 2011-05-25 | NULL | NULL
last_autoanalyze | 2013-01-07 | 2012-12-27 | 2012-04-16

I'm not saying that autovacuum/autoanalyze aren't working as designed, I
was just surprised by the long delays.

Concerning the earlier reply to my question...

On 2013-03-26 19:28, Kevin Grittner wrote:

Because FULL is a bit of a misnomer -- there are important things a
non-FULL vacuum does which a FULL vacuum does not. In general, a
VACUUM FULL should be followed by a non-FULL vacuum to keep the
database in good shape.

I've read the manual more carefully now, and I can't see any mention of
what VACUUM does that VACUUM FULL does not. The point about extreme
maintainance is taken, but from what I read, VACUUM FULL should include
everything a normal VACUUM does.

Thanks,
crl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: CR Lender (#5)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

CR Lender <crlender@gmail.com> wrote:

The database is running on PostgreSQL 8.3.6.

I've read the manual more carefully now, and I can't see any mention of
what VACUUM does that VACUUM FULL does not. The point about extreme
maintainance is taken, but from what I read, VACUUM FULL should include
everything a normal VACUUM does.

Prior to release 9.0 that is probably true.  Sorry for not asking
about the version first.  But you should read this page:

http://www.postgresql.org/support/versioning/

8.3 is out of support now.  Even for the 8.3 release, 8.3.6 is
missing over four years of fixes for bugs and security
vulnerabilities.  There is a very good chance that any problem you
see already fixed and you are just choosing to run without the fix.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Martín Marqués
martin.marques@gmail.com
In reply to: CR Lender (#5)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

2013/3/28 CR Lender <crlender@gmail.com>:

On 2013-03-28 13:11, Martín Marqués wrote:

2013/3/27 CR Lender <crlender@gmail.com>:

In this case I was only trying to make sense of an existing database
(8.3). The statistics in pg_stats were way off for some tables, so I
wanted to see if (auto)vacuum and (auto)analyze were being run.
pg_stat_all_tables() showed last_autoanalyze at >400 days for some of
the larger tables. There used to be a weekly cron job with VACUUM FULL
ANALYZE, and I was trying to find out if that cron job was still active.

What's your autovacuum configuration? autovacuum_vacuum_threshold?
autovacuum_analyze_threshold? autovacuum_vacuum_scale_factor?
autovacuum_analyze_scale_factor?

autovacuum | on
autovacuum_analyze_scale_factor | 0.1
autovacuum_analyze_threshold | 50
autovacuum_freeze_max_age | 200000000
autovacuum_max_workers | 3
autovacuum_naptime | 1min
autovacuum_vacuum_cost_delay | 20ms
autovacuum_vacuum_cost_limit | -1
autovacuum_vacuum_scale_factor | 0.2
autovacuum_vacuum_threshold | 50

The database is running on PostgreSQL 8.3.6.

What you have to look at is autovacuum_analyze_scale_factor which in
your case is 10% of the total tuples of the relation. So when 50
tuples (the threshold) over the 10% of the total tuples are
analyzable, autovacuum analyzes the relation.

Same thing with vacuum, but with 20% in that case.

If you want autovacuum to really clean and analize more, you will have
to lower autovacuum_analyze_scale_factor and
autovacuum_vacuum_scale_factor.

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8CR Lender
crlender@gmail.com
In reply to: Kevin Grittner (#6)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On 2013-03-28 20:44, Kevin Grittner wrote:

CR Lender <crlender@gmail.com> wrote:

The database is running on PostgreSQL 8.3.6.

I've read the manual more carefully now, and I can't see any mention of
what VACUUM does that VACUUM FULL does not. The point about extreme
maintainance is taken, but from what I read, VACUUM FULL should include
everything a normal VACUUM does.

Prior to release 9.0 that is probably true.

Hm, I can't find it, even in the manual for 9.2.
http://www.postgresql.org/docs/current/static/sql-vacuum.html

If VACUUM FULL is just a more aggressive VACCUM (including writing new
data files), then I don't understand the "non-FULL" restriction in
pg_stat_get_last_vacuum_time()... unless that information is somehow
lost when table files are rewritten.

8.3 is out of support now. Even for the 8.3 release, 8.3.6 is
missing over four years of fixes for bugs and security
vulnerabilities. There is a very good chance that any problem you
see already fixed and you are just choosing to run without the fix.

You're right of course, the PostgreSQL version on the server is rather
old. We're redesigning the whole application, and migrating to 9.2 will
be part of the process (I'm running 9.1 locally).

Thanks,
crl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9CR Lender
crlender@gmail.com
In reply to: CR Lender (#8)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On 2013-03-31 18:31, CR Lender wrote:

On 2013-03-28 20:44, Kevin Grittner wrote:

CR Lender <crlender@gmail.com> wrote:

I've read the manual more carefully now, and I can't see any mention of
what VACUUM does that VACUUM FULL does not. The point about extreme
maintainance is taken, but from what I read, VACUUM FULL should include
everything a normal VACUUM does.

Prior to release 9.0 that is probably true.

Hm, I can't find it, even in the manual for 9.2.
http://www.postgresql.org/docs/current/static/sql-vacuum.html

If VACUUM FULL is just a more aggressive VACCUM (including writing new
data files), then I don't understand the "non-FULL" restriction in
pg_stat_get_last_vacuum_time()... unless that information is somehow
lost when table files are rewritten.

I don't mean to be pushy, but I have a meeting with the admin of that
database tomorrow, and it would be nice if I had something concrete to
tell him. I still don't know what it is that VACCUM does but VACUUM full
doesn't do. There's nothing in the manual about that.

Thanks,
crl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Amit Kapila
amit.kapila16@gmail.com
In reply to: CR Lender (#9)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On Monday, April 08, 2013 4:40 AM CR Lender wrote:

On 2013-03-31 18:31, CR Lender wrote:

On 2013-03-28 20:44, Kevin Grittner wrote:

CR Lender <crlender@gmail.com> wrote:

I've read the manual more carefully now, and I can't see any

mention of

what VACUUM does that VACUUM FULL does not. The point about extreme
maintainance is taken, but from what I read, VACUUM FULL should

include

everything a normal VACUUM does.

Prior to release 9.0 that is probably true.

Hm, I can't find it, even in the manual for 9.2.
http://www.postgresql.org/docs/current/static/sql-vacuum.html

If VACUUM FULL is just a more aggressive VACCUM (including writing

new

data files), then I don't understand the "non-FULL" restriction in
pg_stat_get_last_vacuum_time()... unless that information is somehow
lost when table files are rewritten.

I don't mean to be pushy, but I have a meeting with the admin of that
database tomorrow, and it would be nice if I had something concrete to
tell him. I still don't know what it is that VACCUM does but VACUUM
full
doesn't do. There's nothing in the manual about that.

One of the important difference is that during the time VACUUM FULL is operating on a relation,
no other operations will be allowed on that relation. Most of admin care about this point, because
they don't want to stop operations for background garbage collect.
VACUUM FULL is only done in rare cases when the relation size has grown too bigger than it's actual
Contents.

With Regards,
Amit Kapila.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Jeff Janes
jeff.janes@gmail.com
In reply to: CR Lender (#1)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On Sun, Apr 7, 2013 at 8:55 PM, Amit Kapila <amit.kapila@huawei.com> wrote:

One of the important difference is that during the time VACUUM FULL is
operating on a relation,
no other operations will be allowed on that relation. Most of admin care
about this point, because
they don't want to stop operations for background garbage collect.

While that is true, it is not a reason not to update
pg_stat_get_last_vacuum_time.

I'm having a hard time coming up with a reason not to update
pg_stat_get_last_vacuum_time with a full vacuum.

On version 8.4 and below, you could justify it by saying that VACUUM FULL
bloated the indexes and then left them that way, and so we shouldn't update
the time field. But that is no longer the case. And even then, doing a
ordinary vacuum afterwards isn't going to fix the index bloat, so even that
argument is a bit sketchy.

Cheers,

Jeff

#12Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Jeff Janes (#11)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

Jeff Janes <jeff.janes@gmail.com> wrote:

Amit Kapila <amit.kapila@huawei.com> wrote:

One of the important difference is that during the time VACUUM
FULL is operating on a relation, no other operations will be
allowed on that relation. Most of admin care about this point,
because they don't want to stop operations for background
garbage collect.

While that is true, it is not a reason not to update
pg_stat_get_last_vacuum_time.  I'm having a hard time coming up
with a reason not to update pg_stat_get_last_vacuum_time with a
full vacuum.

On version 8.4 and below, you could justify it by saying that
VACUUM FULL bloated the indexes and then left them that way, and
so we shouldn't update the time field.  But that is no longer the
case.  And even then, doing a ordinary vacuum afterwards isn't
going to fix the index bloat, so even that argument is a bit
sketchy.

I'm not sure that what we're doing now is correct, but updating
things as if a normal vacuum had been done would *not* be the thing
to do.  For starters, VACUUM FULL blows away the free space map and
visibility map for a table.  Among other things, that means that
index-only scans will cease to work until the table has a normal
vacuum.  A normal vacuum (or autovacuum) will restore those, so a
VACUUM FULL should probably set things up to show that the table is
in need of a vacuum soon.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Jeff Janes
jeff.janes@gmail.com
In reply to: Kevin Grittner (#12)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On Monday, April 8, 2013, Kevin Grittner wrote:

Jeff Janes <jeff.janes@gmail.com <javascript:;>> wrote:

Amit Kapila <amit.kapila@huawei.com <javascript:;>> wrote:

One of the important difference is that during the time VACUUM
FULL is operating on a relation, no other operations will be
allowed on that relation. Most of admin care about this point,
because they don't want to stop operations for background
garbage collect.

While that is true, it is not a reason not to update
pg_stat_get_last_vacuum_time. I'm having a hard time coming up
with a reason not to update pg_stat_get_last_vacuum_time with a
full vacuum.

On version 8.4 and below, you could justify it by saying that
VACUUM FULL bloated the indexes and then left them that way, and
so we shouldn't update the time field. But that is no longer the
case. And even then, doing a ordinary vacuum afterwards isn't
going to fix the index bloat, so even that argument is a bit
sketchy.

I'm not sure that what we're doing now is correct, but updating
things as if a normal vacuum had been done would *not* be the thing
to do. For starters, VACUUM FULL blows away the free space map and
visibility map for a table.

Ah, OK, that is obvious in retrospect. I was wracking my brain for
stats-collector-aspects and completely forgot about those.

I don't know how hard (for the hackers) or extra work (for the server) it
would be to make VACUUM FULL reset those things to reasonable values. But
it should be fairly easy to at least document them.

One often uses VACUUM FULL when one is up to ones elbows in alligators, so
it is understandable that we would not want to impose another burden on the
server at that particular moment. So I'm leaning towards documenting the
issue. Or are they already, and I'm just missing it?

Cheers,

Jeff

#14CR Lender
crlender@gmail.com
In reply to: Kevin Grittner (#12)
Re: pg_stat_get_last_vacuum_time(): why non-FULL?

On 2013-04-09 00:09, Kevin Grittner wrote:

I'm not sure that what we're doing now is correct, but updating
things as if a normal vacuum had been done would *not* be the thing
to do. For starters, VACUUM FULL blows away the free space map and
visibility map for a table. Among other things, that means that
index-only scans will cease to work until the table has a normal
vacuum.

Ah, now it makes sense. Thank you, that's what I was looking for.

And I agree with Jeff that this could be documented in more detail.

Thanks,
crl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general