Vacuum, analyze, and setting reltuples of pg_class

Started by Greg Sabino Mullane · about 19 years ago · 7 messages
#1 Greg Sabino Mullane
greg@turnstep.com

Short version: is it optimal for vacuum to always populate reltuples
with live rows + dead rows?

I came across a problem in which a vacuum did not change the reltuples
value as I expected. A vacuum analyze reported a reasonable estimated
number of rows, but the number it put into reltuples was quite different.
Running analyze by itself did produce a more accurate number. After
some IRC talk and digging through the code, it appears that because the
system is busy, the dead rows could not be removed at that time, and
vacuum (and vacuum analyze, and vacuum full analyze) uses the number of
live rows + dead rows to populate reltuples. Are there any alternatives to
running analyze outside of vacuum every time to ensure a better count? Are
there serious drawbacks to vacuum counting only live rows rather than
live + dead? Is there any way to encourage those dead rows to go away, or
to figure out what is preventing them from being reaped? This is
cluster-wide, and happens even on newly created tables, but here is a
real-life example on a busy table:

greg=# select reltuples, relpages from pg_class where relname = 'q';
 reltuples | relpages
-----------+----------
       970 |     5724

greg=# select count(*) from q;
 count
-------
   979

greg=# vacuum q;
VACUUM

greg=# select reltuples, relpages from pg_class where relname = 'q';
 reltuples | relpages
-----------+----------
      2100 |     5724

greg=# vacuum full analyze q;
VACUUM

greg=# select reltuples, relpages from pg_class where relname = 'q';
 reltuples | relpages
-----------+----------
      2116 |     5724

greg=# analyze q;
ANALYZE

greg=# select reltuples, relpages from pg_class where relname = 'q';
 reltuples | relpages
-----------+----------
       897 |     5724

We've got much bigger tables that are affected worse than the example
above, of course. I'm pretty sure this is what Jeff Boes was experiencing
in 7.2, from this old thread:

http://svr5.postgresql.org/pgsql-bugs/2002-10/msg00138.php

I presume the problem could not be duplicated there because Tom's database
was not so busy as to have dead rows lying around at the end of the vacuum
runs.
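
A minimal sketch of checking this with vacuum verbose, which reports how
many dead rows it found but could not yet remove (output abbreviated; the
numbers are illustrative, not taken from the table above):

greg=# vacuum verbose q;
INFO:  vacuuming "public.q"
INFO:  "q": found 0 removable, 2100 nonremovable row versions in 5724 pages
DETAIL:  1121 dead row versions cannot be removed yet.
...
VACUUM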

--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200612111128
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Sabino Mullane (#1)
Re: Vacuum, analyze, and setting reltuples of pg_class

"Greg Sabino Mullane" <greg@turnstep.com> writes:

Short version: is it optimal for vacuum to always populate reltuples
with live rows + dead rows?

If we didn't do that, it would tend to encourage the use of seqscans on
tables with lots of dead rows, which is probably a bad thing.

Is there any way to encourage those dead rows to go away,

Close your open transactions.

regards, tom lane
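
A minimal sketch of hunting for those open transactions via
pg_stat_activity; the columns used here are from modern releases (8.1's
view is much thinner, where one would instead look for current_query
showing '<IDLE> in transaction', with stats_command_string enabled):

select pid, datname, usename, state, now() - xact_start as xact_age
from pg_stat_activity
where xact_start is not null
order by xact_start
limit 10;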

#3 Greg Sabino Mullane
greg@turnstep.com
In reply to: Tom Lane (#2)
Re: Vacuum, analyze, and setting reltuples of pg_class

Tom Lane replied:

>> Short version: is it optimal for vacuum to always populate reltuples
>> with live rows + dead rows?

> If we didn't do that, it would tend to encourage the use of seqscans on
> tables with lots of dead rows, which is probably a bad thing.

Bleh. Isn't that what a plain analyze would encourage then? Should analyze
be considering the dead rows somehow as well?

>> Is there any way to encourage those dead rows to go away,

> Close your open transactions.

There are no long-running transactions, but it is a very busy database,
so the chances of something else on the cluster being in a transaction at
any point in time are very high. Still, why would an open transaction elsewhere
block other databases / other tables for a vacuum full? E.g.:

prod=# create database gtest;
CREATE DATABASE

prod=# \c gtest
You are now connected to database "gtest".

gtest=# create table gtest(a int);
CREATE TABLE

gtest=# insert into gtest select 1 from generate_series(1,10);
INSERT 0 10

gtest=# delete from gtest;
DELETE 10

gtest=# vacuum full gtest;
VACUUM

gtest=# analyze verbose gtest;
INFO: analyzing "public.gtest"
INFO: "gtest": scanned 1 of 1 pages, containing 0 live rows and
10 dead rows; 0 rows in sample, 0 estimated total rows
ANALYZE

This is 8.1.3, by the way. At the very least, I'll submit a doc patch at
the end of all this. :)

--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200612111226
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Sabino Mullane (#3)
Re: Vacuum, analyze, and setting reltuples of pg_class

"Greg Sabino Mullane" <greg@turnstep.com> writes:

> Bleh. Isn't that what a plain analyze would encourage then? Should analyze
> be considering the dead rows somehow as well?

Very possibly, at least for counting purposes (it mustn't try to analyze
the content of such rows, since they could be incompatible with the
table's current rowtype).

> Still, why would an open transaction elsewhere
> block other databases / other tables for a vacuum full?

The tracking of global xmin isn't specific enough to distinguish which
database a transaction is in. VACUUM does ignore the xmins of xacts in
other databases, but unfortunately the advertised xmin of another xact
in our *own* database will still include them.

There's been some discussion of advertising both a global and local xmin
in the PGPROC array, but this would impose extra complexity on every
single transaction start, and it's not clear that the benefit is worth that.

regards, tom lane
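
A sketch of how one could watch this in practice; it assumes a modern
pg_stat_activity with a backend_xmin column (which 8.1 does not have). The
oldest advertised xmin is what vacuum has to honour, even for tables the
holding transaction never touched:

select pid, datname, backend_xmin, state, now() - xact_start as xact_age
from pg_stat_activity
where backend_xmin is not null
order by age(backend_xmin) desc
limit 5;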

#5 Jim C. Nasby
jim@nasby.net
In reply to: Tom Lane (#2)
Re: Vacuum, analyze, and setting reltuples of pg_class

On Mon, Dec 11, 2006 at 12:08:30PM -0500, Tom Lane wrote:

"Greg Sabino Mullane" <greg@turnstep.com> writes:

Short version: is it optimal for vacuum to always populate reltuples
with live rows + dead rows?

If we didn't do that, it would tend to encourage the use of seqscans on
tables with lots of dead rows, which is probably a bad thing.

So then why does vacuum do that? ISTM that it makes more sense for it to
act the same as analyze and only count live rows.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim C. Nasby (#5)
Re: Vacuum, analyze, and setting reltuples of pg_class

"Jim C. Nasby" <jim@nasby.net> writes:

> On Mon, Dec 11, 2006 at 12:08:30PM -0500, Tom Lane wrote:

>> "Greg Sabino Mullane" <greg@turnstep.com> writes:

>>> Short version: is it optimal for vacuum to always populate reltuples
>>> with live rows + dead rows?

>> If we didn't do that, it would tend to encourage the use of seqscans on
>> tables with lots of dead rows, which is probably a bad thing.

> So then why does vacuum do that? ISTM that it makes more sense for it to
> act the same as analyze and only count live rows.

I think you misread what I said: it's better to have the larger
count in reltuples so that the planner won't try to use a seqscan when
there are, say, 3 live tuples and 100K dead ones. The real problem is
that analyze ought to act more like vacuum, but since it presently
ignores dead rows altogether, it fails to.

regards, tom lane
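
The scenario is easy to reproduce under the 8.1-era behaviour being
discussed here; a minimal sketch with a hypothetical table name, needing a
second session whose open transaction keeps the dead rows from being
reclaimed (later releases changed some of this accounting):

-- session 1: take a snapshot and then sit idle in the transaction
begin;
select 1;

-- session 2:
create table bloat_demo as select g as id from generate_series(1,100000) g;
delete from bloat_demo where id > 3;   -- 3 live rows, ~100K dead ones
vacuum bloat_demo;                     -- dead rows can't be removed; reltuples ends up near 100K
explain select * from bloat_demo;      -- row estimate still close to 100000
analyze bloat_demo;                    -- samples only live rows; reltuples drops to ~3
explain select * from bloat_demo;      -- estimate collapses to a handful of rows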

#7 Zeugswetter Andreas ADI SD
ZeugswetterA@spardat.at
In reply to: Tom Lane (#6)
Re: Vacuum, analyze, and setting reltuples of pg_class

>>>> Short version: is it optimal for vacuum to always populate reltuples
>>>> with live rows + dead rows?

>>> If we didn't do that, it would tend to encourage the use of seqscans on
>>> tables with lots of dead rows, which is probably a bad thing.

>> So then why does vacuum do that? ISTM that it makes more sense for it to
>> act the same as analyze and only count live rows.

> I think you misread what I said: it's better to have the larger
> count in reltuples so that the planner won't try to use a seqscan when
> there are, say, 3 live tuples and 100K dead ones.

I don't agree. The metric for avoiding scans should be (and is) table size,
i.e. the number of pages that need to be read for the expected number of rows.
The number of tuples is relevant for estimating the call frequency of
related nodes, so from that perspective we do not want dead tuples
in the count.
Maybe we need to improve the estimate in the large-table, few-live-tuples
case, but I think we should adjust vacuum and not analyze.

If you join the said table with 3 live rows to a table of the same size
but with lots of visible tuples, you would want to start with the
3-row table.
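
As a rough reference for that argument, the planner's sequential-scan cost
is approximately seq_page_cost * relpages + cpu_tuple_cost * reltuples, so
the page count drives the I/O side while reltuples mainly feeds the row
counts handed up to parent nodes; a sketch of eyeballing it from pg_class,
assuming a release new enough to have the seq_page_cost setting:

select relname, relpages, reltuples,
       relpages * current_setting('seq_page_cost')::float8
       + reltuples * current_setting('cpu_tuple_cost')::float8 as approx_seqscan_cost
from pg_class
where relname = 'q';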

Andreas