Excessive growth of pg_attribute and other system tables

Started by Steve Crawford · about 21 years ago · 20 messages · pgsql-hackers
#1 Steve Crawford
scrawford@pinpointresearch.com

I'm having trouble with physical growth of postgresql system tables.
Server is 7.4.6 and there are several databases in the cluster. The
autovacuum daemon has been running since the data was restored after
an upgrade a few months ago. Unfortunately my system tables are
taking an unreasonable amount of space.

For example, on one of the databases pg_attribute holds fewer than
10,000 records but is using more than 600 megabytes and the
associated indexes are huge, too. Reindexing dropped the total usage
for that database from 3.2G to 2.5G and a vacuum full (when I can do
it off hours) will probably drop it to around 1.9G. In other words,
one system table alone was accounting for around 40% of the storage
used by that database.

That 1.9G still includes other oversized files such as pg_index,
whose table alone dropped from 48M to 78K with vacuum full.

Vacuum full + reindex on a selection of other tables yielded savings of:
pg_depend: 200M
pg_type: 120M
pg_class: 50M

My autovacuum config is running and I do see regular periodic vacuums
of these pg_ tables but still they grow.

Any ideas on why, in spite of autovacuum, these files are becoming so
huge and, more importantly, on the best way to keep them under control?
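For anyone diagnosing similar bloat, a quick way to spot oversized catalogs is to compare relpages against reltuples (a sketch; should work on 7.4-era servers, and relpages/reltuples are only as fresh as the last VACUUM or ANALYZE):

```sql
-- List the ten largest relations in the current database, catalogs included.
-- A huge relpages next to a small reltuples suggests dead-space bloat.
SELECT n.nspname, c.relname, c.relpages, c.reltuples
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
ORDER BY c.relpages DESC
LIMIT 10;
```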

Cheers,
Steve

#2 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Steve Crawford (#1)
Re: Excessive growth of pg_attribute and other system tables

On Thursday 17 March 2005 3:15 pm, Steve Crawford wrote:

I'm having trouble with physical growth of postgresql system
tables....

Additional info. The most recent autovacuum entries for the
pg_attribute table are:

[2005...] Performing: VACUUM ANALYZE "pg_catalog"."pg_attribute"
[2005...] table name: tati."pg_catalog"."pg_attribute"
[2005...] relid: 1249; relisshared: 0
[2005...] reltuples: 9334.000000; relpages: 82282
[2005...] curr_analyze_count: 6647115; curr_vacuum_count: 861454
[2005...] last_analyze_count: 6647115; last_vacuum_count: 861454
[2005...] analyze_threshold: 9834; vacuum_threshold: 19668

and

[2005...] Performing: ANALYZE "pg_catalog"."pg_attribute"
[2005...] table name: foo."pg_catalog"."pg_attribute"
[2005...] relid: 1249; relisshared: 0
[2005...] reltuples: 4843240.000000; relpages: 82284
[2005...] curr_analyze_count: 6657041; curr_vacuum_count: 862897
[2005...] last_analyze_count: 6657041; last_vacuum_count: 861454
[2005...] analyze_threshold: 4843740; vacuum_threshold: 19668

(Both within the past day; dates truncated to avoid line-wrap.) The
table currently has just over 9,000 tuples and I have no reason to
believe that should have changed substantially. The thresholds and
counts seem way off - especially in the second pass.

Cheers,
Steve

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Crawford (#1)
Re: Excessive growth of pg_attribute and other system tables

Steve Crawford <scrawford@pinpointresearch.com> writes:

My autovacuum config is running and I do see regular periodic vacuums
of these pg_ tables but still they grow.

Do you have the FSM settings set large enough to account for all the
free space?

Also you might want to check for newer versions of autovacuum. I recall
that the earlier releases had bugs that sometimes made it skip vacuuming
tables that should be vacuumed.

regards, tom lane

#4 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Tom Lane (#3)
Re: Excessive growth of pg_attribute and other system tables

On Thursday 17 March 2005 3:51 pm, Tom Lane wrote:

Steve Crawford <scrawford@pinpointresearch.com> writes:

My autovacuum config is running and I do see regular periodic
vacuums of these pg_ tables but still they grow.

Do you have the FSM settings set large enough to account for all
the free space?

max_fsm_pages = 20000
max_fsm_relations = 1000

I just poked through the cluster. I have 6 user databases plus
template0 and template1. The total in the cluster has reached
slightly over 1,000 user tables, plus all the system tables in the 8
databases. Does this indicate that I need to increase
max_fsm_relations to greater than the total number of tables in the
cluster? If so it's not a problem as bumping it to 2000 will only
cost me ~50k of my 4G memory plus a bit more since max_fsm_pages
would have to be increased to at least 32000.

Also you might want to check for newer versions of autovacuum. I
recall that the earlier releases had bugs that sometimes made it
skip vacuuming tables that should be vacuumed.

pg_autovacuum is from 7.4.6 release and is showing periodic vacuums of
the problem tables. I thought that bug was in some release prior to
7.4.6. Does the bug allow it to show a vacuum taking place but not do
it?

Cheers,
Steve

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Crawford (#4)
Re: Excessive growth of pg_attribute and other system tables

Steve Crawford <scrawford@pinpointresearch.com> writes:

On Thursday 17 March 2005 3:51 pm, Tom Lane wrote:

Do you have the FSM settings set large enough to account for all
the free space?

max_fsm_pages = 20000
max_fsm_relations = 1000

That doesn't sound like nearly enough pages for a 2G database.
20000 * 8K = 160MB, so if more than 10% of the pages in your DB
are dirty you are going to be leaking free space. You probably
want max_fsm_pages up around 250,000.
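In postgresql.conf terms, that advice might look like the sketch below (values taken from the sizing discussion above; per the docs of this era, each page slot costs on the order of six bytes of shared memory, so the settings are cheap):

```
# postgresql.conf sketch: size the FSM to cover the whole database
max_fsm_pages = 250000      # enough slots for ~2GB of 8K pages
max_fsm_relations = 2000    # more than the total table count in the cluster
```

Both settings require a server restart to take effect.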

I just poked through the cluster. I have 6 user databases plus
template0 and template1. Total tables in the cluster has reached
slightly over 1,000 user tables plus all the system tables in the 8
databases. Does this indicate that I need to increase
max_fsm_relations to greater than the total number of tables in the
cluster?

That would be a good plan too.

pg_autovacuum is from 7.4.6 release and is showing periodic vacuums of
the problem tables. I thought that bug was in some release prior to
7.4.6. Does the bug allow it to show a vacuum taking place but not do
it?

I don't recall the details of the changes; you could look at the CVS
logs to see what got fixed when. But in any case I think we see the
main problem: max_fsm_pages too small.

regards, tom lane

#6 Chris Browne
cbbrowne@acm.org
In reply to: Steve Crawford (#1)
Re: Excessive growth of pg_attribute and other system tables

After takin a swig o' Arrakan spice grog, scrawford@pinpointresearch.com (Steve Crawford) belched out:

On Thursday 17 March 2005 3:51 pm, Tom Lane wrote:

Steve Crawford <scrawford@pinpointresearch.com> writes:

My autovacuum config is running and I do see regular periodic
vacuums of these pg_ tables but still they grow.

Do you have the FSM settings set large enough to account for all
the free space?

max_fsm_pages = 20000
max_fsm_relations = 1000

20000 is definitely way too low. It's not enough to track the dead
pages in pg_attribute alone, which looks to have the better part of
80K dead pages.

I'd increase that to about 200000, straight off.

It seems curious that you have so many tuples getting killed off in
this table; are you generating a lot of temp tables continually?

By the way, you should be vacuuming pg_attribute _way_ more often;
it wouldn't have gotten this big if you had...

Doing a reindex and doing (once!) a VACUUM FULL should help bring the
size down; vacuuming the table more often should keep size down...
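The one-time recovery Chris describes might look like this (a sketch; VACUUM FULL takes an exclusive lock on the table, so run it during a quiet window):

```sql
-- One-time recovery of an already-bloated catalog
VACUUM FULL pg_catalog.pg_attribute;
REINDEX TABLE pg_catalog.pg_attribute;
```

After that, frequent plain VACUUMs (with a large-enough FSM) should keep the table from regrowing.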
--
output = reverse("moc.liamg" "@" "enworbbc")
http://cbbrowne.com/info/postgresql.html
"To do is to be." -- Aristotle
"To be is to do." -- Socrates
"Do be do be do." -- Sinatra
"Do be a do bee." -- Miss Sally of Romper Room fame.
"Yabba dabba do." -- Fred Flintstone
"DO...BEGIN..END" -- Niklaus Wirth

#7 Matthew T. O'Connor
matthew@zeut.net
In reply to: Steve Crawford (#2)
Re: Excessive growth of pg_attribute and other system tables

Steve Crawford wrote:

On Thursday 17 March 2005 3:15 pm, Steve Crawford wrote:

I'm having trouble with physical growth of postgresql system
tables....

Additional info. The most recent autovacuum entries for the
pg_attribute table are:

[2005...] Performing: VACUUM ANALYZE "pg_catalog"."pg_attribute"
[2005...] table name: tati."pg_catalog"."pg_attribute"
[2005...] relid: 1249; relisshared: 0
[2005...] reltuples: 9334.000000; relpages: 82282
[2005...] curr_analyze_count: 6647115; curr_vacuum_count: 861454
[2005...] last_analyze_count: 6647115; last_vacuum_count: 861454
[2005...] analyze_threshold: 9834; vacuum_threshold: 19668

and

[2005...] Performing: ANALYZE "pg_catalog"."pg_attribute"
[2005...] table name: foo."pg_catalog"."pg_attribute"
[2005...] relid: 1249; relisshared: 0
[2005...] reltuples: 4843240.000000; relpages: 82284
[2005...] curr_analyze_count: 6657041; curr_vacuum_count: 862897
[2005...] last_analyze_count: 6657041; last_vacuum_count: 861454
[2005...] analyze_threshold: 4843740; vacuum_threshold: 19668

(Both within past 1 day - dates truncated to avoid line-wrap.) The
table currently has just over 9,000 tuples and I have no reason to
believe that should have changed substantially. The thresholds and
counts seem way off - especially in the second pass.

I believe this discrepancy has to do with the fact that ANALYZE can
return some very bogus values for reltuples, whereas vacuum always
returns an accurate count. I'm not sure how best to handle this.

#8 Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#5)
Re: Excessive growth of pg_attribute and other system tables

Tom Lane wrote:

Steve Crawford <scrawford@pinpointresearch.com> writes:

pg_autovacuum is from 7.4.6 release and is showing periodic vacuums of
the problem tables. I thought that bug was in some release prior to
7.4.6. Does the bug allow it to show a vacuum taking place but not do
it?

I don't recall the details of the changes; you could look at the CVS
logs to see what got fixed when. But in any case I think we see the
main problem: max_fsm_pages too small.

The relevant pg_autovacuum bugs were fixed prior to 7.4.6.

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew T. O'Connor (#7)
Re: Excessive growth of pg_attribute and other system tables

"Matthew T. O'Connor" <matthew@zeut.net> writes:

I believe this discrepancy has to do with the fact that ANALYZE can
return some very bogus values for reltuples, where as vacuum always
returns an accurate count. I'm not sure how to best handle this.

I think 8.0's ANALYZE will do a better estimation job ... at least,
Manfred Koizar rewrote the sampling algorithm in hopes of making it
more robust.

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

regards, tom lane

#10 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Tom Lane (#9)
Re: Excessive growth of pg_attribute and other system tables

On Monday 21 March 2005 11:40 am, Tom Lane wrote:

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

Yup, we've pretty well established that my fsm settings were way too
low. I've bumped them up:
max_fsm_relations from 1,000 to 3,000
max_fsm_pages from 20,000 to 1,000,000

The slight expenditure of a few meg of RAM on a 4G ram machine will
hurt me far less than the incomplete vacuums. I have to schedule some
low-volume time to restart the server and vacum-full before I'll see
the result.

Just to make sure I'm understanding things correctly this time...I
originally (mis)understood these as settings related to resources
used _during_ vacuuming. My current understanding is that they are
basically pointers that track what space is available for reclamation
by vacuum, and that the amount of fsm resources required depends on
both the frequency of vacuums and the volume of updates/deletes.

Questions:

1) Is my revised understanding correct?

And if the answer to 1 is yes...

2) What happens with all that free-space information at server restart
(ie. does a server restart lead to dead-tuple leakage)?

3) Is there (or should there be) logging of the fact that a server has
run out of resources to track dead space?

4) Is there a way to query what proportion of the fsm resources are in
use and would access to that info be useful to the autovacuum daemon
or a system tuner?

Cheers,
Steve

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Crawford (#10)
Re: Excessive growth of pg_attribute and other system tables

Steve Crawford <scrawford@pinpointresearch.com> writes:

Just to make sure I'm understanding things correctly this time...I
originally (mis)understood these as settings related to resources
used _during_ vacuuming. My current understanding is that they are
basically pointers that track what space is available for reclamation
by vaccum and that the amount of fsm resources required depends on
both frequency of vacuums and volume of updates/deletes.

The FSM is where VACUUM stores pointers to the free space it's found
(or created) in each table. Subsequent INSERTs/UPDATEs will use this
free space instead of appending to the file. So to prevent table
growth, you need enough FSM slots to remember enough free space to
satisfy all the INSERTs/UPDATEs between successive VACUUM runs.

In practice people tend to allocate enough FSM to cover all of their
database, instead of worrying about exactly which pages might contain
free space. In a low-update-volume situation you could probably get
away with less.

2) What happens with all that free-space information at server restart
(ie. does a server restart lead to dead-tuple leakage)?

Assuming you had a normal database shutdown rather than a crash, it's
written out at shutdown and reloaded. In any case, a VACUUM recomputes
the info from scratch.

4) Is there a way to query what proportion of the fsm resources are in
use and would access to that info be useful to the autovacuum daemon
or a system tuner?

VACUUM VERBOSE will tell you about this.

regards, tom lane

#12 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Tom Lane (#9)
Re: Excessive growth of pg_attribute and other system tables

On Monday 21 March 2005 11:40 am, Tom Lane wrote:

"Matthew T. O'Connor" <matthew@zeut.net> writes:

I believe this discrepancy has to do with the fact that ANALYZE
can return some very bogus values for reltuples, where as vacuum
always returns an accurate count. I'm not sure how to best
handle this.

I think 8.0's ANALYZE will do a better estimation job ... at least,
Manfred Koizar rewrote the sampling algorithm in hopes of making it
more robust.

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

Results time. FSM settings were too small but the real problem seems
to be that pg_autovacuum isn't getting the job done.

I ran VACUUM VERBOSE and set postgresql.conf as follows (roughly 2.5
times the amounts suggested by VACUUM VERBOSE results):
max_fsm_pages = 500000
max_fsm_relations = 2500

I restarted the server last Friday night and manually did a vacuum
full and reindex of each user and system table in every database on
Monday. The zzz.pg_attribute table's file size dropped from over
600MB to less than 2MB.

It's now three days later and that table has already increased to
29MB. Processes accessing this database do create many temporary
tables so the exact count in pg_attribute varies a bit from time to
time but always hovers around 9500. I just did a manual VACUUM FULL
on pg_attribute and it's back to 1.3MB.

Upon completion of the vacuum, I restarted the pg_autovacuum daemon.
Following are the autovacuum log entries related to zzz.pg_attribute.
Note the growth of reltuples from 9532.000000 to 184720.000000 in
that time (what is a millionth of a tuple, anyway?) and the lack of
any vacuums performed.

28 12:12 PM] table name: zzz."pg_catalog"."pg_attribute"
28 12:12 PM] relid: 1249; relisshared: 0
28 12:12 PM] reltuples: 9532.000000; relpages: 157
28 12:12 PM] curr_analyze_count: 176294; curr_vacuum_count: 15447
28 12:12 PM] last_analyze_count: 176294; last_vacuum_count: 15447
28 12:12 PM] analyze_threshold: 10032; vacuum_threshold: 20064
28 12:12 PM] added table: zzz."pg_catalog"."pg_attribute"
--
28 03:42 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 03:42 PM] table name: zzz."pg_catalog"."pg_attribute"
28 03:42 PM] relid: 1249; relisshared: 0
28 03:42 PM] reltuples: 20270.000000; relpages: 336
28 03:42 PM] curr_analyze_count: 186374; curr_vacuum_count: 16329
28 03:42 PM] last_analyze_count: 186374; last_vacuum_count: 15447
28 03:42 PM] analyze_threshold: 20770; vacuum_threshold: 20064
--
28 10:59 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 10:59 PM] table name: zzz."pg_catalog"."pg_attribute"
28 10:59 PM] relid: 1249; relisshared: 0
28 10:59 PM] reltuples: 42591.000000; relpages: 706
28 10:59 PM] curr_analyze_count: 207254; curr_vacuum_count: 18156
28 10:59 PM] last_analyze_count: 207254; last_vacuum_count: 15447
28 10:59 PM] analyze_threshold: 43091; vacuum_threshold: 20064
--
29 02:03 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
29 02:03 PM] table name: zzz."pg_catalog"."pg_attribute"
29 02:03 PM] relid: 1249; relisshared: 0
29 02:03 PM] reltuples: 89464.000000; relpages: 1483
29 02:03 PM] curr_analyze_count: 250664; curr_vacuum_count: 21999
29 02:03 PM] last_analyze_count: 250664; last_vacuum_count: 15447
29 02:03 PM] analyze_threshold: 89964; vacuum_threshold: 20064
--
30 09:20 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
30 09:20 PM] table name: zzz."pg_catalog"."pg_attribute"
30 09:20 PM] relid: 1249; relisshared: 0
30 09:20 PM] reltuples: 184720.000000; relpages: 3062
30 09:20 PM] curr_analyze_count: 340791; curr_vacuum_count: 29886
30 09:20 PM] last_analyze_count: 340791; last_vacuum_count: 15447
30 09:20 PM] analyze_threshold: 185220; vacuum_threshold: 20064

Should I abandon pg_autovacuum and just do periodic VACUUM ANALYZEs of
everything? Should I use settings other than the defaults for
pg_autovacuum? Are temporary tables evil and their use to be avoided?

Just checked and zzz.pg_attribute is 50% larger than it was when I did
the VACUUM FULL at the start of this email.

Cheers,
Steve

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Crawford (#12)
Re: [ADMIN] Excessive growth of pg_attribute and other system tables

Steve Crawford <scrawford@pinpointresearch.com> writes:

On Monday 21 March 2005 11:40 am, Tom Lane wrote:

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

Results time. FSM settings were too small but the real problem seems
to be that pg_autovacuum isn't getting the job done.

The light just went on ... system catalog updates don't generate
statistics reports. Hence, autovacuum doesn't know any work is needed.

Should we fix that, or change autovacuum to special-case the system
catalogs somehow, or ???

regards, tom lane

#14 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#13)
Re: [ADMIN] Excessive growth of pg_attribute and other system tables

I wrote:

The light just went on ... system catalog updates don't generate
statistics reports. Hence, autovacuum doesn't know any work is needed.

The above claim is too strong --- they do normally generate stats
updates. However, in a simple test I observed that
pg_stat_all_tables.n_tup_del did not seem to increment for the deletes
that occur when a temp table is dropped during backend exit. (Most
likely we aren't flushing out the final stats messages...)

Steve, is your app in the habit of creating lots of temp tables that are
not dropped explicitly? That would explain why you are getting bitten
more than other people.

regards, tom lane

#15 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Tom Lane (#14)
Re: [ADMIN] Excessive growth of pg_attribute and other system tables

On Thursday 31 March 2005 12:06 pm, Tom Lane wrote:

I wrote:

The light just went on ... system catalog updates don't generate
statistics reports. Hence, autovacuum doesn't know any work is
needed.

The above claim is too strong --- they do normally generate stats
updates. However, in a simple test I observed that
pg_stat_all_tables.n_tup_del did not seem to increment for the
deletes that occur when a temp table is dropped during backend
exit. (Most likely we aren't flushing out the final stats
messages...)

Steve, is your app in the habit of creating lots of temp tables
that are not dropped explicitly? That would explain why you are
getting bit more than other people.

Yes, various processes create in total well over 100 temporary tables
every hour. None of them are explicitly dropped.

Cheers,
Steve
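Until the stats-flush issue is fixed, dropping temp tables explicitly while the backend is still alive should let the catalog deletes reach the stats collector. A sketch (the ON COMMIT DROP clause existed in this era, if memory serves; hedge accordingly):

```sql
-- Option 1: tie the temp table's lifetime to the transaction
BEGIN;
CREATE TEMP TABLE scratch (id int) ON COMMIT DROP;
-- ... use scratch ...
COMMIT;  -- table dropped here, with the backend still running

-- Option 2: drop explicitly before the session ends
CREATE TEMP TABLE scratch2 (id int);
-- ... use scratch2 ...
DROP TABLE scratch2;
```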

#16 Matthew T. O'Connor
matthew@zeut.net
In reply to: Steve Crawford (#12)
Re: Excessive growth of pg_attribute and other system tables

Steve Crawford wrote:

On Monday 21 March 2005 11:40 am, Tom Lane wrote:

"Matthew T. O'Connor" <matthew@zeut.net> writes:

I believe this discrepancy has to do with the fact that ANALYZE
can return some very bogus values for reltuples, where as vacuum
always returns an accurate count. I'm not sure how to best
handle this.

I think 8.0's ANALYZE will do a better estimation job ... at least,
Manfred Koizar rewrote the sampling algorithm in hopes of making it
more robust.

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

Results time. FSM settings were too small but the real problem seems
to be that pg_autovacuum isn't getting the job done.

Yeah, it certainly looks that way...

I ran VACUUM VERBOSE and set postgresql.conf as follows (roughly 2.5
times the amounts suggested by VACUUM VERBOSE results):
max_fsm_pages = 500000
max_fsm_relations = 2500

I restarted the server last Friday night and manually did a vacuum
full and reindex of each user and system table in every database on
Monday. The zzz.pg_attribute table's file size dropped from over
600MB to less than 2MB.

Wow, that is some serious bloat.

It's now three days later and that table has already increased to
29MB. Processes accessing this database do create many temporary
tables so the exact count in pg_attribute varies a bit from time to
time but always hovers around 9500. I just did a manual VACUUM FULL
on pg_attribute and it's back to 1.3MB.

Depending on several factors, the "steady state" size of pg_attribute may
be several times larger than its size right after a vacuum full. The
problem is not that it may be 29M, but rather that it continues to grow.

Upon completion of the vacuum, I restarted the pg_autovacuum daemon.
Following are the autovacuum log entries related to zzz.pg_attribute.
Note the growth of reltuples from 9532.000000 to 184720.000000 in
that time (what is a millionth of a tuple, anyway?) and the lack of
any vacuums performed.

I assume that pg_attribute didn't actually grow to 184720 tuples?

28 12:12 PM] table name: zzz."pg_catalog"."pg_attribute"
28 12:12 PM] relid: 1249; relisshared: 0
28 12:12 PM] reltuples: 9532.000000; relpages: 157
28 12:12 PM] curr_analyze_count: 176294; curr_vacuum_count: 15447
28 12:12 PM] last_analyze_count: 176294; last_vacuum_count: 15447
28 12:12 PM] analyze_threshold: 10032; vacuum_threshold: 20064
28 12:12 PM] added table: zzz."pg_catalog"."pg_attribute"
--
28 03:42 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 03:42 PM] table name: zzz."pg_catalog"."pg_attribute"
28 03:42 PM] relid: 1249; relisshared: 0
28 03:42 PM] reltuples: 20270.000000; relpages: 336
28 03:42 PM] curr_analyze_count: 186374; curr_vacuum_count: 16329
28 03:42 PM] last_analyze_count: 186374; last_vacuum_count: 15447
28 03:42 PM] analyze_threshold: 20770; vacuum_threshold: 20064
--
28 10:59 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 10:59 PM] table name: zzz."pg_catalog"."pg_attribute"
28 10:59 PM] relid: 1249; relisshared: 0
28 10:59 PM] reltuples: 42591.000000; relpages: 706
28 10:59 PM] curr_analyze_count: 207254; curr_vacuum_count: 18156
28 10:59 PM] last_analyze_count: 207254; last_vacuum_count: 15447
28 10:59 PM] analyze_threshold: 43091; vacuum_threshold: 20064
--
29 02:03 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
29 02:03 PM] table name: zzz."pg_catalog"."pg_attribute"
29 02:03 PM] relid: 1249; relisshared: 0
29 02:03 PM] reltuples: 89464.000000; relpages: 1483
29 02:03 PM] curr_analyze_count: 250664; curr_vacuum_count: 21999
29 02:03 PM] last_analyze_count: 250664; last_vacuum_count: 15447
29 02:03 PM] analyze_threshold: 89964; vacuum_threshold: 20064
--
30 09:20 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
30 09:20 PM] table name: zzz."pg_catalog"."pg_attribute"
30 09:20 PM] relid: 1249; relisshared: 0
30 09:20 PM] reltuples: 184720.000000; relpages: 3062
30 09:20 PM] curr_analyze_count: 340791; curr_vacuum_count: 29886
30 09:20 PM] last_analyze_count: 340791; last_vacuum_count: 15447
30 09:20 PM] analyze_threshold: 185220; vacuum_threshold: 20064

Should I abandon pg_autovacuum and just do periodic VACUUM ANALYZEs of
everything? Should I use settings other than the defaults for
pg_autovacuum? Are temporary tables evil and their use to be avoided?

Just checked and zzz.pg_attribute is 50% larger than it was when I did
the VACUUM FULL at the start of this email.

I don't think you should you abandon pg_autovacuum, but I do think you
should run periodic vacuum commands from cron until this is resolved.
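A stopgap cron entry might look like this sketch (`zzz` stands in for the affected database name, as in the log excerpts above):

```
# Hypothetical crontab entry: vacuum/analyze the worst catalog hourly
0 * * * *  vacuumdb --analyze --table 'pg_catalog.pg_attribute' zzz
```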

The first thing you should do is change the pg_autovacuum settings. The
default values are far too conservative for a lot of people.
Most people seem to have success with settings like:
pg_autovacuum -v 300 -V 0.1 -a 200 -A 0.1
Give that a try and see if it helps.

From pg_autovacuum's perspective, from 3/28 12:12 PM to 3/30 09:20 PM
there have been 29886 - 15447 = 14439 delete + update commands against
the pg_attribute table. Once it sees 20064 deletes + updates it will
perform a vacuum. If there have been more during that time, we need to
figure out why pg_autovacuum is not seeing them.

The thing I don't understand in your numbers is why reltuples is
constantly increasing. When pg_autovacuum is reporting a reltuples of
184720, how many tuples does pg_attribute really have? This might be
related to the new reltuples estimation code that Tom was talking about
recently, if so, 8.0.2 might help.

Anyone else have some insight here?

Matt

#17 Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#13)
Re: [ADMIN] Excessive growth of pg_attribute and other system tables

Tom Lane wrote:

Steve Crawford <scrawford@pinpointresearch.com> writes:

On Monday 21 March 2005 11:40 am, Tom Lane wrote:

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

Results time. FSM settings were too small but the real problem seems
to be that pg_autovacuum isn't getting the job done.

The light just went on ... system catalog updates don't generate
statistics reports. Hence, autovacuum doesn't know any work is needed.

Should we fix that, or change autovacuum to special-case the system
catalogs somehow, or ???

Really?!?!?!? Wow, if that is true, that is a big gaping hole in the
autovacuum design. Is that true for all types of system catalog
updates? The reason I ask is that the stats system is reporting at
least some of activity on pg_attribute in this example. So why would it
report some but not all?

Is there any chance that fixing the stats system to include system
catalog updates would be simple enough to put into the 8.0.x branch? I
don't know of another way for pg_autovacuum to know if it's time for a
vacuum.

Hopefully the autovacuum in 8.1 will be a fairly different animal.

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Crawford (#15)
Re: [ADMIN] Excessive growth of pg_attribute and other system tables

Steve Crawford <scrawford@pinpointresearch.com> writes:

On Thursday 31 March 2005 12:06 pm, Tom Lane wrote:

Steve, is your app in the habit of creating lots of temp tables
that are not dropped explicitly? That would explain why you are
getting bit more than other people.

Yes, various processes create in total well over 100 temporary tables
every hour. None of them are explicitly dropped.

Try this patch (it's against 8.0, but applies with some fuzz to 7.4
too).

regards, tom lane

Index: pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.87.4.1
diff -c -r1.87.4.1 pgstat.c
*** pgstat.c	25 Mar 2005 00:35:14 -0000	1.87.4.1
--- pgstat.c	31 Mar 2005 23:12:21 -0000
***************
*** 160,165 ****
--- 160,166 ----
  static void pgstat_recvbuffer(void);
  static void pgstat_exit(SIGNAL_ARGS);
  static void pgstat_die(SIGNAL_ARGS);
+ static void pgstat_beshutdown_hook(int code, Datum arg);
  static int	pgstat_add_backend(PgStat_MsgHdr *msg);
  static void pgstat_sub_backend(int procpid);
***************
*** 670,675 ****
--- 671,695 ----
  	pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_BESTART);
  	pgstat_send(&msg, sizeof(msg));
+ 
+ 	/*
+ 	 * Set up a process-exit hook to ensure we flush the last batch of
+ 	 * statistics to the collector.
+ 	 */
+ 	on_proc_exit(pgstat_beshutdown_hook, 0);
+ }
+ 
+ /*
+  * Flush any remaining statistics counts out to the collector at process
+  * exit.   Without this, operations triggered during backend exit (such as
+  * temp table deletions) won't be counted.  This is an on_proc_exit hook,
+  * not on_shmem_exit, so that everything interesting must have happened
+  * already.
+  */
+ static void
+ pgstat_beshutdown_hook(int code, Datum arg)
+ {
+ 	pgstat_report_tabstat();
  }
#19 Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Steve Crawford (#1)
Re: [ADMIN] Excessive growth of pg_attribute and other system tables

"Tom Lane" <tgl@sss.pgh.pa.us> writes

Steve Crawford <scrawford@pinpointresearch.com> writes:

On Monday 21 March 2005 11:40 am, Tom Lane wrote:

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

Seems this is another question pointing to improper settings of
"can-be-avoided" shared memory parameters. Maybe we should eliminate the
GUC parameters related to the FSM. Could we follow Alvaro's idea of
spilling some FSM data to disk while keeping the indices, and maybe part
of the data, in memory? Then no free page would be discarded for lack of
space to record it in the FSM. With this handling, efficiency should not
be a problem either.

Regards,
Qingqing

#20 Ben K.
bkim@coe.tamu.edu
In reply to: Chris Browne (#6)
Changing

I just wondered if there is a way to change the variable sign from : to $,
like

\set avar 32
select :avar ; ---> select $avar ;

so I can copy and paste my perl code directly to psql when debugging,
assuming I do a proper \set.

TIA.

Ben Kim
Developer
College of Education
Texas A&M University