Indexes and Tables: Growth and Treatment (Modified by Thomas F. O'Connell)
I'm helping manage a postgres installation that continually consumes a
considerable amount of disk space, and I'm hoping to learn a bit more
about both treating the symptoms and addressing the causes.
Here are the basics:
It's a pg 7.4.1 installation on a Debian stable GNU/Linux 2.6.2 box
with 4GB RAM with 4 2.4 GHz processors and 36 GB of disk space.
There are thousands of tables, many of which are object-relational
(I.e., many are subclasses of sets of top-level tables). There are
indexes in place for joins that apply to many of the columns in the
subclassed tables.
It's a high turnover database, in that the applications that use it
perform thousands of inserts, updates, and deletes on a daily basis.
We're seeing about 5-10 GB of increased disk space used on a daily
basis if a vacuum (full) or reindexdb is not performed. We were doing
one vacuum analyze full a week with nightly vacuum analyzes. We began
manually reindexing the worst offenders once we passed 50% disk usage
regularly.
So here are my questions:
1. Is adding reindexdb to cron to reindex the entire database nightly
overkill?
2. If we turn on pg_autovacuum and leave in place one weekly vacuum
full, is that a reasonable strategy?
3. Otherwise, is it better in general to vacuum prior to reindexing?
4. What are the best places to look for causes of the velocity of
growth?
Thanks!
-tfo
"Thomas F.O'Connell" <tfo@sitening.com> writes:
It's a high turnover database, in that the applications that use it
perform thousands of inserts, updates, and deletes on a daily basis.
We're seeing about 5-10 GB of increased disk space used on a daily
basis if a vacuum (full) or reindexdb is not performed. We were doing
one vacuum analyze full a week with nightly vacuum analyzes.
Try hourly vacuums. If that doesn't stem the tide, make it more often
(or try autovacuum). Also make sure that your FSM settings are large
enough; if they're not then no amount of plain vacuuming will keep you
out of trouble.
With sufficiently frequent plain vacuums you really shouldn't need
vacuum full at all.
I can't recommend an analyze frequency on what you've told us.
regards, tom lane
On Jul 13, 2004, at 6:58 PM, Tom Lane wrote:
Try hourly vacuums. If that doesn't stem the tide, make it more often
(or try autovacuum).
I will try autovacuum.
Also make sure that your FSM settings are large
enough; if they're not then no amount of plain vacuuming will keep you
out of trouble.
I was just reading up on FSM settings today. In fact, here's the output
of a recent VACUUM VERBOSE:
INFO: free space map: 1000 relations, 11599 pages stored; 100064 total
pages needed
DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 178 kB
shared memory.
So clearly we need to increase max_fsm_pages. How is this related to
vacuuming? And is it related at all to index growth?
With sufficiently frequent plain vacuums you really shouldn't need
vacuum full at all.
So is the only benefit to that the extreme optimizations of disk space
it undertakes? Is there any point at which the extra compacting
actually results in a performance enhancement?
I can't recommend an analyze frequency on what you've told us.
What more information would you need to make a recommendation?
Thanks for all the tips!
-tfo
Matthew,
Here's some more feedback on our use of pg_autovaccum. It's clear that
it's working and that it's helping, but even after increasing our
max_fsm_pages substantially (to in excess of what vacuum verbose
suggests is needed), we're still seeing pretty a rapid increase in disk
usage.
It used to be that nightly reindexing helped substantially, but am I
wrong in thinking that the frequency of dynamic analysis is helping
keep index size down?
Anyway, the bottom line is that it still seems like a vacuum full on a
periodic basis is necessary to keep growth rates in check. We track
both table and index size, and our max table grows pretty rapidly (over
the course of a few days) from about 4 MB to about 11 MB with
pg_autovacuum running. After a vacuum full, it returns to about 4 MB.
Are there any other postgresql.conf parameters that I should be looking
at as far as allowing pg_autovacuum to work more effective in terms of
disk reclamation? Is it possible that my max_fsm_pages is still too
low?
Thanks!
-tfo
Begin forwarded message:
Show quoted text
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: July 13, 2004 6:58:24 PM CDT
To: "Thomas F.O'Connell" <tfo@sitening.com>
Cc: Scott Holdren <scott@holdren.com>, pgsql-general@postgreSQL.org
Subject: Re: [GENERAL] Indexes and Tables: Growth and Treatment
(Modified by Thomas F. O'Connell)"Thomas F.O'Connell" <tfo@sitening.com> writes:
It's a high turnover database, in that the applications that use it
perform thousands of inserts, updates, and deletes on a daily basis.We're seeing about 5-10 GB of increased disk space used on a daily
basis if a vacuum (full) or reindexdb is not performed. We were doing
one vacuum analyze full a week with nightly vacuum analyzes.Try hourly vacuums. If that doesn't stem the tide, make it more often
(or try autovacuum). Also make sure that your FSM settings are large
enough; if they're not then no amount of plain vacuuming will keep you
out of trouble.With sufficiently frequent plain vacuums you really shouldn't need
vacuum full at all.I can't recommend an analyze frequency on what you've told us.
regards, tom lane
Import Notes
Resolved by subject fallback
Tom,
If I've got the RAM, should I have max_fsm_relations be large enough to
cover _all_ user tables and indexes?
Thanks!
-tfo
On Jul 13, 2004, at 6:58 PM, Tom Lane wrote:
Show quoted text
Try hourly vacuums. If that doesn't stem the tide, make it more often
(or try autovacuum). Also make sure that your FSM settings are large
enough; if they're not then no amount of plain vacuuming will keep you
out of trouble.
Regular vacuum will (almost) never return your table to it's minimum
size. I don't think it's unreasonable for a table that is 4MB after a
vacuum full, to grow to 11MB, especially if it's a very active table.
The important question is does it keep growing? Or does it reach a
steady state size? There is no point in reclaiming space via VACUUM
FULL when that space will need to be reallocated shortly.
So does the table growth eventually plateau? Or do they grow
indefinitely? If they are growing without bound, then you might try
some more aggressive thresholds for pg_autovacuum, or perhaps more FSM
space.
Matthew
Show quoted text
On Sun, 2004-07-18 at 12:13, Thomas F.O'Connell wrote:
Matthew,
Here's some more feedback on our use of pg_autovaccum. It's clear
thatit's working and that it's helping, but even after increasing
ourmax_fsm_pages substantially (to in excess of what vacuum
verbosesuggests is needed), we're still seeing pretty a rapid increase
indisk usage.It used to be that nightly reindexing helped substantially, but am
Iwrong in thinking that the frequency of dynamic analysis is
helpingkeep index size down?Anyway, the bottom line is that it still seems like a vacuum full on
aperiodic basis is necessary to keep growth rates in check. We
trackboth table and index size, and our max table grows pretty
rapidly(over the course of a few days) from about 4 MB to about 11 MB
withpg_autovacuum running. After a vacuum full, it returns to about 4
MB.Are there any other postgresql.conf parameters that I should belooking
at as far as allowing pg_autovacuum to work more effective interms of
disk reclamation? Is it possible that my max_fsm_pages isstill too
low?
Import Notes
Reply to msg id not found: 4600539.1090167427981.JavaMail.root@herculesReference msg id not found: 4600539.1090167427981.JavaMail.root@hercules | Resolved by subject fallback
On Jul 18, 2004, at 6:46 PM, Matthew T. O'Connor wrote:
Regular vacuum will (almost) never return your table to it's minimum
size. I don't think it's unreasonable for a table that is 4MB after a
vacuum full, to grow to 11MB, especially if it's a very active table.
That's good to know.
The important question is does it keep growing? Or does it reach a
steady state size? There is no point in reclaiming space via VACUUM
FULL when that space will need to be reallocated shortly.
Things seem to be growing much more naturally now that we have more
robust FSM settings.
Thanks again for your help!
-tfo