Why does autovacuum clean fewer rows than I expect?
Hi,
I'm having trouble making sense of the these two autovacuum log entries.
I'm running PostgreSQL 8.4.
[2014-05-22 04:56:43.486 EST] {537cf2c6.30f9} LOG: automatic vacuum of
table "postgres.globaldicom.requests": index scans: 1
pages: 0 removed, 163600 remain
tuples: 5916 removed, 8423598 remain
system usage: CPU 3.62s/9.84u sec elapsed 1060.49 sec
[2014-05-24 05:02:12.707 EST] {537f90c3.7144} LOG: automatic vacuum of
table "postgres.globaldicom.requests": index scans: 2
pages: 0 removed, 510725 remain
tuples: 1554 removed, 8583343 remain
system usage: CPU 9.19s/16.86u sec elapsed 2673.32 sec
I've verified that no vacuum (manual or auto) took place between them unless
they were basically no-ops, since log_autovacuum_min_duration=250.
Otherwise
we're using the stock autovacuum config settings, thus 3 workers, 1min
naptime,
autovacuum_vacuum_scale_factor=0.2, vacuum_cost_delay=0ms, etc.
First question: how do we get a 3x page usage increase with a <1% tuple
increase? The only thing I can think of is that a lot of very big values
were
stored. To my knowledge that didn't happen, but I can't rule it out
entirely.
Since I judge it quite unlikely I'm looking for alternate explanations of
what
might have happened.
Second question: What triggered the above autovacuums? From what I
understand
it must be one of imminent txid wraparound or exceeding
autovacuum_vacuum_scale_factor. If it was the latter, suggesting that 20%
of
the 8.5M tuples had changed, then why were only a few thousand tuples
removed?
I'm not sure how to eliminate the possibility that it was the former after
the
fact. However, I know that there was a lot of activity, going by the
auto-analyzes that took place (autovacuum_vacuum_analyze_factor=0.1):
[2014-05-20 11:08:59.632 EST] {537aab0a.b704} LOG: automatic analyze of
table "postgres.globaldicom.requests" system usage: CPU 0.36s/0.45u sec
elapsed 32.98 sec
[2014-05-23 21:12:49.164 EST] {537f2d12.459b} LOG: automatic analyze of
table "postgres.globaldicom.requests" system usage: CPU 0.08s/0.36u sec
elapsed 30.60 sec
[2014-05-24 00:56:00.927 EST] {537f6161.2b77} LOG: automatic analyze of
table "postgres.globaldicom.requests" system usage: CPU 0.16s/0.38u sec
elapsed 31.42 sec
[2014-05-24 06:33:16.972 EST] {537fb06d.e9e3} LOG: automatic analyze of
table "postgres.globaldicom.requests" system usage: CPU 0.15s/0.35u sec
elapsed 31.13 sec
This isn't an isolated incident, I see many autovacuum log entries where
only a
few tuples are removed from a very large table, and I can't figure out why
it
was being vacuumed if that's all there was to be removed.
Nick Cabatoff <nick.cabatoff@gmail.com> writes:
I'm having trouble making sense of the these two autovacuum log entries.
I'm running PostgreSQL 8.4.
8.4.what?
It'd probably be a good idea to monitor the counts in pg_stat_all_tables
for the troublesome table(s). I suspect there are a whole lot of HOT
updates going on in this table, which'd account for autoanalyze activity
happening even though autovacuum finds few dead tuples to remove. That
still doesn't directly explain physical table bloat, but it might be
a piece of the puzzle.
Also, have you had any database crashes in between these autovacuums?
I'm wondering if the WAL-replay issue fixed in commits 6f2aead1f and
siblings could have produced the bloat.
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hi Tom,
Thanks for the quick reply. 8.4.9. No db crashes, in fact not even a
restart in May until the 28th. I don't believe any HOT updates were
occurring in that table: all updates modify one or more indexed fields
(update status from pending to done, or update to store a stack trace and
bump the time at which to retry the row into future). We don't delete from
the table, just insert and update.
During the time interval in question a bug was leading to an unusually high
level of activity, both inserts and updates. I'm thinking this is what
was responsible for a huge amount of table and index bloat, but haven't yet
connected all the dots. Since I don't see autovacuum failing, or failing
to run, it's a bit of a mystery.
I guess the next step is to attempt to reproduce the bug on a test system
and see if we get the bloat. I'll let you know what turns up.
On Thu, Jul 3, 2014 at 5:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Nick Cabatoff <nick.cabatoff@gmail.com> writes:
I'm having trouble making sense of the these two autovacuum log entries.
I'm running PostgreSQL 8.4.8.4.what?
It'd probably be a good idea to monitor the counts in pg_stat_all_tables
for the troublesome table(s). I suspect there are a whole lot of HOT
updates going on in this table, which'd account for autoanalyze activity
happening even though autovacuum finds few dead tuples to remove. That
still doesn't directly explain physical table bloat, but it might be
a piece of the puzzle.Also, have you had any database crashes in between these autovacuums?
I'm wondering if the WAL-replay issue fixed in commits 6f2aead1f and
siblings could have produced the bloat.regards, tom lane
On Fri, Jul 4, 2014 at 7:41 AM, Nick Cabatoff <nick.cabatoff@gmail.com> wrote:
8.4.9.
This has been released in 2011, so you are missing 3 years worth of
bug fixes. It may be a good idea to first update to 8.4.21 before
trying to work more on that (consider as well an upgrade as 8.4 will
be EOL soon).
Regards,
--
Michael
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
We've been running 9.2 on our later branches for a while now. We're
overdue to bump the older branches to a later 8.4 though, you're right.
Thanks for the reminder.
On Thu, Jul 3, 2014 at 7:46 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
Show quoted text
On Fri, Jul 4, 2014 at 7:41 AM, Nick Cabatoff <nick.cabatoff@gmail.com>
wrote:8.4.9.
This has been released in 2011, so you are missing 3 years worth of
bug fixes. It may be a good idea to first update to 8.4.21 before
trying to work more on that (consider as well an upgrade as 8.4 will
be EOL soon).Regards,
--
Michael