auto-vacuum & Negative "anl" Values
Greetings all,
I have been spending some time looking into how auto-vacuum is
performing on one of our servers. After putting the PostgreSQL logs
in debug I noticed that the threshold for ANALYZE was never being hit
for a particular table because the calculated value becomes
increasingly negative.
We have an entry in the pg_autovacuum table for a table that has
quite frequent inserts and updates, but not deletes. Here are the
values:
-[ RECORD 1 ]----+------
vacrelid | #####
enabled | t
vac_base_thresh | 500
vac_scale_factor | 0.1
anl_base_thresh | 200
anl_scale_factor | 0.05
vac_cost_delay | -1
vac_cost_limit | -1
I've noticed that the threshold for ANALYZE never gets met because
the threshold for VACUUM is hit first, thereby resetting the
counters. Here is a snippet of the log that shows what's happening:
DEBUG: tablename: vac: 961 (threshold 14217), anl: -9756 (threshold 7058)
DEBUG: tablename: vac: 1924 (threshold 14217), anl: -8792 (threshold 7058)
DEBUG: tablename: vac: 2953 (threshold 14217), anl: -7763 (threshold 7058)
DEBUG: tablename: vac: 3998 (threshold 14217), anl: -6718 (threshold 7058)
DEBUG: tablename: vac: 5170 (threshold 14217), anl: -5546 (threshold 7058)
DEBUG: tablename: vac: 6405 (threshold 14217), anl: -4311 (threshold 7058)
DEBUG: tablename: vac: 7635 (threshold 14217), anl: -3081 (threshold 7058)
DEBUG: tablename: vac: 8818 (threshold 14217), anl: -1898 (threshold 7058)
DEBUG: tablename: vac: 9917 (threshold 14217), anl: -798 (threshold 7058)
DEBUG: tablename: vac: 10987 (threshold 14217), anl: 272 (threshold 7058)
DEBUG: tablename: vac: 12016 (threshold 14217), anl: 1301 (threshold 7058)
DEBUG: tablename: vac: 12929 (threshold 14217), anl: 2214 (threshold 7058)
DEBUG: tablename: vac: 13717 (threshold 14217), anl: 3002 (threshold 7058)
DEBUG: tablename: vac: 14441 (threshold 14217), anl: 3727 (threshold 7058)
...vacuum threshold is hit...
DEBUG: tablename: vac: 752 (threshold 14217), anl: -9962 (threshold 7058)
DEBUG: tablename: vac: 1491 (threshold 14217), anl: -9223 (threshold 7058)
DEBUG: tablename: vac: 2213 (threshold 14217), anl: -8501 (threshold 7058)
DEBUG: tablename: vac: 2984 (threshold 14217), anl: -7730 (threshold 7058)
The outcome of this is that an ANALYZE is never run: after the
counters are reset at each VACUUM, the ANALYZE counter starts from an
even more negative value. But as you can see from our entries in
pg_autovacuum above, an ANALYZE should occur much more frequently
than a VACUUM.
We're running PostgreSQL 8.1.4 on RHEL 3 with a 2.4.21-27.0.2.ELsmp
kernel.
Input is appreciated to explain exactly what is happening here.
Thanks so much!
--
Dylan Hansen
Enterprise Systems Developer
Dylan Hansen <dhansen@pixpo.com> writes:
I have been spending some time looking into how auto-vacuum is
performing on one of our servers. After putting the PostgreSQL logs
in debug I noticed that the threshold for ANALYZE was never being hit
for a particular table because the calculated value becomes
increasingly negative.
Hmm, it shouldn't ever be negative at all, I would think. The
calculation in question is
anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
tabentry->last_anl_tuples;
Apparently somehow last_anl_tuples has managed to get to be bigger than
n_live_tuples, which maybe could happen after a delete. Should we be
clamping last_anl_tuples to not exceed n_live_tuples somewhere?
Alvaro and Matthew, what do you think?
regards, tom lane
Tom Lane wrote:
Dylan Hansen <dhansen@pixpo.com> writes:
I have been spending some time looking into how auto-vacuum is
performing on one of our servers. After putting the PostgreSQL logs
in debug I noticed that the threshold for ANALYZE was never being hit
for a particular table because the calculated value becomes
increasingly negative.
Hmm, it shouldn't ever be negative at all, I would think. The
calculation in question is
anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
tabentry->last_anl_tuples;
Apparently somehow last_anl_tuples has managed to get to be bigger than
n_live_tuples, which maybe could happen after a delete. Should we be
clamping last_anl_tuples to not exceed n_live_tuples somewhere?
Alvaro and Matthew, what do you think?
Hmm ... I'd think that the number of dead tuples plus live tuples should
never be smaller than the number of tuples seen at last analyze. Unless
some stats messages are lost (say, stop the server, start with stats
disabled, do a big DELETE, stop, restart normally).
I think there's a bug elsewhere. On pgstat_recv_tabstat, I see
tabentry->n_live_tuples = tabmsg[i].t_tuples_inserted;
tabentry->n_dead_tuples = tabmsg[i].t_tuples_updated +
tabmsg[i].t_tuples_deleted;
But really n_live_tuples should be decreased by t_tuples_deleted as
well, so this should read
tabentry->n_live_tuples = tabmsg[i].t_tuples_inserted -
tabmsg[i].t_tuples_deleted;
tabentry->n_dead_tuples = tabmsg[i].t_tuples_updated +
tabmsg[i].t_tuples_deleted;
On the other hand I don't see how this would explain the problem Dylan
is seeing, because the effect would be the inverse -- anltuples would be
erroneously calculated too high, so it wouldn't become negative.
I think we should catch a negative anltuples value, log a WARNING, and
force an analyze, because it's indicative of a bug.
I'm interested in what Dylan can tell about the activity over this
table? What does pg_stat_user_tables say about it?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
Apparently somehow last_anl_tuples has managed to get to be bigger than
n_live_tuples, which maybe could happen after a delete. Should we be
clamping last_anl_tuples to not exceed n_live_tuples somewhere?
Alvaro and Matthew, what do you think?
Hmm ... I'd think that the number of dead tuples plus live tuples should
never be smaller than the number of tuples seen at last analyze.
The scenario I was imagining was big delete followed by
VACUUM-without-ANALYZE. In this situation, it looks to me like
pgstat_recv_vacuum updates n_live_tuples to the new smaller value
and doesn't do anything to last_anl_tuples. I'm thinking we need
tabentry->n_live_tuples = msg->m_tuples;
tabentry->n_dead_tuples = 0;
if (msg->m_analyze)
{
    tabentry->last_anl_tuples = msg->m_tuples;
    if (msg->m_autovacuum)
        tabentry->autovac_analyze_timestamp = msg->m_vacuumtime;
    else
        tabentry->analyze_timestamp = msg->m_vacuumtime;
}
+ else
+ {
+     /* last_anl_tuples must never exceed n_live_tuples */
+     tabentry->last_anl_tuples = Min(tabentry->last_anl_tuples,
+                                     msg->m_tuples);
+ }
}
but perhaps I'm confused.
regards, tom lane
Hi Tom, Alvaro. Thanks so much for your replies.
On 22-Jun-06, at 12:56 PM, Alvaro Herrera wrote:
Hmm ... I'd think that the number of dead tuples plus live tuples should
never be smaller than the number of tuples seen at last analyze. Unless
some stats messages are lost (say, stop the server, start with stats
disabled, do a big DELETE, stop, restart normally).
We had to restart the server to upgrade to 8.1.4 when it was
released, but since then we have not restarted our server. We
haven't changed our stats gathering recently, although we've made
some changes to auto-vacuum to do it more frequently (outlined in my
first e-mail). Unfortunately, I do not have any data to supply
previous to the upgrade as we have just recently changed our logging
to be more verbose.
I'm interested in what Dylan can tell about the activity over this
table? What does pg_stat_user_tables say about it?
Here's the pg_stat_user_tables entry for this table:
-[ RECORD 1 ]-+----------
relid | #####
schemaname | public
relname | tablename
seq_scan | 345
seq_tup_read | 46699390
idx_scan | 33731562
idx_tup_fetch | 171166937
n_tup_ins | 3359
n_tup_upd | 8205786
n_tup_del | 0
As you can see, the number of deleted tuples is 0, so I don't believe
that it could be an issue with DELETEs on the table because there
aren't any!
It has to be something that occurs when the VACUUM threshold is hit,
because hitting it resets both the VACUUM and ANALYZE counters, and it
decreases the ANALYZE counter to a negative value.
Yesterday we tried running the ANALYZE command and it did reset the
counter to a non-negative value. However, the next time that the
VACUUM threshold was reached, the ANALYZE counter was decreased to a
negative number.
Thanks for looking into this, please let me know if there's any
information I can provide!
--
Dylan Hansen
Enterprise Systems Developer
Tom Lane wrote:
Dylan Hansen <dhansen@pixpo.com> writes:
I have been spending some time looking into how auto-vacuum is
performing on one of our servers. After putting the PostgreSQL logs
in debug I noticed that the threshold for ANALYZE was never being hit
for a particular table because the calculated value becomes
increasingly negative.
Hmm, it shouldn't ever be negative at all, I would think. The
calculation in question is
anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
tabentry->last_anl_tuples;
Apparently somehow last_anl_tuples has managed to get to be bigger than
n_live_tuples, which maybe could happen after a delete. Should we be
clamping last_anl_tuples to not exceed n_live_tuples somewhere?
Alvaro and Matthew, what do you think?
I think I had something in the contrib version that checked this. I
always assumed it would be caused by a stats reset which was more common
in earlier PGSQL releases since stats_reset_on_startup (or whatever the
correct spelling of that is) was enabled by default.
So can I assume that this is a bug?
The only resolution I can see right now is to set up a cron job that
will perform an ANALYZE periodically, as the pg_autovacuum ANALYZE
threshold is never reached.
Any other suggestions? Thanks for the input!
--
Dylan Hansen
Enterprise Systems Developer
Dylan Hansen wrote:
So can I assume that this is a bug?
Definitely a bug.
The only resolution I can see right now is to set up a cron job that
will perform an ANALYZE periodically, as the pg_autovacuum ANALYZE
threshold is never reached.
Any other suggestions? Thanks for the input!
I just committed a fix, so the other alternative is get a CVS checkout
from the 8.1 branch and put it up to see if it fixes your problem. The
relevant patch is below.
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.111.2.3
diff -c -p -r1.111.2.3 pgstat.c
*** src/backend/postmaster/pgstat.c 19 May 2006 15:15:38 -0000 1.111.2.3
--- src/backend/postmaster/pgstat.c 27 Jun 2006 03:36:03 -0000
*************** pgstat_recv_vacuum(PgStat_MsgVacuum *msg
*** 2919,2924 ****
--- 2919,2930 ----
tabentry->n_dead_tuples = 0;
if (msg->m_analyze)
tabentry->last_anl_tuples = msg->m_tuples;
+ else
+ {
+ /* last_anl_tuples must never exceed n_live_tuples */
+ tabentry->last_anl_tuples = Min(tabentry->last_anl_tuples,
+ msg->m_tuples);
+ }
}
/* ----------
*************** pgstat_recv_tabstat(PgStat_MsgTabstat *m
*** 3055,3061 ****
tabentry->tuples_updated += tabmsg[i].t_tuples_updated;
tabentry->tuples_deleted += tabmsg[i].t_tuples_deleted;
! tabentry->n_live_tuples += tabmsg[i].t_tuples_inserted;
tabentry->n_dead_tuples += tabmsg[i].t_tuples_updated +
tabmsg[i].t_tuples_deleted;
--- 3061,3068 ----
tabentry->tuples_updated += tabmsg[i].t_tuples_updated;
tabentry->tuples_deleted += tabmsg[i].t_tuples_deleted;
! tabentry->n_live_tuples += tabmsg[i].t_tuples_inserted -
! tabmsg[i].t_tuples_deleted;
tabentry->n_dead_tuples += tabmsg[i].t_tuples_updated +
tabmsg[i].t_tuples_deleted;
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
+ /* last_anl_tuples must never exceed n_live_tuples */
If we actually believe the above statement, it seems like your patch
to pgstat_recv_tabstat() opens a new issue: with that patch, it is
possible for pgstat_recv_tabstat() to decrease n_live_tuples, and
therefore a clamp needs to be applied in pgstat_recv_tabstat() too.
No?
The reason I didn't patch it myself is that I'm not quite clear on what
*should* be happening here. What effect should a large delete have on
the ANALYZE threshold, exactly? You could argue that a deletion
potentially changes the statistics (by omission), and therefore inserts,
updates, and deletes should equally count +1 towards the analyze
threshold. I don't think we are implementing that though. If we want
to do it that way, I suspect last_anl_tuples as currently defined is not
the right comparison point.
regards, tom lane
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
+ /* last_anl_tuples must never exceed n_live_tuples */
If we actually believe the above statement, it seems like your patch
to pgstat_recv_tabstat() opens a new issue: with that patch, it is
possible for pgstat_recv_tabstat() to decrease n_live_tuples, and
therefore a clamp needs to be applied in pgstat_recv_tabstat() too.
No?
Hmm, yeah.
The reason I didn't patch it myself is that I'm not quite clear on what
*should* be happening here. What effect should a large delete have on
the ANALYZE threshold, exactly? You could argue that a deletion
potentially changes the statistics (by omission), and therefore inserts,
updates, and deletes should equally count +1 towards the analyze
threshold. I don't think we are implementing that though. If we want
to do it that way, I suspect last_anl_tuples as currently defined is not
the right comparison point.
Maybe what we should do is revert the pgstat_recv_tabstat() part of the
patch in 8.1, and consider redefining last_anl_tuples in HEAD. Caffeine
is not high enough yet to propose anything sensible, but I'll think
about it a bit later.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hi Tom, Alvaro,
Thanks for your work on this. Please keep me posted as to which
version in CVS this fix will be applied to and I will do my best to
test it.
Thanks again!
--
Dylan Hansen
Enterprise Systems Developer
Tom Lane wrote:
The reason I didn't patch it myself is that I'm not quite clear on what
*should* be happening here. What effect should a large delete have on
the ANALYZE threshold, exactly? You could argue that a deletion
potentially changes the statistics (by omission), and therefore inserts,
updates, and deletes should equally count +1 towards the analyze
threshold. I don't think we are implementing that though. If we want
to do it that way, I suspect last_anl_tuples as currently defined is not
the right comparison point.
Just as a point of reference, the old contrib pg_autovacuum counts ins +
upd + del against the analyze threshold, whereas the vacuum threshold
only compares against upd + del.