vac truncation scan problems
After freeing up the rows at the end of the table so it is eligible for
truncation, then running a manual VACUUM to actually release the space, I
kept running into the problem that the truncation scan was consistently
suspended and then aborted due to a conflicting lock requested/held.
But the perversity is that that conflicting lock request can only be
coming, as far as I can tell, from the autovac process. I'm not sure how
this happens, as I thought autovac never waited for locks but only obtained
one if it were instantaneously available, but that it is the only
explanation I can think of.
I'm not seeing this in 9.4, but I'm not sure how deterministic it is so
maybe that is just luck.
On an otherwise idle system:
pgbench -i -s 1000 -n
alter table pgbench_accounts drop CONSTRAINT pgbench_accounts_pkey ;
delete from pgbench_accounts;
vacuum verbose pgbench_accounts;
As soon as truncation scan starts, it suspends and then quickly terminates.
Anyone have theories on what's going on?
Cheers,
Jeff
On Mon, Mar 30, 2015 at 8:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
After freeing up the rows at the end of the table so it is eligible for
truncation, then running a manual VACUUM to actually release the space, I
kept running into the problem that the truncation scan was consistently
suspended and then aborted due to a conflicting lock requested/held.But the perversity is that that conflicting lock request can only be
coming, as far as I can tell, from the autovac process. I'm not sure how
this happens, as I thought autovac never waited for locks but only obtained
one if it were instantaneously available, but that it is the only
explanation I can think of.I'm not seeing this in 9.4, but I'm not sure how deterministic it is so
maybe that is just luck.
It looks like the culprit is this:
commit 0d831389749a3baaced7b984205b9894a82444b9
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Wed Mar 18 11:52:33 2015 -0300
Rationalize vacuuming options and parameters
I'd guess the autovac nature of the autovac process is getting lost in
there where, but I don't see where.
Cheers,
Jeff
Hi, this is a bug in the commit 0d831389749a3baaced7b984205b9894a82444b9 .
It allows vucuum freeze to be skipped and inversely lets regular
vacuum wait for lock. The attched patch fixes it.
In table_recheck_autovac, vacuum options are determined as following,
tab->at_vacoptions = VACOPT_SKIPTOAST |
(dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |
!> (wraparound ? VACOPT_NOWAIT : 0);
The line prefixed by '!' looks inverted.
At Mon, 30 Mar 2015 23:42:51 -0700, Jeff Janes <jeff.janes@gmail.com> wrote in <CAMkU=1xmTEiaY=5oMHsSQo5vd9V1Ze4kNLL0qN2eH0P_GXOaYw@mail.gmail.com>
jeff.janes> On Mon, Mar 30, 2015 at 8:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
But the perversity is that that conflicting lock request can only be
coming, as far as I can tell, from the autovac process. I'm not sure how
this happens, as I thought autovac never waited for locks but only obtained
one if it were instantaneously available, but that it is the only
explanation I can think of.I'm not seeing this in 9.4, but I'm not sure how deterministic it is so
maybe that is just luck.
Good catch!
It looks like the culprit is this:
commit 0d831389749a3baaced7b984205b9894a82444b9
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Wed Mar 18 11:52:33 2015 -0300Rationalize vacuuming options and parameters
I'd guess the autovac nature of the autovac process is getting lost in
there where, but I don't see where.
You're right, the patch does something like following,
+ tab->at_vacoptions = VACOPT_SKIPTOAST |
+ (dovacuum ? VACOPT_VACUUM : 0) |
+ (doanalyze ? VACOPT_ANALYZE : 0) |
+ (wraparound ? VACOPT_NOWAIT : 0);
..
- if (!tab->at_wraparound)
- vacstmt.options = VACOPT_NOWAIT;
This inverses the meaning of at_wraparound. No previous version
isn't affected by this.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Fix-polarity-of-wraparound-for-vacuum.patchtext/x-patch; charset=us-asciiDownload
>From 603651a9dfaa22b8a6d4983226257fcf570963a3 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 31 Mar 2015 16:58:53 +0900
Subject: [PATCH] Fix polarity of wraparound for vacuum.
The commit 0d831389749a3baaced7b984205b9894a82444b9 has inversed the
meaning of wraparound. This causes vacuums that *no* wraparound
required to wait for locks to be acquired. What is worse, this allows
the vacuums for freezing to be skipped.
---
src/backend/postmaster/autovacuum.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c93b412..080c3e6 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2526,7 +2526,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_vacoptions = VACOPT_SKIPTOAST |
(dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |
- (wraparound ? VACOPT_NOWAIT : 0);
+ (!wraparound ? VACOPT_NOWAIT : 0);
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
--
2.1.0.GIT
On Tue, Mar 31, 2015 at 3:42 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Mar 30, 2015 at 8:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
After freeing up the rows at the end of the table so it is eligible for
truncation, then running a manual VACUUM to actually release the space, I
kept running into the problem that the truncation scan was consistently
suspended and then aborted due to a conflicting lock requested/held.But the perversity is that that conflicting lock request can only be
coming, as far as I can tell, from the autovac process. I'm not sure how
this happens, as I thought autovac never waited for locks but only obtained
one if it were instantaneously available, but that it is the only
explanation I can think of.I'm not seeing this in 9.4, but I'm not sure how deterministic it is so
maybe that is just luck.It looks like the culprit is this:
commit 0d831389749a3baaced7b984205b9894a82444b9
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Wed Mar 18 11:52:33 2015 -0300Rationalize vacuuming options and parameters
I'd guess the autovac nature of the autovac process is getting lost in
there where, but I don't see where.
Hm. I ran a couple of tests and am noticing that a manual VACUUM is not
able to truncate all the pages (it should, no?)... For example with your
test case on REL9_4_STABLE VACUUM VERBOSE reports that all the pages are
truncated:
INFO: 00000: "pgbench_accounts": truncated 16394 to 0 pages
OK, on HEAD this does not seem to work:
INFO: 00000: "pgbench_accounts": truncated 16394 to 13554 pages
But if I try as well with 4559167c (0d831389~1) I am getting a similar
result:
INFO: 00000: "pgbench_accounts": truncated 16394 to 3309 pages
I will try to bisect to the origin of that. This may be related to what you
are seeing.
Regards,
--
Michael
On Tue, Mar 31, 2015 at 1:28 AM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hi, this is a bug in the commit 0d831389749a3baaced7b984205b9894a82444b9 .
It allows vucuum freeze to be skipped and inversely lets regular
vacuum wait for lock. The attched patch fixes it.In table_recheck_autovac, vacuum options are determined as following,
tab->at_vacoptions = VACOPT_SKIPTOAST |
(dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |!> (wraparound ? VACOPT_NOWAIT : 0);
The line prefixed by '!' looks inverted.
Thanks, it is obvious once you see it!
Your patch solved the problem for me.
Cheers,
Jeff
On Tue, Mar 31, 2015 at 1:29 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Tue, Mar 31, 2015 at 3:42 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Mar 30, 2015 at 8:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
After freeing up the rows at the end of the table so it is eligible for
truncation, then running a manual VACUUM to actually release the space, I
kept running into the problem that the truncation scan was consistently
suspended and then aborted due to a conflicting lock requested/held.But the perversity is that that conflicting lock request can only be
coming, as far as I can tell, from the autovac process. I'm not sure how
this happens, as I thought autovac never waited for locks but only obtained
one if it were instantaneously available, but that it is the only
explanation I can think of.I'm not seeing this in 9.4, but I'm not sure how deterministic it is so
maybe that is just luck.It looks like the culprit is this:
commit 0d831389749a3baaced7b984205b9894a82444b9
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Wed Mar 18 11:52:33 2015 -0300Rationalize vacuuming options and parameters
I'd guess the autovac nature of the autovac process is getting lost in
there where, but I don't see where.Hm. I ran a couple of tests and am noticing that a manual VACUUM is not
able to truncate all the pages (it should, no?)... For example with your
test case on REL9_4_STABLE VACUUM VERBOSE reports that all the pages are
truncated:
INFO: 00000: "pgbench_accounts": truncated 16394 to 0 pagesOK, on HEAD this does not seem to work:
INFO: 00000: "pgbench_accounts": truncated 16394 to 13554 pagesBut if I try as well with 4559167c (0d831389~1) I am getting a similar
result:
INFO: 00000: "pgbench_accounts": truncated 16394 to 3309 pages
I was not seeing that on commits before 0d831389--although often once the
truncation scan had run for a while without problem, I would abort the
session and call it good, as my failures were always quick ones.
Did it tell you why? If it surrendered the lock to a competing process, it
should report that as previous INFO messages. If it doesn't give one of
those, then it probably just thinks there are some tuples it can't remove
yet somewhere. What did it give earlier up in the verbose output, for the
number of removed and nonremovable tuples?
Cheers,
Jeff
On Wed, Apr 1, 2015 at 2:18 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Mar 31, 2015 at 1:28 AM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:Hi, this is a bug in the commit 0d831389749a3baaced7b984205b9894a82444b9 .
It allows vucuum freeze to be skipped and inversely lets regular
vacuum wait for lock. The attched patch fixes it.In table_recheck_autovac, vacuum options are determined as following,
tab->at_vacoptions = VACOPT_SKIPTOAST |
(dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |!> (wraparound ? VACOPT_NOWAIT : 0);
The line prefixed by '!' looks inverted.
Thanks, it is obvious once you see it!
Nice catch, Horiguchi-san.
--
Michael
On Wed, Apr 1, 2015 at 2:26 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
Did it tell you why? If it surrendered the lock to a competing process,
it should report that as previous INFO messages. If it doesn't give one of
those, then it probably just thinks there are some tuples it can't remove
yet somewhere. What did it give earlier up in the verbose output, for the
number of removed and nonremovable tuples?
I just had an extra look at that, and I just got trapped a0f5954a that
bumped max_wal_size from 128GB to 1GB.. Sorry for the noise.
--
Michael
By the way, what shoud we do about this?
- Waiting for someone's picking up this.
- Making another thread to attract notice
- Otherwise..
At Wed, 1 Apr 2015 10:49:55 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqTMBd6=5i3ZOg9t6A0km4VZ=wNt4_rwOzPVm683-aQ=Qw@mail.gmail.com>
On Wed, Apr 1, 2015 at 2:18 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Mar 31, 2015 at 1:28 AM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:Hi, this is a bug in the commit 0d831389749a3baaced7b984205b9894a82444b9 .
It allows vucuum freeze to be skipped and inversely lets regular
vacuum wait for lock. The attched patch fixes it.In table_recheck_autovac, vacuum options are determined as following,
tab->at_vacoptions = VACOPT_SKIPTOAST |
(dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |!> (wraparound ? VACOPT_NOWAIT : 0);
The line prefixed by '!' looks inverted.
Thanks, it is obvious once you see it!
Nice catch, Horiguchi-san.
Thank you:)
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Apr 1, 2015 at 4:35 PM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
By the way, what should we do about this?
- Waiting for someone's picking up this.
- Making another thread to attract notice
- Otherwise..
I am sure someone will show up quickly and push the fix you provided.
--
Michael
Hi,
At Wed, 1 Apr 2015 16:50:41 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqTxVdpju+a5rk3p2VGe_gHaVk+ht97_HUgWfG9ULyho9w@mail.gmail.com>
I am sure someone will show up quickly and push the fix you provided.
Ok, I'll be a good boy.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Kyotaro HORIGUCHI wrote:
Hi, this is a bug in the commit 0d831389749a3baaced7b984205b9894a82444b9 .
It allows vucuum freeze to be skipped and inversely lets regular
vacuum wait for lock. The attched patch fixes it.In table_recheck_autovac, vacuum options are determined as following,
tab->at_vacoptions = VACOPT_SKIPTOAST |
(dovacuum ? VACOPT_VACUUM : 0) |
(doanalyze ? VACOPT_ANALYZE : 0) |!> (wraparound ? VACOPT_NOWAIT : 0);
The line prefixed by '!' looks inverted.
You're absolutely right. My mistake. Pushed your patch, thanks.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Kyotaro HORIGUCHI wrote:
The line prefixed by '!' looks inverted.
You're absolutely right. My mistake. Pushed your patch, thanks.
Don't see any such commit from here?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers