allowing VACUUM to be cancelled for conflicting locks

Started by Abhijit Menon-Senover 11 years ago12 messages
#1Abhijit Menon-Sen
ams@2ndquadrant.com

Hi.

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

One way to do this would be to add an "allow_vacuum_cancel" GUC that, if
set, would cause VACUUM to set the (suitably renamed) PROC_IS_AUTOVACUUM
flag on its own process. Another would be to add an extra option to the
VACUUM command that enables this behaviour.

The former has the advantage of being backwards-compatible with existing
scripts that run manual VACUUM. The latter is arguably nicer.

Any thoughts?

-- Abhijit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Abhijit Menon-Sen (#1)
Re: allowing VACUUM to be cancelled for conflicting locks

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: allowing VACUUM to be cancelled for conflicting locks

Tom Lane wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

As much as I'm a fan of autovacuum, that's not always possible.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Claudio Freire
klaussfreire@gmail.com
In reply to: Alvaro Herrera (#3)
Re: allowing VACUUM to be cancelled for conflicting locks

On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Tom Lane wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

As much as I'm a fan of autovacuum, that's not always possible.

Or even recommended, unless the docs changed radically in the last
couple of weeks.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: allowing VACUUM to be cancelled for conflicting locks

On 2014-04-28 09:54:49 -0400, Tom Lane wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

E.g. manually scheduling the full table vacuums to happen during low
activity periods is a very good idea on busy servers.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Claudio Freire (#4)
Re: allowing VACUUM to be cancelled for conflicting locks

Claudio Freire <klaussfreire@gmail.com> writes:

On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Tom Lane wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

As much as I'm a fan of autovacuum, that's not always possible.

Or even recommended, unless the docs changed radically in the last
couple of weeks.

Actually, having just looked at the code in question, I think this whole
thread is based on an obsolete assumption. AFAICS, since commit b19e4250b
manual vacuum behaves exactly like autovacuum as far as getting kicked off
the exclusive lock is concerned. There's certainly not any tests for
autovacuum in lazy_truncate_heap() today.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#2)
Re: allowing VACUUM to be cancelled for conflicting locks

On Mon, Apr 28, 2014 at 6:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

Autovac is also going to promote itself to uninterruptible once every 150e6
transactions (by the default settings).

To stop using manual vacuums is not going to be a complete cure anyway.

It would be nice to know why the scripts are doing the manual vacuum. Just
out of mythology, or is there an identifiable reason?

Cheers,

Jeff

#8Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: allowing VACUUM to be cancelled for conflicting locks

On 2014-04-28 13:58:10 -0400, Tom Lane wrote:

Claudio Freire <klaussfreire@gmail.com> writes:

On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Tom Lane wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

As much as I'm a fan of autovacuum, that's not always possible.

Or even recommended, unless the docs changed radically in the last
couple of weeks.

Actually, having just looked at the code in question, I think this whole
thread is based on an obsolete assumption. AFAICS, since commit b19e4250b
manual vacuum behaves exactly like autovacuum as far as getting kicked off
the exclusive lock is concerned. There's certainly not any tests for
autovacuum in lazy_truncate_heap() today.

I don't think this is about the truncation thing, but about the
deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
autovacuum is cancelled if user code tries to acquire a conflicting
lock.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#6)
Re: allowing VACUUM to be cancelled for conflicting locks

On Mon, Apr 28, 2014 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Claudio Freire <klaussfreire@gmail.com> writes:

On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Tom Lane wrote:

Abhijit Menon-Sen <ams@2ndquadrant.com> writes:

In the past, we've had situations where "everything is hung" turned

out

to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".

As much as I'm a fan of autovacuum, that's not always possible.

Or even recommended, unless the docs changed radically in the last
couple of weeks.

Actually, having just looked at the code in question, I think this whole
thread is based on an obsolete assumption. AFAICS, since commit b19e4250b
manual vacuum behaves exactly like autovacuum as far as getting kicked off
the exclusive lock is concerned. There's certainly not any tests for
autovacuum in lazy_truncate_heap() today.

I assumed he was a talking about the SHARE UPDATE EXCLUSIVE used during the
main work, not the ACCESS EXCLUSIVE one used during truncation.

Cheers,

Jeff

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)
Re: allowing VACUUM to be cancelled for conflicting locks

Andres Freund <andres@2ndquadrant.com> writes:

I don't think this is about the truncation thing, but about the
deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
autovacuum is cancelled if user code tries to acquire a conflicting
lock.

It's a bit of a stretch to claim that a manual VACUUM should be cancelled
by a manual DDL action elsewhere. Who's to say which of those things
should have priority?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#10)
Re: allowing VACUUM to be cancelled for conflicting locks

On 2014-04-28 14:05:04 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

I don't think this is about the truncation thing, but about the
deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
autovacuum is cancelled if user code tries to acquire a conflicting
lock.

It's a bit of a stretch to claim that a manual VACUUM should be cancelled
by a manual DDL action elsewhere. Who's to say which of those things
should have priority?

Yea, I am not that sure about the feature either. It sure would need to
be optional. Often enough VACUUMs are scripted to run during off hours,
for those it possibly makes sense.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#10)
Re: allowing VACUUM to be cancelled for conflicting locks

On Mon, Apr 28, 2014 at 11:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@2ndquadrant.com> writes:

I don't think this is about the truncation thing, but about the
deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
autovacuum is cancelled if user code tries to acquire a conflicting
lock.

It's a bit of a stretch to claim that a manual VACUUM should be cancelled
by a manual DDL action elsewhere. Who's to say which of those things
should have priority?

The proposal was to add either a GUC, or a syntax to the vacuum command, so
it would be either DBA or the invoker of the vacuum which is the one to
say. Either one does seem a reasonable place to have such a say, although
perhaps not worth the effort to implement.

Cheers,

Jeff