Minimum tuple threshold to decide last pass of VACUUM

Started by Michael Paquierover 10 years ago9 messages
#1Michael Paquier
michael.paquier@gmail.com

Hi all,

Commit 4046e58c (dated of 2001) has introduced the following comment
in vacuumlazy.c:
+       /* If any tuples need to be deleted, perform final vacuum cycle */
+       /* XXX put a threshold on min nuber of tuples here? */
+       if (vacrelstats->num_dead_tuples > 0)
In short, we may want to have a reloption to decide if we do or not
the last pass of VACUUM or not depending on a given number of
remaining tuples. Is this still something we would like to have?

Regards,
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Simon Riggs
simon@2ndQuadrant.com
In reply to: Michael Paquier (#1)
Re: Minimum tuple threshold to decide last pass of VACUUM

On 2 August 2015 at 13:13, Michael Paquier <michael.paquier@gmail.com>
wrote:

Hi all,

Commit 4046e58c (dated of 2001) has introduced the following comment
in vacuumlazy.c:
+       /* If any tuples need to be deleted, perform final vacuum cycle */
+       /* XXX put a threshold on min nuber of tuples here? */
+       if (vacrelstats->num_dead_tuples > 0)
In short, we may want to have a reloption to decide if we do or not
the last pass of VACUUM or not depending on a given number of
remaining tuples. Is this still something we would like to have?

I don't think we want a new user parameter, but we should have an internal
limit with a heuristic, similar to how we decide whether to truncate.

I would suggest this internal logic...

* If its a VACUUM FREEZE then index_scan_threshold = 0, i.e. always scan if
needed, since the user is requesting maximum vacuum

* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all,
since they aren't critical path activities at that point

* For normal VACUUMs we should scan indexes only if (num_dead_tuples * 20)

(blocks to be scanned in any one index), which allows some index bloat

but not much

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#2)
Re: Minimum tuple threshold to decide last pass of VACUUM

Simon Riggs wrote:

* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all,
since they aren't critical path activities at that point

It is not possible to skip scanning indexes completely, unless no tuples
are to be removed from the heap. Otherwise, index tuples become
lingering pointers (and when such heap address are later refilled, they
become corrupted indexscans).

But actually this is an interesting point and I don't think we do this:
if in emergency mode, maybe we shouldn't try to remove any dead tuples
at all, and instead only freeze very old tuples. That would make such
vacuums go much quicker. (More accurately, if the updating xid is older
than the freeze point, then remove the tuple, but otherwise keep it.)

My point is that emergency vacuums are troublesome for various reasons
and it would be better if they did only the minimum possible.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: Minimum tuple threshold to decide last pass of VACUUM

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Simon Riggs wrote:

* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all,
since they aren't critical path activities at that point

It is not possible to skip scanning indexes completely, unless no tuples
are to be removed from the heap.

Right.

But actually this is an interesting point and I don't think we do this:
if in emergency mode, maybe we shouldn't try to remove any dead tuples
at all, and instead only freeze very old tuples.

+1 ... not sure if that's what Simon had in mind exactly, but it seems
like a correct statement of what he was getting at.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#4)
Re: Minimum tuple threshold to decide last pass of VACUUM

On 3 August 2015 at 17:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Simon Riggs wrote:

* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at

all,

since they aren't critical path activities at that point

It is not possible to skip scanning indexes completely, unless no tuples
are to be removed from the heap.

Right.

But actually this is an interesting point and I don't think we do this:
if in emergency mode, maybe we shouldn't try to remove any dead tuples
at all, and instead only freeze very old tuples.

+1 ... not sure if that's what Simon had in mind exactly, but it seems
like a correct statement of what he was getting at.

Yes, that's what I was thinking, I just didn't say actually it. I'd been
thinking about having VACUUM do just Phase 1 for some time, since its so
much faster to do that. Will code.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#3)
Re: Minimum tuple threshold to decide last pass of VACUUM

On Mon, Aug 3, 2015 at 11:52 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Simon Riggs wrote:

* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at all,
since they aren't critical path activities at that point

It is not possible to skip scanning indexes completely, unless no tuples
are to be removed from the heap. Otherwise, index tuples become
lingering pointers (and when such heap address are later refilled, they
become corrupted indexscans).

Well, if we skip the index scans, we can't do the second heap pass
either, but that's OK. I think we're all talking about the same thing
here, which is to do only the first heap pass in some cases. That
will prune dead tuples to line pointers, freeze old XIDs, and mark
pages all-visible where appropriate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#2)
Re: Minimum tuple threshold to decide last pass of VACUUM

On Mon, Aug 3, 2015 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

* For normal VACUUMs we should scan indexes only if (num_dead_tuples * 20) >
(blocks to be scanned in any one index), which allows some index bloat but
not much

I think this kind of heuristic is good, but I think we should expose a
setting for it. There's no way for us to know without testing whether
the right value for that multiplier is 2 or 20 or 200 or 2000, and if
we don't make it easy to tweak, we'll never find out. It may even be
workload-dependent.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#5)
Re: Minimum tuple threshold to decide last pass of VACUUM

On 8/3/15 12:04 PM, Simon Riggs wrote:

Yes, that's what I was thinking, I just didn't say actually it. I'd been
thinking about having VACUUM do just Phase 1 for some time, since its so
much faster to do that. Will code.

I'd like to see that exposed as an option as well. There are certain
situations where you'd really like to just freeze things as fast as
possible, without waiting for a full vacuum.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Michael Paquier
michael.paquier@gmail.com
In reply to: Simon Riggs (#5)
Re: Minimum tuple threshold to decide last pass of VACUUM

On Tue, Aug 4, 2015 at 2:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 3 August 2015 at 17:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Simon Riggs wrote:

* For emergency anti-wraparound VACUUMs we shouldn't scan indexes at
all,
since they aren't critical path activities at that point

It is not possible to skip scanning indexes completely, unless no tuples
are to be removed from the heap.

Right.

But actually this is an interesting point and I don't think we do this:
if in emergency mode, maybe we shouldn't try to remove any dead tuples
at all, and instead only freeze very old tuples.

+1 ... not sure if that's what Simon had in mind exactly, but it seems
like a correct statement of what he was getting at.

Yes, that's what I was thinking, I just didn't say actually it. I'd been
thinking about having VACUUM do just Phase 1 for some time, since its so
much faster to do that. Will code.

Interesting. I'll be happy to have a look at any patch produced,
that's surely something we want to improve in emergency mode.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers