PATCH to allow concurrent VACUUMs to not lock each other out from cleaning old tuples
The attached patch allows VACUUMS's on small relations to clean up dead
tuples while VACUUM or ANALYSE is running for a long time on some big
table.
This is done by adding a "bool inVacuum" to PGPROC and then making use
of it in GetOldestXmin.
This patch is against current CVS head, but should also apply to 8.0.2
with minorpach warnings.
--
Hannu Krosing <hannu@tm.ee>
Attachments:
vacuum-patch.difftext/x-patch; charset=ISO-8859-15; name=vacuum-patch.diffDownload+39-12
On K, 2005-05-18 at 11:54 +0300, Hannu Krosing wrote:
The attached patch allows VACUUMS's on small relations to clean up dead
tuples while VACUUM or ANALYSE is running for a long time on some big
table.This is done by adding a "bool inVacuum" to PGPROC and then making use
of it in GetOldestXmin.This patch is against current CVS head, but should also apply to 8.0.2
with minorpach warnings.
Could this patch be applied (or rejected if something is badly wrong
with it) ?
Or should I move the discussion back to pgsql-hackers ad try to make it
a TODO first ?
This patch implements what I described in
http://archives.postgresql.org/pgsql-hackers/2005-05/msg00704.php
plus a small change to make it work for simple ANALYSE too.
--
Hannu Krosing <hannu@skype.net>
Hannu Krosing wrote:
On K, 2005-05-18 at 11:54 +0300, Hannu Krosing wrote:
The attached patch allows VACUUMS's on small relations to clean up dead
tuples while VACUUM or ANALYSE is running for a long time on some big
table.This is done by adding a "bool inVacuum" to PGPROC and then making use
of it in GetOldestXmin.This patch is against current CVS head, but should also apply to 8.0.2
with minorpach warnings.Could this patch be applied (or rejected if something is badly wrong
with it) ?Or should I move the discussion back to pgsql-hackers ad try to make it
a TODO first ?This patch implements what I described in
http://archives.postgresql.org/pgsql-hackers/2005-05/msg00704.php
plus a small change to make it work for simple ANALYSE too.
I have it in my mailbox and will get to it. Thanks.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Hannu Krosing wrote:
*** src/backend/access/transam/xact.c 28 Apr 2005 21:47:10 -0000 1.200 --- src/backend/access/transam/xact.c 17 May 2005 22:06:34 -0000 *************** *** 1411,1416 **** --- 1411,1424 ---- AfterTriggerBeginXact();/* + * mark the transaction as not VACUUM (vacuum_rel will set isVacuum to true + * directly after calling BeginTransactionCommand() ) + */ + if (MyProc != NULL) + { + MyProc->inVacuum = false; + }
I'm a little worried about having this set to "true" after a VACUUM is
executed, and only reset to false when the next transaction is begun: it
shouldn't affect correctness right now, but it seems like asking for
trouble. Resetting the flag to "false" after processing a transaction
would probably be worth doing.
*** src/backend/commands/vacuum.c 6 May 2005 17:24:53 -0000 1.308 --- src/backend/commands/vacuum.c 17 May 2005 22:06:35 -0000 *************** *** 420,425 **** --- 418,428 ---- if (use_own_xacts) { StartTransactionCommand(); + if (MyProc != NULL) /* is this needed here ? */ + { + /* so other vacuums don't look at our xid/xmin in GetOldestXmin() */ + MyProc->inVacuum = true; + } /* functions in indexes may want a snapshot set */ ActiveSnapshot = CopySnapshot(GetTransactionSnapshot()); }
Is it valid to apply this optimization to ANALYZE? Since ANALYZE updates
pg_statistic, ISTM it can affect tuple visibility.
*** src/backend/storage/ipc/sinval.c 31 Dec 2004 22:00:56 -0000 1.75 --- src/backend/storage/ipc/sinval.c 17 May 2005 22:06:36 -0000 *************** *** 845,854 **** * them as running anyway. We also assume that such xacts * can't compute an xmin older than ours, so they needn't be * considered in computing globalxmin. */ if (proc == MyProc || !TransactionIdIsNormal(xid) || ! TransactionIdFollowsOrEquals(xid, xmax)) continue;if (TransactionIdPrecedes(xid, xmin)) --- 845,858 ---- * them as running anyway. We also assume that such xacts * can't compute an xmin older than ours, so they needn't be * considered in computing globalxmin. + * + * there is also no need to consider transaxtions runnibg the + * vacuum command as it will not affect tuple visibility */
Typos.
-Neil
Neil Conway <neilc@samurai.com> writes:
I'm a little worried about having this set to "true" after a VACUUM is
executed, and only reset to false when the next transaction is begun: it
shouldn't affect correctness right now, but it seems like asking for
trouble. Resetting the flag to "false" after processing a transaction
would probably be worth doing.
These days I'd be inclined to use a PG_TRY construct to guarantee the
flag is cleared, rather than loading another cleanup operation onto
unrelated code.
The MyProc != NULL tests are a waste of code space. You can't even
acquire an LWLock without MyProc being set, let alone access tables.
The real issue here though is whether anyone can blow a hole in the
xmin assumptions: is there any situation where ignoring a vacuum
transaction breaks things? I haven't had time to think about it
in any detail, but it definitely needs to be thought about.
regards, tom lane
On E, 2005-05-23 at 10:16 -0400, Tom Lane wrote:
Neil Conway <neilc@samurai.com> writes:
I'm a little worried about having this set to "true" after a VACUUM is
executed, and only reset to false when the next transaction is begun: it
shouldn't affect correctness right now, but it seems like asking for
trouble. Resetting the flag to "false" after processing a transaction
would probably be worth doing.These days I'd be inclined to use a PG_TRY construct to guarantee the
flag is cleared, rather than loading another cleanup operation onto
unrelated code.
Ok, will check out PG_TRY. I hoped that there is some way not to set
inVacuum to false at each transaction start and still be sure that it
will be reverted after vacuum_rel.
So I'll set it once at the start of connection and then maintain it in
vacuum_rel() using PG_TRY.
The MyProc != NULL tests are a waste of code space. You can't even
acquire an LWLock without MyProc being set, let alone access tables.
Thanks, I'll get rid of them.
The real issue here though is whether anyone can blow a hole in the
xmin assumptions: is there any situation where ignoring a vacuum
transaction breaks things? I haven't had time to think about it
in any detail, but it definitely needs to be thought about.
There may be need to exclude vacuum/analyse on system relations from
being ignored by vacuum_rel() as I suspect that the info they both write
to pg_class, pg_attribute, and possibly other tables may be vulnerable
to crashes at right moment.
Also it may be prudent to not exclude other vacuums, when the vacuum_rel
() itself is run on a system relation.
I'm not sure which way it is, as my head gets all thick each time I try
to figure it out ;p.
I can't think of any other cases where it could matter, as at least the
work done inside vacuum_rel() itself seema non-rollbackable.
--
Hannu Krosing <hannu@skype.net>
Hannu Krosing <hannu@skype.net> writes:
I can't think of any other cases where it could matter, as at least the
work done inside vacuum_rel() itself seema non-rollbackable.
VACUUM FULL's tuple-moving is definitely roll-back-able, so it might be
prudent to only do this for lazy VACUUM. But on the other hand, VACUUM
FULL holds an exclusive lock on the table so no one else is going to see
its effects concurrently anyway.
As I said, it needs more thought than I've been able to spare for it yet
...
regards, tom lane
On E, 2005-05-23 at 11:42 -0400, Tom Lane wrote:
Hannu Krosing <hannu@skype.net> writes:
I can't think of any other cases where it could matter, as at least the
work done inside vacuum_rel() itself seema non-rollbackable.VACUUM FULL's tuple-moving is definitely roll-back-able, so it might be
prudent to only do this for lazy VACUUM. But on the other hand, VACUUM
FULL holds an exclusive lock on the table so no one else is going to see
its effects concurrently anyway.
I'm not interested in VACUUM FULL at all. This is improvement mainly for
heavy update OLAP databases, where I would not even think of running
VACUUM FULL.
I'll cheks if there's an easy way to exclude VACUUM FULL.
As I said, it needs more thought than I've been able to spare for it yet
...
Ok, thanks for comments this far .
--
Hannu Krosing <hannu@skype.net>
On E, 2005-05-23 at 11:42 -0400, Tom Lane wrote:
Hannu Krosing <hannu@skype.net> writes:
I can't think of any other cases where it could matter, as at least the
work done inside vacuum_rel() itself seema non-rollbackable.VACUUM FULL's tuple-moving is definitely roll-back-able, so it might be
prudent to only do this for lazy VACUUM. But on the other hand, VACUUM
FULL holds an exclusive lock on the table so no one else is going to see
its effects concurrently anyway.
Ok, this is a new version of the vacuum patch with the following changes
following some suggestions in this thread.
* changed the patch to affect only lazy vacuum
* moved inVacuum handling to use PG_TRY
* moved vac_update_relstats() out of lazy_vacuum_rel into a separate
transaction. The code to do this may not be the prettiest, maybe it
should use a separate struct.
--
Hannu Krosing <hannu@skype.net>
Attachments:
vacuum-patch-8_1.difftext/x-patch; charset=ISO-8859-15; name=vacuum-patch-8_1.diffDownload+94-68
Hannu Krosing <hannu@skype.net> writes:
Ok, this is a new version of the vacuum patch with the following changes
following some suggestions in this thread.
The more I look at this, the uglier it looks ... and I still haven't
seen any convincing demonstration that it *works*, ie doesn't have
bad side-effects on the transaction-is-in-progress logic. I'm
particularly concerned about what happens to the RecentXmin horizon
for pg_subtrans and pg_multixact operations.
regards, tom lane
On P, 2005-07-03 at 12:19 -0400, Tom Lane wrote:
Hannu Krosing <hannu@skype.net> writes:
Ok, this is a new version of the vacuum patch with the following changes
following some suggestions in this thread.The more I look at this, the uglier it looks ... and I still haven't
seen any convincing demonstration that it *works*, ie doesn't have
bad side-effects on the transaction-is-in-progress logic.
The function GetOldestXmin is used *only* when determining oldest xmin
for transactions.
I'm particularly concerned about what happens to the RecentXmin horizon
for pg_subtrans and pg_multixact operations.
How are they affected by this change ? They should still see the vacuum
as oldest transaction, unless they
Oh, now I see. I'm pretty sure that at the time of original patch, the
*only* uses of GetOldestXmin was from VACUUM and catalog/index.c and
both for the same purpose, but now I see also a call from
access/transam/xlog.c.
Perhaps I should separate the function used by vacuum into another
function, say GetOldestDataChangingXmin(), to keep the possible impact
as localised as possible.
Do you have any specific concerns related to this patch after that ?
Or should I just back off for now and maybe start a separate project for
ironing out patches related to running postgresql in real-world 24/7
OLTP environment (similar to what Bizgres does for OLAP ) ?
--
Hannu Krosing <hannu@skype.net>
On E, 2005-07-04 at 10:24 +0300, Hannu Krosing wrote:
On P, 2005-07-03 at 12:19 -0400, Tom Lane wrote:
Hannu Krosing <hannu@skype.net> writes:
Ok, this is a new version of the vacuum patch with the following changes
following some suggestions in this thread.The more I look at this, the uglier it looks ... and I still haven't
seen any convincing demonstration that it *works*, ie doesn't have
bad side-effects on the transaction-is-in-progress logic.
Ok, I changed GetOldestXmin() to use proc->inVacuum only when
determining the oldest visible xid for vacuum and index (i.e. which
tuples are safe to delete and which tuples there is no need to index).
The third use on GetOldestXmin() in xlog.c is changed to use old
GetOldestXmin() logic.
My reasoning for why the patch should work is as follows:
1) the only transaction during which inVacuum is set is the 2nd
transaction (of originally 3, now 4) of lazy VACUUM, which does simple
heap scanning and old tuple removal (lazy_vacuum_rel()), and does no
externally visible changes to the data. It only removes tuples which are
already invisible to all running transactions.
2) That transaction never deletes, updates or inserts any tuples on it
own.
3) As it can't add any tuples or change any existing tuples to have its
xid as either xmin or xmax, it already does run logically "outside of
transactions".
4) The only use made of of proc->inVacuum is when determining which
tuples are safe to remove (in vacuum.c) or not worth indexing (in
index.c) and thus can't affect anything else.
I can easily demonstrate that it "works" in the sense that it allows
several concurrent vacuums to clean out old tuples, and I have thus far
been unable to construct the counterexample where it does anything bad.
Could you tell me which part of my reasoning is wrong or what else do I
overlook.
--
Hannu Krosing <hannu@tm.ee>
Attachments:
vacuum-patch-8_1a.difftext/x-patch; charset=ISO-8859-15; name=vacuum-patch-8_1a.diffDownload+105-79
This has been saved for the 8.2 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold
---------------------------------------------------------------------------
Hannu Krosing wrote:
On E, 2005-07-04 at 10:24 +0300, Hannu Krosing wrote:
On P, 2005-07-03 at 12:19 -0400, Tom Lane wrote:
Hannu Krosing <hannu@skype.net> writes:
Ok, this is a new version of the vacuum patch with the following changes
following some suggestions in this thread.The more I look at this, the uglier it looks ... and I still haven't
seen any convincing demonstration that it *works*, ie doesn't have
bad side-effects on the transaction-is-in-progress logic.Ok, I changed GetOldestXmin() to use proc->inVacuum only when
determining the oldest visible xid for vacuum and index (i.e. which
tuples are safe to delete and which tuples there is no need to index).The third use on GetOldestXmin() in xlog.c is changed to use old
GetOldestXmin() logic.My reasoning for why the patch should work is as follows:
1) the only transaction during which inVacuum is set is the 2nd
transaction (of originally 3, now 4) of lazy VACUUM, which does simple
heap scanning and old tuple removal (lazy_vacuum_rel()), and does no
externally visible changes to the data. It only removes tuples which are
already invisible to all running transactions.2) That transaction never deletes, updates or inserts any tuples on it
own.3) As it can't add any tuples or change any existing tuples to have its
xid as either xmin or xmax, it already does run logically "outside of
transactions".4) The only use made of of proc->inVacuum is when determining which
tuples are safe to remove (in vacuum.c) or not worth indexing (in
index.c) and thus can't affect anything else.I can easily demonstrate that it "works" in the sense that it allows
several concurrent vacuums to clean out old tuples, and I have thus far
been unable to construct the counterexample where it does anything bad.Could you tell me which part of my reasoning is wrong or what else do I
overlook.--
Hannu Krosing <hannu@tm.ee>
[ Attachment, skipping... ]
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
On R, 2005-08-12 at 15:47 -0400, Bruce Momjian wrote:
This has been saved for the 8.2 release:
Is there any particular reason for not putting it in 8.1 ?
--
Hannu Krosing <hannu@tm.ee>
Hannu Krosing wrote:
On R, 2005-08-12 at 15:47 -0400, Bruce Momjian wrote:
This has been saved for the 8.2 release:
Is there any particular reason for not putting it in 8.1 ?
I thought there was still uncertainty about the patch. Is there?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Is there any particular reason for not putting it in 8.1 ?
I thought there was still uncertainty about the patch. Is there?
Considerable uncertainty, in my mind. What we've got here is some
pretty fundamental hacking on the transaction visibility logic, and
neither Hannu nor anyone else has produced a convincing argument
that it's correct. "It hasn't failed yet in my usage" isn't enough
to give me a good feeling about it. Some specific concerns:
* Given that VACUUM ANALYZE does create new output tuples stamped with
its xid, I'm unclear on what happens in pg_statistic with this code in
place. It seems entirely possible that someone might conclude the
analyze tuples are from a crashed transaction and mark them invalid
before the analyze can commit (notice TransactionIdIsInProgress does not
bother looking in PGPROC when the tuple xmin is less than RecentXmin).
* If the vacuum xact is older than what others think is the global xmin,
it could have problems with other vacuums removing tuples it should
still be able to see (presumably only in the system catalogs, so maybe
this isn't an issue, but I'm unsure). A related scenario that I don't
think can be dismissed is someone truncating off part of pg_subtrans or
pg_multixact that the vacuum still needs.
regards, tom lane
Just for the archives, attached is as far as I'd gotten with cleaning up
Hannu's patch before I realized that it wasn't doing what it needed to
do. This fixes an end-of-transaction race condition (can't unset
inVacuum before xact end, unless you want OldestXmin going backwards
from the point of view of other people) and improves the documentation
of what's going on. But unless someone can convince me that it's safe
to mess with GetSnapshotData, it's unlikely this'll ever get applied.
regards, tom lane
On K, 2005-08-17 at 15:40 -0400, Tom Lane wrote:
Saatja:
Tom Lane <tgl@sss.pgh.pa.us>
Kellele:
Bruce Momjian
<pgman@candle.pha.pa.us>, Hannu
Krosing <hannu@tm.ee>, Neil Conway
<neilc@samurai.com>, pgsql-
patches@postgresql.org
Teema:
Re: [PATCHES] PATCH to allow
concurrent VACUUMs to not lock each
Kuupᅵev:
Wed, 17 Aug 2005 15:40:53 -0400
(22:40 EEST)Just for the archives, attached is as far as I'd gotten with cleaning
up
Hannu's patch before I realized that it wasn't doing what it needed to
do. This fixes an end-of-transaction race condition (can't unset
inVacuum before xact end, unless you want OldestXmin going backwards
from the point of view of other people) and improves the documentation
of what's going on. But unless someone can convince me that it's safe
to mess with GetSnapshotData, it's unlikely this'll ever get applied.
Attached is a patch, based on you last one, which messes with
GetSnapshotData in what I think is a safe way.
It introduces another attribute to PROC , proc->nonInVacuumXmin and
computes this in addition to prox->xmin inside GetSnapshotData.
When (and only when) GetOldestXmin is called with ignoreVacuum=true,
then proc->nonInVacuumXmin is checked instead of prox->xmin.
I believe that this will make this change invisible to all other places
where GetSnapshotData or GetOldestXmin is used.
--
Hannu Krosing <hannu@skype.net>
Attachments:
vacuum-patch-8.1-2005.08.24.difftext/x-patch; charset=ISO-8859-15; name=vacuum-patch-8.1-2005.08.24.diffDownload+110-73
This has been saved for the 8.2 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold
---------------------------------------------------------------------------
Hannu Krosing wrote:
On K, 2005-08-17 at 15:40 -0400, Tom Lane wrote:
Saatja:
Tom Lane <tgl@sss.pgh.pa.us>
Kellele:
Bruce Momjian
<pgman@candle.pha.pa.us>, Hannu
Krosing <hannu@tm.ee>, Neil Conway
<neilc@samurai.com>, pgsql-
patches@postgresql.org
Teema:
Re: [PATCHES] PATCH to allow
concurrent VACUUMs to not lock each
Kuup?ev:
Wed, 17 Aug 2005 15:40:53 -0400
(22:40 EEST)Just for the archives, attached is as far as I'd gotten with cleaning
up
Hannu's patch before I realized that it wasn't doing what it needed to
do. This fixes an end-of-transaction race condition (can't unset
inVacuum before xact end, unless you want OldestXmin going backwards
from the point of view of other people) and improves the documentation
of what's going on. But unless someone can convince me that it's safe
to mess with GetSnapshotData, it's unlikely this'll ever get applied.Attached is a patch, based on you last one, which messes with
GetSnapshotData in what I think is a safe way.It introduces another attribute to PROC , proc->nonInVacuumXmin and
computes this in addition to prox->xmin inside GetSnapshotData.When (and only when) GetOldestXmin is called with ignoreVacuum=true,
then proc->nonInVacuumXmin is checked instead of prox->xmin.I believe that this will make this change invisible to all other places
where GetSnapshotData or GetOldestXmin is used.--
Hannu Krosing <hannu@skype.net>
[ Attachment, skipping... ]
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
This is going to need a significant safety review.
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.
---------------------------------------------------------------------------
Hannu Krosing wrote:
On K, 2005-08-17 at 15:40 -0400, Tom Lane wrote:
Saatja:
Tom Lane <tgl@sss.pgh.pa.us>
Kellele:
Bruce Momjian
<pgman@candle.pha.pa.us>, Hannu
Krosing <hannu@tm.ee>, Neil Conway
<neilc@samurai.com>, pgsql-
patches@postgresql.org
Teema:
Re: [PATCHES] PATCH to allow
concurrent VACUUMs to not lock each
Kuup?ev:
Wed, 17 Aug 2005 15:40:53 -0400
(22:40 EEST)Just for the archives, attached is as far as I'd gotten with cleaning
up
Hannu's patch before I realized that it wasn't doing what it needed to
do. This fixes an end-of-transaction race condition (can't unset
inVacuum before xact end, unless you want OldestXmin going backwards
from the point of view of other people) and improves the documentation
of what's going on. But unless someone can convince me that it's safe
to mess with GetSnapshotData, it's unlikely this'll ever get applied.Attached is a patch, based on you last one, which messes with
GetSnapshotData in what I think is a safe way.It introduces another attribute to PROC , proc->nonInVacuumXmin and
computes this in addition to prox->xmin inside GetSnapshotData.When (and only when) GetOldestXmin is called with ignoreVacuum=true,
then proc->nonInVacuumXmin is checked instead of prox->xmin.I believe that this will make this change invisible to all other places
where GetSnapshotData or GetOldestXmin is used.--
Hannu Krosing <hannu@skype.net>
[ Attachment, skipping... ]
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
--
Bruce Momjian http://candle.pha.pa.us
SRA OSS, Inc. http://www.sraoss.com
+ If your life is a hard drive, Christ can be your backup. +