The vacuum-ignore-vacuum patch
Hi,
Hannu Krossing asked me about his patch to ignore transactions running
VACUUM LAZY in other vacuum transactions. I attach a version of the
patch updated to the current sources.
Just to remind everyone what this is about: the point of the patch is to be
able to run more than one VACUUM LAZY simultaneously without having them
interfere with each other. For example, assume you have a database
with two tables, one very big and another very small but with a high
update rate. One usually wants to vacuum the small one very frequently
in order to keep the number of dead tuples low. But once one starts to
vacuum the big table, that vacuum takes a long time, during which the
vacuums applied to the smaller table can't reclaim any tuples, because each
of them has to assume that the big vacuum's transaction may still want to
read some of the tuples it is trying to remove.
We know this is not so -- a VACUUM can only be run in a standalone
transaction, and it only checks the one table it's vacuuming. Thus we
can optimize the vacuuming so that if the only thing that's holding the
tuples undeletable is another big vacuum operation, ignore it and delete
the tuples anyway.
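To make the idea concrete, here is a minimal, self-contained sketch of the horizon computation with and without the "ignore lazy vacuums" rule. It is a toy model, not the patch itself: ToyXid, ToyProc, toy_oldest_xmin and the other toy_* names are made up for illustration, the per-database filtering that the real GetOldestXmin does is left out, and the xid comparison assumes normal xids that are less than 2^31 apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for TransactionId and PGPROC; not the real definitions. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

typedef struct ToyProc
{
    ToyXid xid;            /* running transaction id, or TOY_INVALID_XID */
    ToyXid xmin;           /* oldest xid its snapshot may look at, or invalid */
    bool   in_lazy_vacuum; /* plays the role of the patch's inVacuum flag */
} ToyProc;

/* Modulo-2^32 ordering; assumes both xids are normal and within 2^31. */
bool toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

/* Oldest xid or xmin among the backends, optionally skipping lazy vacuums. */
ToyXid toy_oldest_xmin(const ToyProc *procs, size_t nprocs,
                       ToyXid my_xid, bool ignore_vacuum)
{
    ToyXid result = my_xid;

    for (size_t i = 0; i < nprocs; i++)
    {
        const ToyProc *p = &procs[i];

        if (ignore_vacuum && p->in_lazy_vacuum)
            continue;           /* the whole point of the patch */
        if (p->xid != TOY_INVALID_XID && toy_xid_precedes(p->xid, result))
            result = p->xid;
        if (p->xmin != TOY_INVALID_XID && toy_xid_precedes(p->xmin, result))
            result = p->xmin;
    }
    return result;
}

/* A deleted tuple is reclaimable once its deleter committed before the horizon. */
bool toy_dead_tuple_is_removable(ToyXid deleter_xid, bool deleter_committed,
                                 ToyXid horizon)
{
    return deleter_committed && toy_xid_precedes(deleter_xid, horizon);
}

/* Usage: a big lazy vacuum (xid 100) runs next to an ordinary backend. */
void toy_demo(void)
{
    ToyProc others[] = {
        {100, TOY_INVALID_XID, true},   /* long-running lazy vacuum */
        {500, 480, false},              /* ordinary transaction */
    };
    ToyXid strict = toy_oldest_xmin(others, 2, 510, false);
    ToyXid relaxed = toy_oldest_xmin(others, 2, 510, true);

    /* A tuple deleted by committed xid 300 is stuck behind the big vacuum... */
    assert(!toy_dead_tuple_is_removable(300, true, strict));
    /* ...but becomes reclaimable once the lazy vacuum is ignored. */
    assert(toy_dead_tuple_is_removable(300, true, relaxed));
}
```

In this model the ordinary backend's xmin still holds the horizon back, which is the behaviour we want to keep; only the lazy vacuum is skipped.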
One exception is that we can't do that with full vacuums. The reason is
that full vacuum may want to run user-defined functions to be able to
index the tuples it moves. This isn't a problem normally, except in the
case where the function tries to scan some other table: if we ignored
that transaction, then another lazy vacuum might delete tuples from that
table that we need to see.
In a previous version of the patch, there was a note somewhere that made
the code not ignore lazy vacuums in the case where we were running
database-wide vacuums. The reason was that the value we computed was
also used as truncate point for pg_clog; thus if we ignored that
transaction, the truncate point could be further ahead than the vacuum,
so the clog page for the vacuum transaction could be gone and it
wouldn't be able to commit. This is no longer the case, because with
the patch I committed yesterday, the clog truncation point is calculated
differently and thus we don't need to take special care about this.
--
Alvaro Herrera http://www.advogato.org/person/alvherre
"One fights when it is necessary... not when one is in the mood!
Mood is for cattle, or for making love, or for playing the
baliset. Not for fighting." (Gurney Halleck)
Attachments:
ignore-vacuum-2006-07-11.patch (text/plain; charset=us-ascii)
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.19
diff -c -r1.19 twophase.c
*** src/backend/access/transam/twophase.c 5 Mar 2006 15:58:22 -0000 1.19
--- src/backend/access/transam/twophase.c 11 Jul 2006 16:44:03 -0000
***************
*** 279,284 ****
--- 279,286 ----
gxact->proc.pid = 0;
gxact->proc.databaseId = databaseid;
gxact->proc.roleId = owner;
+ gxact->proc.inVacuum = false;
+ gxact->proc.nonInVacuumXmin = InvalidTransactionId;
gxact->proc.lwWaiting = false;
gxact->proc.lwExclusive = false;
gxact->proc.lwWaitLink = NULL;
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.221
diff -c -r1.221 xact.c
*** src/backend/access/transam/xact.c 20 Jun 2006 22:51:59 -0000 1.221
--- src/backend/access/transam/xact.c 11 Jul 2006 16:44:03 -0000
***************
*** 1529,1534 ****
--- 1529,1536 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
+ MyProc->nonInVacuumXmin = InvalidTransactionId; /* this too */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
***************
*** 1762,1767 ****
--- 1764,1771 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
+ MyProc->nonInVacuumXmin = InvalidTransactionId; /* this too */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
***************
*** 1925,1930 ****
--- 1929,1936 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
+ MyProc->nonInVacuumXmin = InvalidTransactionId; /* this too */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.242
diff -c -r1.242 xlog.c
*** src/backend/access/transam/xlog.c 27 Jun 2006 18:59:17 -0000 1.242
--- src/backend/access/transam/xlog.c 11 Jul 2006 16:44:03 -0000
***************
*** 5417,5423 ****
* StartupSUBTRANS hasn't been called yet.
*/
if (!InRecovery)
! TruncateSUBTRANS(GetOldestXmin(true));
if (!shutdown)
ereport(DEBUG2,
--- 5417,5423 ----
* StartupSUBTRANS hasn't been called yet.
*/
if (!InRecovery)
! TruncateSUBTRANS(GetOldestXmin(true, false));
if (!shutdown)
ereport(DEBUG2,
Index: src/backend/catalog/index.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.268
diff -c -r1.268 index.c
*** src/backend/catalog/index.c 3 Jul 2006 22:45:37 -0000 1.268
--- src/backend/catalog/index.c 11 Jul 2006 16:44:03 -0000
***************
*** 1365,1371 ****
else
{
snapshot = SnapshotAny;
! OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared);
}
scan = heap_beginscan(heapRelation, /* relation */
--- 1365,1372 ----
else
{
snapshot = SnapshotAny;
! /* okay to ignore lazy VACUUMs here */
! OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
}
scan = heap_beginscan(heapRelation, /* relation */
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.333
diff -c -r1.333 vacuum.c
*** src/backend/commands/vacuum.c 10 Jul 2006 16:20:50 -0000 1.333
--- src/backend/commands/vacuum.c 11 Jul 2006 20:03:35 -0000
***************
*** 40,45 ****
--- 40,46 ----
#include "postmaster/autovacuum.h"
#include "storage/freespace.h"
#include "storage/pmsignal.h"
+ #include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
#include "tcop/pquery.h"
***************
*** 594,600 ****
{
TransactionId limit;
! *oldestXmin = GetOldestXmin(sharedRel);
Assert(TransactionIdIsNormal(*oldestXmin));
--- 595,607 ----
{
TransactionId limit;
! /*
! * We can always ignore processes running lazy vacuum. This is because we
! * use these values only for deciding which tuples we must keep in the
! * tables. Since lazy vacuum doesn't write its xid to the table, it's
! * safe to ignore it.
! */
! *oldestXmin = GetOldestXmin(sharedRel, true);
Assert(TransactionIdIsNormal(*oldestXmin));
***************
*** 650,655 ****
--- 657,667 ----
* pg_class would've been obsoleted. Of course, this only works for
* fixed-size never-null columns, but these are.
*
+ * Another reason for doing it this way is that when we are in a lazy
+ * VACUUM and have inVacuum set, we mustn't do any updates --- somebody
+ * vacuuming pg_class might think they could delete a tuple marked with
+ * xmin = our xid.
+ *
* This routine is shared by full VACUUM, lazy VACUUM, and stand-alone
* ANALYZE.
*/
***************
*** 1001,1008 ****
/* Begin a transaction for vacuuming this relation */
StartTransactionCommand();
! /* functions in indexes may want a snapshot set */
! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
/*
* Tell the cache replacement strategy that vacuum is causing all
--- 1013,1047 ----
/* Begin a transaction for vacuuming this relation */
StartTransactionCommand();
!
! if (vacstmt->full)
! {
! /* functions in indexes may want a snapshot set */
! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
! }
! else
! {
! /*
! * During a lazy VACUUM we do not run any user-supplied functions,
! * and so it should be safe to not create a transaction snapshot.
! *
! * We can furthermore set the inVacuum flag, which lets other
! * concurrent VACUUMs know that they can ignore this one while
! * determining their OldestXmin. (The reason we don't set inVacuum
! * during a full VACUUM is exactly that we may have to run user-
! * defined functions for functional indexes, and we want to make
! * sure that if they use the snapshot set above, any tuples it
! * requires can't get removed from other tables. An index function
! * that depends on the contents of other tables is arguably broken,
! * but we won't break it here by violating transaction semantics.)
! *
! * Note: the inVacuum flag remains set until CommitTransaction or
! * AbortTransaction. We don't want to clear it until we reset
! * MyProc->xid/xmin, else OldestXmin might appear to go backwards,
! * which is probably Not Good.
! */
! MyProc->inVacuum = true;
! }
/*
* Tell the cache replacement strategy that vacuum is causing all
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.12
diff -c -r1.12 procarray.c
*** src/backend/storage/ipc/procarray.c 19 Jun 2006 01:51:21 -0000 1.12
--- src/backend/storage/ipc/procarray.c 11 Jul 2006 20:26:35 -0000
***************
*** 387,406 ****
* If allDbs is TRUE then all backends are considered; if allDbs is FALSE
* then only backends running in my own database are considered.
*
* This is used by VACUUM to decide which deleted tuples must be preserved
* in a table. allDbs = TRUE is needed for shared relations, but allDbs =
* FALSE is sufficient for non-shared relations, since only backends in my
! * own database could ever see the tuples in them.
*
* This is also used to determine where to truncate pg_subtrans. allDbs
! * must be TRUE for that case.
*
* Note: we include the currently running xids in the set of considered xids.
* This ensures that if a just-started xact has not yet set its snapshot,
* when it does set the snapshot it cannot set xmin less than what we compute.
*/
TransactionId
! GetOldestXmin(bool allDbs)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
--- 387,410 ----
* If allDbs is TRUE then all backends are considered; if allDbs is FALSE
* then only backends running in my own database are considered.
*
+ * If ignoreVacuum is TRUE then backends with inVacuum set are ignored.
+ *
* This is used by VACUUM to decide which deleted tuples must be preserved
* in a table. allDbs = TRUE is needed for shared relations, but allDbs =
* FALSE is sufficient for non-shared relations, since only backends in my
! * own database could ever see the tuples in them. Also, we can ignore
! * concurrently running lazy VACUUMs because (a) they must be working on other
! * tables, and (b) they don't need to do snapshot-based lookups.
*
* This is also used to determine where to truncate pg_subtrans. allDbs
! * must be TRUE for that case, and ignoreVacuum FALSE.
*
* Note: we include the currently running xids in the set of considered xids.
* This ensures that if a just-started xact has not yet set its snapshot,
* when it does set the snapshot it cannot set xmin less than what we compute.
*/
TransactionId
! GetOldestXmin(bool allDbs, bool ignoreVacuum)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
***************
*** 424,429 ****
--- 428,436 ----
{
PGPROC *proc = arrayP->procs[index];
+ if (ignoreVacuum && proc->inVacuum)
+ continue;
+
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xid just once - see GetNewTransactionId */
***************
*** 481,486 ****
--- 488,494 ----
TransactionId xmin;
TransactionId xmax;
TransactionId globalxmin;
+ TransactionId noninvacuumxmin;
int index;
int count = 0;
***************
*** 514,520 ****
errmsg("out of memory")));
}
! globalxmin = xmin = GetTopTransactionId();
/*
* If we are going to set MyProc->xmin then we'd better get exclusive
--- 522,528 ----
errmsg("out of memory")));
}
! globalxmin = xmin = noninvacuumxmin = GetTopTransactionId();
/*
* If we are going to set MyProc->xmin then we'd better get exclusive
***************
*** 573,578 ****
--- 581,591 ----
if (TransactionIdPrecedes(xid, xmin))
xmin = xid;
+
+ /* Only consider non-vacuum transactions for nonInVacuumXmin */
+ if (TransactionIdPrecedes(xid, noninvacuumxmin) && !proc->inVacuum)
+ noninvacuumxmin = xid;
+
snapshot->xip[count] = xid;
count++;
***************
*** 584,590 ****
--- 597,606 ----
}
if (serializable)
+ {
MyProc->xmin = TransactionXmin = xmin;
+ MyProc->nonInVacuumXmin = noninvacuumxmin;
+ }
LWLockRelease(ProcArrayLock);
Index: src/backend/storage/lmgr/proc.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.175
diff -c -r1.175 proc.c
*** src/backend/storage/lmgr/proc.c 20 Jun 2006 22:52:00 -0000 1.175
--- src/backend/storage/lmgr/proc.c 11 Jul 2006 16:44:03 -0000
***************
*** 258,263 ****
--- 258,265 ----
/* databaseId and roleId will be filled in later */
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
+ MyProc->inVacuum = false;
+ MyProc->nonInVacuumXmin = InvalidTransactionId;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
***************
*** 389,394 ****
--- 391,398 ----
MyProc->xmin = InvalidTransactionId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
+ MyProc->inVacuum = false;
+ MyProc->nonInVacuumXmin = InvalidTransactionId;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
Index: src/include/storage/proc.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/proc.h,v
retrieving revision 1.88
diff -c -r1.88 proc.h
*** src/include/storage/proc.h 14 Apr 2006 03:38:56 -0000 1.88
--- src/include/storage/proc.h 11 Jul 2006 20:06:15 -0000
***************
*** 74,79 ****
--- 74,84 ----
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
+ bool inVacuum; /* true if current xact is a LAZY VACUUM */
+
+ TransactionId nonInVacuumXmin; /* same as xmin with transactions where
+ * (proc->inVacuum == true) excluded */
+
/* Info about LWLock the process is currently waiting for, if any. */
bool lwWaiting; /* true if waiting for an LW lock */
bool lwExclusive; /* true if waiting for exclusive access */
Index: src/include/storage/procarray.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/procarray.h,v
retrieving revision 1.9
diff -c -r1.9 procarray.h
*** src/include/storage/procarray.h 19 Jun 2006 01:51:22 -0000 1.9
--- src/include/storage/procarray.h 11 Jul 2006 16:44:03 -0000
***************
*** 24,30 ****
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs);
extern PGPROC *BackendPidGetProc(int pid);
extern int BackendXidGetPid(TransactionId xid);
--- 24,30 ----
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);
extern PGPROC *BackendPidGetProc(int pid);
extern int BackendXidGetPid(TransactionId xid);
On Tuesday, 11 July 2006 23:01, Alvaro Herrera wrote:
> One exception is that we can't do that with full vacuums. The reason is
> that full vacuum may want to run user-defined functions to be able to
> index the tuples it moves. This isn't a problem normally, except in the
> case where the function tries to scan some other table: if we ignored
> that transaction, then another lazy vacuum might delete tuples from that
> table that we need to see.
Functions in the index expression must be immutable, so I don't think that is
a real concern.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Hannu Krossing asked me about his patch to ignore transactions running
> VACUUM LAZY in other vacuum transactions. I attach a version of the
> patch updated to the current sources.
nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
version of the computation?
In general, it seems to me that a transaction running lazy vacuum could
be ignored for every purpose except truncating clog/subtrans. Since it
will never insert its own XID into the database (note: VACUUM ANALYZE is
run as two separate transactions, hence the pg_statistic rows inserted
by ANALYZE are not a counterexample), there's no need for anyone to
include it as running in their snapshots. So unless I'm missing
something, this is a safe change for lazy vacuum, but perhaps not for
full vacuum, which *does* put its XID into the database.
A possible objection to this is that it would foreclose running VACUUM
and ANALYZE as a single transaction, exactly because of the point that
we couldn't insert pg_statistic rows using a lazy vacuum's XID. I think
there was some discussion of doing that in connection with enlarging
ANALYZE's sample greatly --- if ANALYZE goes back to being a full scan
or nearly so, it'd sure be nice to combine it with the VACUUM scan.
However maybe we should just accept that as the price of not having
multiple vacuums interfere with each other.
regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Hannu Krossing asked me about his patch to ignore transactions running
>> VACUUM LAZY in other vacuum transactions. I attach a version of the
>> patch updated to the current sources.
>
> nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
> version of the computation?
Hmm ... I remember removing a now-useless variable somewhere, but maybe
this one escaped me. I don't have the code handy -- will check.
> In general, it seems to me that a transaction running lazy vacuum could
> be ignored for every purpose except truncating clog/subtrans. Since it
> will never insert its own XID into the database (note: VACUUM ANALYZE is
> run as two separate transactions, hence the pg_statistic rows inserted
> by ANALYZE are not a counterexample), there's no need for anyone to
> include it as running in their snapshots. So unless I'm missing
> something, this is a safe change for lazy vacuum, but perhaps not for
> full vacuum, which *does* put its XID into the database.
But keep in mind that in the current code, clog truncation takes
relminxid (actually datminxid) into account, not running transactions,
so AFAICS this shouldn't affect anything.
Subtrans truncation is different and it certainly should consider lazy
vacuum's Xids.
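As a rough summary of which callers may skip lazy vacuums, the sketch below encodes the argument pairs the attached patch passes to GetOldestXmin(allDbs, ignoreVacuum) at its three call sites (vacuum.c, index.c, and the TruncateSUBTRANS call in xlog.c). The enum, struct and function names are hypothetical and exist only for this illustration; unknown uses fall back to the conservative choice.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three call sites touched by the patch. */
typedef enum ToyHorizonUse
{
    TOY_USE_TUPLE_REMOVAL,       /* vacuum.c: which dead tuples to keep */
    TOY_USE_INDEX_BUILD,         /* index.c: index build with SnapshotAny */
    TOY_USE_SUBTRANS_TRUNCATION  /* xlog.c: where to truncate pg_subtrans */
} ToyHorizonUse;

typedef struct ToyHorizonArgs
{
    bool all_dbs;        /* corresponds to the allDbs argument */
    bool ignore_vacuum;  /* corresponds to the new ignoreVacuum argument */
} ToyHorizonArgs;

ToyHorizonArgs toy_horizon_args(ToyHorizonUse use, bool rel_is_shared)
{
    /* Conservative default: look at every backend, ignore nobody. */
    ToyHorizonArgs args = {true, false};

    switch (use)
    {
        case TOY_USE_TUPLE_REMOVAL:
        case TOY_USE_INDEX_BUILD:
            /* Only backends that can see the relation matter, and a lazy
             * vacuum never writes its xid into a table, so skip it. */
            args.all_dbs = rel_is_shared;
            args.ignore_vacuum = true;
            break;
        case TOY_USE_SUBTRANS_TRUNCATION:
            /* A lazy vacuum's xid is still running, so it must count. */
            break;
    }
    return args;
}

/* Usage: the subtrans truncation point must never ignore a lazy vacuum. */
void toy_horizon_selfcheck(void)
{
    ToyHorizonArgs t = toy_horizon_args(TOY_USE_SUBTRANS_TRUNCATION, false);
    ToyHorizonArgs v = toy_horizon_args(TOY_USE_TUPLE_REMOVAL, false);

    assert(t.all_dbs && !t.ignore_vacuum);
    assert(!v.all_dbs && v.ignore_vacuum);
}
```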
> A possible objection to this is that it would foreclose running VACUUM
> and ANALYZE as a single transaction, exactly because of the point that
> we couldn't insert pg_statistic rows using a lazy vacuum's XID. I think
> there was some discussion of doing that in connection with enlarging
> ANALYZE's sample greatly --- if ANALYZE goes back to being a full scan
> or nearly so, it'd sure be nice to combine it with the VACUUM scan.
> However maybe we should just accept that as the price of not having
> multiple vacuums interfere with each other.
Hmm, what about having a single scan for both, and then starting a
normal transaction just for the sake of inserting the pg_statistic
tuple?
I think the interactions of Xids and vacuum and other stuff are starting
to get complex; IMHO it warrants having a README.vacuum, or something.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> A possible objection to this is that it would foreclose running VACUUM
>> and ANALYZE as a single transaction, exactly because of the point that
>> we couldn't insert pg_statistic rows using a lazy vacuum's XID.
>
> Hmm, what about having a single scan for both, and then starting a
> normal transaction just for the sake of inserting the pg_statistic
> tuple?
We could, but I think memory consumption would be the issue. VACUUM
wants a lotta memory for the dead-TIDs array, ANALYZE wants a lot for
its statistics gathering ... even more if it's trying to take a larger
sample than before. (This is probably why we kept them separate in
the last rewrite.)
> I think the interactions of Xids and vacuum and other stuff are starting
> to get complex; IMHO it warrants having a README.vacuum, or something.
Go for it ...
regards, tom lane
Cc: pgsql-hackers removed, as this mail contains a patch.
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Hannu Krossing asked me about his patch to ignore transactions running
>> VACUUM LAZY in other vacuum transactions. I attach a version of the
>> patch updated to the current sources.
>
> nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
> version of the computation?
Yup -- I checked that code and found out that nonInVacuumXmin can be
taken out as it's not used anywhere. One upside of this is that taking
it out means we can remove all diffs to GetSnapshotData. New patch
attached; it's a bit smaller than the last one.
I'm currently testing it. Since it appears there are no further
objections, I intend to commit it.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Attachments:
ignore-vacuum-2006-07-24.patch (text/plain; charset=us-ascii)
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.21
diff -c -r1.21 twophase.c
*** src/backend/access/transam/twophase.c 14 Jul 2006 14:52:17 -0000 1.21
--- src/backend/access/transam/twophase.c 24 Jul 2006 22:16:35 -0000
***************
*** 279,284 ****
--- 279,285 ----
gxact->proc.pid = 0;
gxact->proc.databaseId = databaseid;
gxact->proc.roleId = owner;
+ gxact->proc.inVacuum = false;
gxact->proc.lwWaiting = false;
gxact->proc.lwExclusive = false;
gxact->proc.lwWaitLink = NULL;
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.224
diff -c -r1.224 xact.c
*** src/backend/access/transam/xact.c 24 Jul 2006 16:32:44 -0000 1.224
--- src/backend/access/transam/xact.c 24 Jul 2006 22:16:35 -0000
***************
*** 1529,1534 ****
--- 1529,1535 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
***************
*** 1764,1769 ****
--- 1765,1771 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
***************
*** 1927,1932 ****
--- 1929,1935 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.244
diff -c -r1.244 xlog.c
*** src/backend/access/transam/xlog.c 14 Jul 2006 14:52:17 -0000 1.244
--- src/backend/access/transam/xlog.c 24 Jul 2006 22:16:35 -0000
***************
*** 5413,5419 ****
* StartupSUBTRANS hasn't been called yet.
*/
if (!InRecovery)
! TruncateSUBTRANS(GetOldestXmin(true));
if (!shutdown)
ereport(DEBUG2,
--- 5413,5419 ----
* StartupSUBTRANS hasn't been called yet.
*/
if (!InRecovery)
! TruncateSUBTRANS(GetOldestXmin(true, false));
if (!shutdown)
ereport(DEBUG2,
Index: src/backend/catalog/index.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.269
diff -c -r1.269 index.c
*** src/backend/catalog/index.c 13 Jul 2006 16:49:13 -0000 1.269
--- src/backend/catalog/index.c 24 Jul 2006 22:16:35 -0000
***************
*** 1367,1373 ****
else
{
snapshot = SnapshotAny;
! OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared);
}
scan = heap_beginscan(heapRelation, /* relation */
--- 1367,1374 ----
else
{
snapshot = SnapshotAny;
! /* okay to ignore lazy VACUUMs here */
! OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
}
scan = heap_beginscan(heapRelation, /* relation */
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.335
diff -c -r1.335 vacuum.c
*** src/backend/commands/vacuum.c 14 Jul 2006 14:52:18 -0000 1.335
--- src/backend/commands/vacuum.c 24 Jul 2006 22:16:36 -0000
***************
*** 37,42 ****
--- 37,43 ----
#include "postmaster/autovacuum.h"
#include "storage/freespace.h"
#include "storage/pmsignal.h"
+ #include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/acl.h"
#include "utils/builtins.h"
***************
*** 589,595 ****
{
TransactionId limit;
! *oldestXmin = GetOldestXmin(sharedRel);
Assert(TransactionIdIsNormal(*oldestXmin));
--- 590,602 ----
{
TransactionId limit;
! /*
! * We can always ignore processes running lazy vacuum. This is because we
! * use these values only for deciding which tuples we must keep in the
! * tables. Since lazy vacuum doesn't write its xid to the table, it's
! * safe to ignore it.
! */
! *oldestXmin = GetOldestXmin(sharedRel, true);
Assert(TransactionIdIsNormal(*oldestXmin));
***************
*** 645,650 ****
--- 652,662 ----
* pg_class would've been obsoleted. Of course, this only works for
* fixed-size never-null columns, but these are.
*
+ * Another reason for doing it this way is that when we are in a lazy
+ * VACUUM and have inVacuum set, we mustn't do any updates --- somebody
+ * vacuuming pg_class might think they could delete a tuple marked with
+ * xmin = our xid.
+ *
* This routine is shared by full VACUUM, lazy VACUUM, and stand-alone
* ANALYZE.
*/
***************
*** 996,1003 ****
/* Begin a transaction for vacuuming this relation */
StartTransactionCommand();
! /* functions in indexes may want a snapshot set */
! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
/*
* Tell the cache replacement strategy that vacuum is causing all
--- 1008,1042 ----
/* Begin a transaction for vacuuming this relation */
StartTransactionCommand();
!
! if (vacstmt->full)
! {
! /* functions in indexes may want a snapshot set */
! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
! }
! else
! {
! /*
! * During a lazy VACUUM we do not run any user-supplied functions,
! * and so it should be safe to not create a transaction snapshot.
! *
! * We can furthermore set the inVacuum flag, which lets other
! * concurrent VACUUMs know that they can ignore this one while
! * determining their OldestXmin. (The reason we don't set inVacuum
! * during a full VACUUM is exactly that we may have to run user-
! * defined functions for functional indexes, and we want to make
! * sure that if they use the snapshot set above, any tuples it
! * requires can't get removed from other tables. An index function
! * that depends on the contents of other tables is arguably broken,
! * but we won't break it here by violating transaction semantics.)
! *
! * Note: the inVacuum flag remains set until CommitTransaction or
! * AbortTransaction. We don't want to clear it until we reset
! * MyProc->xid/xmin, else OldestXmin might appear to go backwards,
! * which is probably Not Good.
! */
! MyProc->inVacuum = true;
! }
/*
* Tell the cache replacement strategy that vacuum is causing all
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.14
diff -c -r1.14 procarray.c
*** src/backend/storage/ipc/procarray.c 14 Jul 2006 14:52:22 -0000 1.14
--- src/backend/storage/ipc/procarray.c 24 Jul 2006 22:16:43 -0000
***************
*** 388,407 ****
* If allDbs is TRUE then all backends are considered; if allDbs is FALSE
* then only backends running in my own database are considered.
*
* This is used by VACUUM to decide which deleted tuples must be preserved
* in a table. allDbs = TRUE is needed for shared relations, but allDbs =
* FALSE is sufficient for non-shared relations, since only backends in my
! * own database could ever see the tuples in them.
*
* This is also used to determine where to truncate pg_subtrans. allDbs
! * must be TRUE for that case.
*
* Note: we include the currently running xids in the set of considered xids.
* This ensures that if a just-started xact has not yet set its snapshot,
* when it does set the snapshot it cannot set xmin less than what we compute.
*/
TransactionId
! GetOldestXmin(bool allDbs)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
--- 388,411 ----
* If allDbs is TRUE then all backends are considered; if allDbs is FALSE
* then only backends running in my own database are considered.
*
+ * If ignoreVacuum is TRUE then backends with inVacuum set are ignored.
+ *
* This is used by VACUUM to decide which deleted tuples must be preserved
* in a table. allDbs = TRUE is needed for shared relations, but allDbs =
* FALSE is sufficient for non-shared relations, since only backends in my
! * own database could ever see the tuples in them. Also, we can ignore
! * concurrently running lazy VACUUMs because (a) they must be working on other
! * tables, and (b) they don't need to do snapshot-based lookups.
*
* This is also used to determine where to truncate pg_subtrans. allDbs
! * must be TRUE for that case, and ignoreVacuum FALSE.
*
* Note: we include the currently running xids in the set of considered xids.
* This ensures that if a just-started xact has not yet set its snapshot,
* when it does set the snapshot it cannot set xmin less than what we compute.
*/
TransactionId
! GetOldestXmin(bool allDbs, bool ignoreVacuum)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
***************
*** 425,430 ****
--- 429,437 ----
{
PGPROC *proc = arrayP->procs[index];
+ if (ignoreVacuum && proc->inVacuum)
+ continue;
+
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xid just once - see GetNewTransactionId */
Index: src/backend/storage/lmgr/proc.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.178
diff -c -r1.178 proc.c
*** src/backend/storage/lmgr/proc.c 23 Jul 2006 23:08:46 -0000 1.178
--- src/backend/storage/lmgr/proc.c 24 Jul 2006 22:16:43 -0000
***************
*** 257,262 ****
--- 257,263 ----
/* databaseId and roleId will be filled in later */
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
+ MyProc->inVacuum = false;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
***************
*** 388,393 ****
--- 389,395 ----
MyProc->xmin = InvalidTransactionId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
+ MyProc->inVacuum = false;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
Index: src/include/storage/proc.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/proc.h,v
retrieving revision 1.89
diff -c -r1.89 proc.h
*** src/include/storage/proc.h 13 Jul 2006 16:49:20 -0000 1.89
--- src/include/storage/proc.h 24 Jul 2006 22:16:44 -0000
***************
*** 73,78 ****
--- 73,80 ----
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
+ bool inVacuum; /* true if current xact is a LAZY VACUUM */
+
/* Info about LWLock the process is currently waiting for, if any. */
bool lwWaiting; /* true if waiting for an LW lock */
bool lwExclusive; /* true if waiting for exclusive access */
Index: src/include/storage/procarray.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/procarray.h,v
retrieving revision 1.9
diff -c -r1.9 procarray.h
*** src/include/storage/procarray.h 19 Jun 2006 01:51:22 -0000 1.9
--- src/include/storage/procarray.h 11 Jul 2006 16:44:03 -0000
***************
*** 24,30 ****
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs);
extern PGPROC *BackendPidGetProc(int pid);
extern int BackendXidGetPid(TransactionId xid);
--- 24,30 ----
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);
extern PGPROC *BackendPidGetProc(int pid);
extern int BackendXidGetPid(TransactionId xid);
Tom Lane wrote:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
Hannu Krosing asked me about his patch to ignore transactions running
VACUUM LAZY in other vacuum transactions. I attach a version of the
patch updated to the current sources.
nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
version of the computation?
Hmm, not useless at all really -- only a bug of mine. Turns out the
notInVacuumXmin stuff is essential, so I put it back.
I noticed something however -- in calculating the OldestXmin we always
consider all DBs, even though there is a parameter for skipping backends
not in the current DB -- this is because the Xmin we store in PGPROC is
always computed using all backends. The allDbs parameter only allows us
to skip the Xid of a transaction running elsewhere, but this is not very
helpful because the Xmin of transactions running in the local DB will
include those foreign Xids.
In case I'm not explaining myself, the problem is that if I open a
transaction in database A and then vacuum a table in database B, those
tuples deleted after the transaction in database A started cannot be
removed.
To solve this problem, one idea is to change the new member of PGPROC to
"current database's not in vacuum Xmin", which is the minimum of Xmins
of backends running in my database which are not executing a lazy
vacuum. This can be used to vacuum non-shared relations.
We could either add it anew, beside nonInVacuumXmin, or replace
nonInVacuumXmin. The difference will be whether we will have something
to be used to vacuum shared relations or not. I think in general,
shared relations are not vacuumed much so it shouldn't be too much of a
problem if we leave them to be vacuumed with the regular, all-databases,
include-vacuum Xmin.
The other POV is that we don't really care about long-running
transactions in other databases unless they are lazy vacuums, a case which
is appropriately covered by the patch as it currently stands. This seems
to be the POV that Hannu takes: the only long-running transactions he
cares about are lazy vacuums.
Thoughts?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
One fine day, Thu, 2006-07-27 at 19:29, Alvaro Herrera wrote:
We could either add it anew, beside nonInVacuumXmin, or replace
nonInVacuumXmin. The difference will be whether we will have something
to be used to vacuum shared relations or not. I think in general,
shared relations are not vacuumed much so it shouldn't be too much of a
problem if we leave them to be vacuumed with the regular, all-databases,
include-vacuum Xmin.
Yes. I don't think that vacuuming shared relations will ever be a
significant performance concern.
The other POV is that we don't really care about long-running
transactions in other databases unless they are lazy vacuums, a case which
is appropriately covered by the patch as it currently stands. This seems
to be the POV that Hannu takes: the only long-running transactions he
cares about are lazy vacuums.
Yes. The original target audience of this patch is users running 24/7
OLTP databases with big, slow-changing tables and small, fast-changing
tables which need to stay small even while the big ones are being
vacuumed.
The other possible transactions which _could_ possibly be ignored while
VACUUMING are those from ANALYSE and non-lazy VACUUMs.
I don't care about them as:
ANALYSE is relatively fast, even on huge tables, and thus can be
ignored.
If you do run VACUUM FULL on anything bigger than a few thousand
lines then you are not running a 24/7 OLTP database anyway.
I also can't see a use case for an OLTP database where VACUUM FREEZE is
required.
Maybe we could also start ignoring the transactions that are running the
new CONCURRENT CREATE INDEX command, as it also runs inside its own
transaction(s) which can't possibly touch the tuples in the table being
vacuumed as it locks out VACUUM on the indexed table.
That would probably be quite easy to do by just having CONCURRENT CREATE
INDEX also mark its transactions as ignorable by VACUUM. Maybe the
variable name for that (proc->inVacuum) needs to be changed to something
like trxSafeToIgnoreByVacuum.
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
version of the computation?
Hmm, not useless at all really -- only a bug of mine. Turns out the
notInVacuumXmin stuff is essential, so I put it back.
Uh, why?
I noticed something however -- in calculating the OldestXmin we always
consider all DBs, even though there is a parameter for skipping backends
not in the current DB -- this is because the Xmin we store in PGPROC is
always computed using all backends. The allDbs parameter only allows us
to skip the Xid of a transaction running elsewhere, but this is not very
helpful because the Xmin of transactions running in the local DB will
include those foreign Xids.
Yeah, this has been recognized for some time. However the overhead of
calculating local and global xmins in *every* transaction start is a
significant reason not to do it.
regards, tom lane
Another idea Jan had today was whether we could vacuum more rows if a
long-running backend is in serializable mode, like pg_dump.
---------------------------------------------------------------------------
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
version of the computation?
Hmm, not useless at all really -- only a bug of mine. Turns out the
notInVacuumXmin stuff is essential, so I put it back.
Uh, why?
I noticed something however -- in calculating the OldestXmin we always
consider all DBs, even though there is a parameter for skipping backends
not in the current DB -- this is because the Xmin we store in PGPROC is
always computed using all backends. The allDbs parameter only allows us
to skip the Xid of a transaction running elsewhere, but this is not very
helpful because the Xmin of transactions running in the local DB will
include those foreign Xids.
Yeah, this has been recognized for some time. However the overhead of
calculating local and global xmins in *every* transaction start is a
significant reason not to do it.
regards, tom lane
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
version of the computation?
Hmm, not useless at all really -- only a bug of mine. Turns out the
notInVacuumXmin stuff is essential, so I put it back.
Uh, why?
Because it's used to determine the Xmin that our vacuum will use. If
there is a transaction whose Xmin calculation included the Xid of a
transaction running vacuum, we have gained nothing from directly
excluding said vacuum's Xid, because it will affect us anyway indirectly
via that transaction's Xmin.
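The indirect effect described above can be shown with a toy model. This
is only a sketch, not the patch's code: ToyProc, toy_oldest_xmin and
toy_scenario are hypothetical names, xids are compared as plain integers
(no wraparound handling), and locking and the database filter are left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PGPROC: only the fields this sketch needs. */
typedef struct ToyProc
{
    uint32_t xid;       /* backend's own transaction id, 0 if none */
    uint32_t xmin;      /* oldest xid its snapshot saw as running, 0 if none */
    bool     inVacuum;  /* true if the backend is running a lazy VACUUM */
} ToyProc;

/*
 * Simplified oldest-xmin scan. Assumptions: plain integer comparison,
 * no wraparound, no locking. "latest" is returned when no backend
 * contributes a smaller value.
 */
uint32_t
toy_oldest_xmin(const ToyProc *procs, size_t n, uint32_t latest,
                bool ignoreVacuum)
{
    uint32_t result = latest;

    assert(procs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (ignoreVacuum && procs[i].inVacuum)
            continue;           /* skip the lazy VACUUM's own entry */
        if (procs[i].xid != 0 && procs[i].xid < result)
            result = procs[i].xid;
        if (procs[i].xmin != 0 && procs[i].xmin < result)
            result = procs[i].xmin;
    }
    return result;
}

/*
 * One lazy VACUUM with xid 100 and one ordinary backend with xid 150.
 * The argument is the xmin the ordinary backend computed for itself.
 */
uint32_t
toy_scenario(uint32_t other_backend_xmin)
{
    ToyProc procs[2] = {
        { 100, 100, true },                 /* long-running lazy VACUUM */
        { 150, other_backend_xmin, false }  /* ordinary transaction */
    };

    return toy_oldest_xmin(procs, 2, 200, true);
}
```

In this model toy_scenario(100) returns 100: the VACUUM entry is skipped,
but the ordinary backend's xmin still carries the VACUUM's xid. Only when
that backend's own xmin was computed without the VACUUM, as in
toy_scenario(150), does the result move forward to 150.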
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
One fine day, Thu, 2006-07-27 at 22:05, Bruce Momjian wrote:
Another idea Jan had today was whether we could vacuum more rows if a
long-running backend is in serializable mode, like pg_dump.
I don't see how this gives us ability to vacuum more rows, as the
snapshot of a serializable transaction is the oldest one.
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
Hannu Krosing wrote:
One fine day, Thu, 2006-07-27 at 22:05, Bruce Momjian wrote:
Another idea Jan had today was whether we could vacuum more rows if a
long-running backend is in serializable mode, like pg_dump.
I don't see how this gives us ability to vacuum more rows, as the
snapshot of a serializable transaction is the oldest one.
Good question. Imagine you have a serializable transaction like
pg_dump, and then you have lots of newer transactions. If pg_dump is
xid=12, and all the new transactions start at xid=30, any row created
and expired between 12 and 30 can be removed because they are not
visible. For a use case, imagine an UPDATE chain where a row was
created by xid=15 and expired by xid=19. Right now, we don't remove that
row, though we could.
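The visibility claim in this example can be checked with a toy model.
This is a sketch under a strong simplifying assumption: each snapshot is
reduced to a single cutoff xid, and every xid below the cutoff is treated
as committed. ToyVersion and the toy_* functions are hypothetical names.
The model covers snapshot visibility only; it says nothing about whether
removing the row is safe for a transaction that later follows the update
chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy row version: creating xid and expiring xid (0 = not expired). */
typedef struct ToyVersion
{
    uint32_t created_by;
    uint32_t expired_by;
} ToyVersion;

/*
 * Assumption: a snapshot is one cutoff xid. Every xid below the cutoff
 * counts as committed and visible; every xid at or above it does not.
 */
bool
toy_visible(ToyVersion v, uint32_t cutoff)
{
    bool created_seen = v.created_by < cutoff;
    bool expired_seen = v.expired_by != 0 && v.expired_by < cutoff;

    return created_seen && !expired_seen;
}

/* True if at least one of the given snapshots can see the version. */
bool
toy_visible_to_any(ToyVersion v, const uint32_t *cutoffs, size_t n)
{
    assert(cutoffs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (toy_visible(v, cutoffs[i]))
            return true;
    }
    return false;
}

/* The numbers from the example: pg_dump at 12, newer xacts from 30 on. */
bool
toy_gap_example(uint32_t created_by, uint32_t expired_by)
{
    const uint32_t cutoffs[] = { 12, 30, 31, 35 };
    ToyVersion v = { created_by, expired_by };

    return toy_visible_to_any(v, cutoffs,
                              sizeof(cutoffs) / sizeof(cutoffs[0]));
}
```

Under these assumptions toy_gap_example(15, 19) is false: the row was
created after the old snapshot and expired before every newer one. A row
created by xid 10 and expired by xid 19 is still visible to the old
snapshot, so toy_gap_example(10, 19) is true.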
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
Good question. Imagine you have a serializable transaction like
pg_dump, and then you have lots of newer transactions. If pg_dump is
xid=12, and all the new transactions start at xid=30, any row created
and expired between 12 and 30 can be removed because they are not
visible.
This reasoning is bogus.
It would probably be safe for pg_dump because it's a read-only
operation, but it fails badly if the serializable transaction is trying
to do updates. An update needs to chase the chain of newer versions of
the row forward from the version that's visible to the xact's
serializable snapshot, to see if anyone has committed a newer version.
Your proposal would remove elements of that chain, thereby possibly
allowing the serializable xact to conclude it may update the tuple
when it should have given an error.
regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
Uh, why?
Because it's used to determine the Xmin that our vacuum will use. If
there is a transaction whose Xmin calculation included the Xid of a
transaction running vacuum, we have gained nothing from directly
excluding said vacuum's Xid, because it will affect us anyway indirectly
via that transaction's Xmin.
But the patch changes things so that *everyone* excludes the vacuum from
their xmin. Or at least I thought that was the plan.
regards, tom lane
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
Uh, why?
Because it's used to determine the Xmin that our vacuum will use. If
there is a transaction whose Xmin calculation included the Xid of a
transaction running vacuum, we have gained nothing from directly
excluding said vacuum's Xid, because it will affect us anyway indirectly
via that transaction's Xmin.
But the patch changes things so that *everyone* excludes the vacuum from
their xmin. Or at least I thought that was the plan.
We shouldn't do that, because that Xmin is also used to truncate
SUBTRANS. Unless we are prepared to say that vacuum does not use
subtransactions so it doesn't matter. This is true currently, so we
could go ahead and do it (unless I'm missing something) -- but it means
lazy vacuum will never be able to use subtransactions.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Good question. Imagine you have a serializable transaction like
pg_dump, and then you have lots of newer transactions. If pg_dump is
xid=12, and all the new transactions start at xid=30, any row created
and expired between 12 and 30 can be removed because they are not
visible.
This reasoning is bogus.
It would probably be safe for pg_dump because it's a read-only
operation, but it fails badly if the serializable transaction is trying
to do updates. An update needs to chase the chain of newer versions of
the row forward from the version that's visible to the xact's
serializable snapshot, to see if anyone has committed a newer version.
Your proposal would remove elements of that chain, thereby possibly
allowing the serializable xact to conclude it may update the tuple
when it should have given an error.
So in fact members of the chain are not visible, but vacuum doesn't have
a strong enough lock to remove parts of the chain. What seems strange
is that vacuum can trim the chain, but only if it removes members starting
from the head. I assume this is because you don't need to rejoin the
chain around the expired tuples.
("bogus" seems a little strong.)
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
But the patch changes things so that *everyone* excludes the vacuum from
their xmin. Or at least I thought that was the plan.
We shouldn't do that, because that Xmin is also used to truncate
SUBTRANS.
Yeah, but you were going to change that, no? Truncating SUBTRANS will
need to include the vacuum xact's xmin, but we don't need it for any
other purpose.
but it means
lazy vacuum will never be able to use subtransactions.
This patch already depends on the assumption that lazy vacuum will never
do any transactional updates, so I don't see what it would need
subtransactions for.
regards, tom lane
On Fri, Jul 28, 2006 at 03:08:08AM +0300, Hannu Krosing wrote:
The other POV is that we don't really care about long-running
transactions in other databases unless they are lazy vacuums, a case which
is appropriately covered by the patch as it currently stands. This seems
to be the POV that Hannu takes: the only long-running transactions he
cares about are lazy vacuums.
Yes. The original target audience of this patch is users running 24/7
OLTP databases with big, slow-changing tables and small, fast-changing
tables which need to stay small even while the big ones are being
vacuumed.
The other possible transactions which _could_ possibly be ignored while
VACUUMING are those from ANALYSE and non-lazy VACUUMs.
There are other transactions to consider: user transactions that will
run a long time, but only hit a limited number of relations. These are
as big a problem in an OLTP environment as vacuum is.
Rather than coming up with machinery that will special-case vacuum or
pg_dump, etc., I'd suggest thinking about a generic framework that would
work for any long-running transaction. One possibility:
Transaction flags itself as 'long-running' and provides a list of
exactly what relations it will be touching.
That list is stored someplace a future vacuum can get at.
The transaction runs, with additional checks that ensure it will not
touch any relations that aren't in the list it provided.
Any vacuums that start will take into account these lists of relations
from long-running transactions and build a list of XIDs that have
provided a list, and the minimum XID for every relation that was listed.
If vacuum wants to vacuum a relation that has been listed as part of a
long-running transaction, it will use the oldest XID in the
database/cluster or the oldest XID listed for that relation, whichever
is older. If it wants to vacuum a relation that is not listed, it will
use the oldest XID in the database/cluster, excluding those XIDs that
have listed exactly what relations they will be looking at.
That scheme won't help pg_dump... in order to do so, you'd need to allow
transactions to drop relations from their list.
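One way to read the scheme above, as a rough sketch: a vacuum of a given
relation skips every transaction that declared a relation list not
containing that relation, and takes the oldest xid among the rest. The
types and functions below (ToyXact, toy_vacuum_cutoff and so on) are
hypothetical, xids are compared as plain integers, and the enforcement
that a transaction stays within its list is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RELS 4

/* Hypothetical record of a running transaction and its declared list. */
typedef struct ToyXact
{
    uint32_t xid;
    bool     has_list;              /* flagged itself as long-running */
    uint32_t rels[TOY_MAX_RELS];    /* relations it promised to stay within */
    size_t   nrels;
} ToyXact;

/* True if the transaction's declared list contains relid. */
bool
toy_lists_relation(const ToyXact *x, uint32_t relid)
{
    assert(x != NULL && x->nrels <= TOY_MAX_RELS);
    for (size_t i = 0; i < x->nrels; i++)
    {
        if (x->rels[i] == relid)
            return true;
    }
    return false;
}

/*
 * Oldest xid a vacuum of relid must respect. Transactions that declared
 * a list not containing relid are ignored; "latest" is returned when
 * nothing older remains.
 */
uint32_t
toy_vacuum_cutoff(const ToyXact *xacts, size_t n, uint32_t latest,
                  uint32_t relid)
{
    uint32_t result = latest;

    assert(xacts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        const ToyXact *x = &xacts[i];

        if (x->has_list && !toy_lists_relation(x, relid))
            continue;           /* promised never to touch relid */
        if (x->xid < result)
            result = x->xid;
    }
    return result;
}

/* Example: xid 100 is restricted to relations 1 and 2, xid 150 is not. */
uint32_t
toy_example_cutoff(uint32_t relid)
{
    const ToyXact xacts[] = {
        { 100, true,  { 1, 2 }, 2 },    /* long-running, declared list */
        { 150, false, { 0 },    0 }     /* ordinary transaction */
    };

    return toy_vacuum_cutoff(xacts, 2, 200, relid);
}
```

Here a vacuum of relation 1 is held back to xid 100, while a vacuum of
relation 3, which the long-running transaction did not list, only has to
respect xid 150.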
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
jnasby@pervasive.com ("Jim C. Nasby") writes:
There are other transactions to consider: user transactions that will
run a long time, but only hit a limited number of relations. These are
as big a problem in an OLTP environment as vacuum is.
Rather than coming up with machinery that will special-case vacuum or
pg_dump, etc., I'd suggest thinking about a generic framework that would
work for any long-running transaction. One possibility:
Transaction flags itself as 'long-running' and provides a list of
exactly what relations it will be touching.
That list is stored someplace a future vacuum can get at.
The transaction runs, with additional checks that ensure it will not
touch any relations that aren't in the list it provided.
One thought that's a bit different...
How about we mark transactions that are in serializable mode? That
would merely be a flag...
We would know that, for each such transaction, we could treat all
tuples "deadified" after those transactions as being dead and
cleanable.
That doesn't require any knowledge of relations that are
touched/locked...
--
"cbbrowne","@","cbbrowne.com"
http://www.ntlug.org/~cbbrowne/nonrdbms.html
To err is human, to moo bovine.
One fine day, Fri, 2006-07-28 at 12:38, Jim C. Nasby wrote:
On Fri, Jul 28, 2006 at 03:08:08AM +0300, Hannu Krosing wrote:
The other POV is that we don't really care about long-running
transactions in other databases unless they are lazy vacuums, a case which
is appropriately covered by the patch as it currently stands. This seems
to be the POV that Hannu takes: the only long-running transactions he
cares about are lazy vacuums.
Yes. The original target audience of this patch is users running 24/7
OLTP databases with big, slow-changing tables and small, fast-changing
tables which need to stay small even while the big ones are being
vacuumed.
The other possible transactions which _could_ possibly be ignored while
VACUUMING are those from ANALYSE and non-lazy VACUUMs.
There are other transactions to consider: user transactions that will
run a long time, but only hit a limited number of relations. These are
as big a problem in an OLTP environment as vacuum is.
These transactions are better kept out of an OLTP database, by their
nature they belong to OLAP db :)
The reason I addressed VACUUM first is that you can't avoid VACUUM on
an OLTP db.
Rather than coming up with machinery that will special-case vacuum or
pg_dump, etc., I'd suggest thinking about a generic framework that would
work for any long-running transaction.
So instead of actually *solving* one problem you suggest *thinking*
about solving the general case ?
We have been *thinking* about dead-space-map for at least three years by
now.
One possibility:
Transaction flags itself as 'long-running' and provides a list of
exactly what relations it will be touching.
That list is stored someplace a future vacuum can get at.
The transaction runs, with additional checks that ensure it will not
touch any relations that aren't in the list it provided.
I have thought about that too, but checking on each data change seemed
too expensive to me, at least for the first cut.
There seem to be some ways to avoid actual checking for table-in-list,
but you still have to check whether you have to check.
Any vacuums that start will take into account these lists of relations
from long-running transactions and build a list of XIDs that have
provided a list, and the minimum XID for every relation that was listed.
If vacuum wants to vacuum a relation that has been listed as part of a
long-running transaction, it will use the oldest XID in the
database/cluster or the oldest XID listed for that relation, whichever
is older. If it wants to vacuum a relation that is not listed, it will
use the oldest XID in the database/cluster, excluding those XIDs that
have listed exactly what relations they will be looking at.
That scheme won't help pg_dump... in order to do so, you'd need to allow
transactions to drop relations from their list.
The whole thing is probably doable, but I doubt it will be done before
8.2 (or even 8.5, considering that I had the first vacuum-ignore-vacuum
patch ready by 8.0, I think).
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
On Jul 28, 2006, at 5:05 PM, Hannu Krosing wrote:
One fine day, Fri, 2006-07-28 at 12:38, Jim C. Nasby wrote:
There are other transactions to consider: user transactions that will
run a long time, but only hit a limited number of relations. These are
as big a problem in an OLTP environment as vacuum is.
These transactions are better kept out of an OLTP database, by their
nature they belong to OLAP db :)
Sure, but that's not always possible/practical.
Rather than coming up with machinery that will special-case vacuum or
pg_dump, etc., I'd suggest thinking about a generic framework that would
work for any long-running transaction.
So instead of actually *solving* one problem you suggest *thinking*
about solving the general case ?
We have been *thinking* about dead-space-map for at least three years by
now.
No, I just wanted anyone who was actually going to work on this to
think about a more general fix. If the vacuum-only fix has a chance
of getting into core a version before the general case, I'll happily
take what I can get.
One possibility:
Transaction flags itself as 'long-running' and provides a list of
exactly what relations it will be touching.
That list is stored someplace a future vacuum can get at.
The transaction runs, with additional checks that ensure it will not
touch any relations that aren't in the list it provided.
I have thought about that too, but checking on each data change seemed
too expensive to me, at least for the first cut.
There seem to be some ways to avoid actual checking for table-in-list,
but you still have to check whether you have to check.
Well, presumably the check to see if you have to check would be
extremely cheap. As for checking that only approved relations are
touched, you can do that by analyzing the rules/triggers/etc that are
on all the tables involved. Or for a start, just disallow this on
tables with rules or triggers (well, we'd probably have to allow for
RI).
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
But the patch changes things so that *everyone* excludes the vacuum from
their xmin. Or at least I thought that was the plan.
We shouldn't do that, because that Xmin is also used to truncate
SUBTRANS.
Yeah, but you were going to change that, no? Truncating SUBTRANS will
need to include the vacuum xact's xmin, but we don't need it for any
other purpose.
That's correct.
but it means
lazy vacuum will never be able to use subtransactions.
This patch already depends on the assumption that lazy vacuum will never
do any transactional updates, so I don't see what it would need
subtransactions for.
Here is a patch pursuant to these ideas. The main change is that in
GetSnapshotData, a backend is skipped entirely if inVacuum is found to
be true.
I've been trying to update my SSH CVS several times today but I can't
reach the server. Maybe it's the DoS attack that it's been under, I
don't know.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment: ignore-vacuum-8.patch (text/plain; charset=us-ascii)
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.21
diff -c -p -r1.21 twophase.c
*** src/backend/access/transam/twophase.c 14 Jul 2006 14:52:17 -0000 1.21
--- src/backend/access/transam/twophase.c 28 Jul 2006 21:59:42 -0000
*************** MarkAsPreparing(TransactionId xid, const
*** 279,284 ****
--- 279,285 ----
gxact->proc.pid = 0;
gxact->proc.databaseId = databaseid;
gxact->proc.roleId = owner;
+ gxact->proc.inVacuum = false;
gxact->proc.lwWaiting = false;
gxact->proc.lwExclusive = false;
gxact->proc.lwWaitLink = NULL;
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.224
diff -c -p -r1.224 xact.c
*** src/backend/access/transam/xact.c 24 Jul 2006 16:32:44 -0000 1.224
--- src/backend/access/transam/xact.c 28 Jul 2006 21:59:42 -0000
*************** CommitTransaction(void)
*** 1529,1534 ****
--- 1529,1535 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
*************** PrepareTransaction(void)
*** 1764,1769 ****
--- 1765,1771 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
*************** AbortTransaction(void)
*** 1927,1932 ****
--- 1929,1935 ----
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->xid = InvalidTransactionId;
MyProc->xmin = InvalidTransactionId;
+ MyProc->inVacuum = false; /* must be cleared with xid/xmin */
/* Clear the subtransaction-XID cache too while holding the lock */
MyProc->subxids.nxids = 0;
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.244
diff -c -p -r1.244 xlog.c
*** src/backend/access/transam/xlog.c 14 Jul 2006 14:52:17 -0000 1.244
--- src/backend/access/transam/xlog.c 28 Jul 2006 21:59:42 -0000
*************** CreateCheckPoint(bool shutdown, bool for
*** 5413,5419 ****
* StartupSUBTRANS hasn't been called yet.
*/
if (!InRecovery)
! TruncateSUBTRANS(GetOldestXmin(true));
if (!shutdown)
ereport(DEBUG2,
--- 5413,5419 ----
* StartupSUBTRANS hasn't been called yet.
*/
if (!InRecovery)
! TruncateSUBTRANS(GetOldestXmin(true, false));
if (!shutdown)
ereport(DEBUG2,
Index: src/backend/catalog/index.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.269
diff -c -p -r1.269 index.c
*** src/backend/catalog/index.c 13 Jul 2006 16:49:13 -0000 1.269
--- src/backend/catalog/index.c 28 Jul 2006 21:59:42 -0000
*************** IndexBuildHeapScan(Relation heapRelation
*** 1367,1373 ****
else
{
snapshot = SnapshotAny;
! OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared);
}
scan = heap_beginscan(heapRelation, /* relation */
--- 1367,1374 ----
else
{
snapshot = SnapshotAny;
! /* okay to ignore lazy VACUUMs here */
! OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
}
scan = heap_beginscan(heapRelation, /* relation */
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.335
diff -c -p -r1.335 vacuum.c
*** src/backend/commands/vacuum.c 14 Jul 2006 14:52:18 -0000 1.335
--- src/backend/commands/vacuum.c 28 Jul 2006 21:59:42 -0000
***************
*** 37,42 ****
--- 37,43 ----
#include "postmaster/autovacuum.h"
#include "storage/freespace.h"
#include "storage/pmsignal.h"
+ #include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/acl.h"
#include "utils/builtins.h"
*************** vacuum_set_xid_limits(VacuumStmt *vacstm
*** 589,595 ****
{
TransactionId limit;
! *oldestXmin = GetOldestXmin(sharedRel);
Assert(TransactionIdIsNormal(*oldestXmin));
--- 590,605 ----
{
TransactionId limit;
! /*
! * We can always ignore processes running lazy vacuum. This is because we
! * use these values only for deciding which tuples we must keep in the
! * tables. Since lazy vacuum doesn't write its xid to the table, it's
! * safe to ignore it. In theory it could be problematic to ignore lazy
! * vacuums on a full vacuum, but keep in mind that only one vacuum process
! * can be working on a particular table at any time, and that each vacuum
! * is always an independent transaction.
! */
! *oldestXmin = GetOldestXmin(sharedRel, true);
Assert(TransactionIdIsNormal(*oldestXmin));
*************** vacuum_set_xid_limits(VacuumStmt *vacstm
*** 645,650 ****
--- 655,665 ----
* pg_class would've been obsoleted. Of course, this only works for
* fixed-size never-null columns, but these are.
*
+ * Another reason for doing it this way is that when we are in a lazy
+ * VACUUM and have inVacuum set, we mustn't do any updates --- somebody
+ * vacuuming pg_class might think they could delete a tuple marked with
+ * xmin = our xid.
+ *
* This routine is shared by full VACUUM, lazy VACUUM, and stand-alone
* ANALYZE.
*/
*************** vacuum_rel(Oid relid, VacuumStmt *vacstm
*** 996,1003 ****
/* Begin a transaction for vacuuming this relation */
StartTransactionCommand();
! /* functions in indexes may want a snapshot set */
! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
/*
* Tell the cache replacement strategy that vacuum is causing all
--- 1011,1045 ----
/* Begin a transaction for vacuuming this relation */
StartTransactionCommand();
!
! if (vacstmt->full)
! {
! /* functions in indexes may want a snapshot set */
! ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
! }
! else
! {
! /*
! * During a lazy VACUUM we do not run any user-supplied functions,
! * and so it should be safe to not create a transaction snapshot.
! *
! * We can furthermore set the inVacuum flag, which lets other
! * concurrent VACUUMs know that they can ignore this one while
! * determining their OldestXmin. (The reason we don't set inVacuum
! * during a full VACUUM is exactly that we may have to run user-
! * defined functions for functional indexes, and we want to make
! * sure that if they use the snapshot set above, any tuples it
! * requires can't get removed from other tables. An index function
! * that depends on the contents of other tables is arguably broken,
! * but we won't break it here by violating transaction semantics.)
! *
! * Note: the inVacuum flag remains set until CommitTransaction or
! * AbortTransaction. We don't want to clear it until we reset
! * MyProc->xid/xmin, else OldestXmin might appear to go backwards,
! * which is probably Not Good.
! */
! MyProc->inVacuum = true;
! }
/*
* Tell the cache replacement strategy that vacuum is causing all
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.14
diff -c -p -r1.14 procarray.c
*** src/backend/storage/ipc/procarray.c 14 Jul 2006 14:52:22 -0000 1.14
--- src/backend/storage/ipc/procarray.c 28 Jul 2006 21:59:42 -0000
*************** TransactionIdIsActive(TransactionId xid)
*** 388,407 ****
* If allDbs is TRUE then all backends are considered; if allDbs is FALSE
* then only backends running in my own database are considered.
*
* This is used by VACUUM to decide which deleted tuples must be preserved
* in a table. allDbs = TRUE is needed for shared relations, but allDbs =
* FALSE is sufficient for non-shared relations, since only backends in my
! * own database could ever see the tuples in them.
*
* This is also used to determine where to truncate pg_subtrans. allDbs
! * must be TRUE for that case.
*
* Note: we include the currently running xids in the set of considered xids.
* This ensures that if a just-started xact has not yet set its snapshot,
* when it does set the snapshot it cannot set xmin less than what we compute.
*/
TransactionId
! GetOldestXmin(bool allDbs)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
--- 388,411 ----
* If allDbs is TRUE then all backends are considered; if allDbs is FALSE
* then only backends running in my own database are considered.
*
+ * If ignoreVacuum is TRUE then backends with inVacuum set are ignored.
+ *
* This is used by VACUUM to decide which deleted tuples must be preserved
* in a table. allDbs = TRUE is needed for shared relations, but allDbs =
* FALSE is sufficient for non-shared relations, since only backends in my
! * own database could ever see the tuples in them. Also, we can ignore
! * concurrently running lazy VACUUMs because (a) they must be working on other
! * tables, and (b) they don't need to do snapshot-based lookups.
*
* This is also used to determine where to truncate pg_subtrans. allDbs
! * must be TRUE for that case, and ignoreVacuum FALSE.
*
* Note: we include the currently running xids in the set of considered xids.
* This ensures that if a just-started xact has not yet set its snapshot,
* when it does set the snapshot it cannot set xmin less than what we compute.
*/
TransactionId
! GetOldestXmin(bool allDbs, bool ignoreVacuum)
{
ProcArrayStruct *arrayP = procArray;
TransactionId result;
*************** GetOldestXmin(bool allDbs)
*** 425,430 ****
--- 429,437 ----
{
PGPROC *proc = arrayP->procs[index];
+ if (ignoreVacuum && proc->inVacuum)
+ continue;
+
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xid just once - see GetNewTransactionId */
*************** GetOldestXmin(bool allDbs)
*** 432,439 ****
--- 439,456 ----
if (TransactionIdIsNormal(xid))
{
+ /* First consider the transaction's own Xid */
if (TransactionIdPrecedes(xid, result))
result = xid;
+
+ /*
+ * Also consider the transaction's Xmin, if set.
+ *
+ * Note that this Xmin may seem to be guaranteed to be always
+ * lower than the transaction's Xid, but this is not so because
+ * there is a time window in which the Xid is already assigned
+ * but the Xmin has not been calculated yet.
+ */
xid = proc->xmin;
if (TransactionIdIsNormal(xid))
if (TransactionIdPrecedes(xid, result))
*************** GetOldestXmin(bool allDbs)
*** 471,478 ****
* RecentXmin: the xmin computed for the most recent snapshot. XIDs
* older than this are known not running any more.
* RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
! * running transactions). This is the same computation done by
! * GetOldestXmin(TRUE).
*----------
*/
Snapshot
--- 488,495 ----
* RecentXmin: the xmin computed for the most recent snapshot. XIDs
* older than this are known not running any more.
* RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
! * running transactions, except those running LAZY VACUUM). This is
! * the same computation done by GetOldestXmin(true, false).
*----------
*/
Snapshot
*************** GetSnapshotData(Snapshot snapshot, bool
*** 561,575 ****
/*
* Ignore my own proc (dealt with my xid above), procs not running a
! * transaction, and xacts started since we read the next transaction
! * ID. There's no need to store XIDs above what we got from
! * ReadNewTransactionId, since we'll treat them as running anyway. We
! * also assume that such xacts can't compute an xmin older than ours,
! * so they needn't be considered in computing globalxmin.
*/
if (proc == MyProc ||
!TransactionIdIsNormal(xid) ||
! TransactionIdFollowsOrEquals(xid, xmax))
continue;
if (TransactionIdPrecedes(xid, xmin))
--- 578,594 ----
/*
* Ignore my own proc (dealt with my xid above), procs not running a
! * transaction, xacts started since we read the next transaction
! * ID, and xacts executing LAZY VACUUM. There's no need to store XIDs
! * above what we got from ReadNewTransactionId, since we'll treat them
! * as running anyway. We also assume that such xacts can't compute an
! * xmin older than ours, so they needn't be considered in computing
! * globalxmin.
*/
if (proc == MyProc ||
!TransactionIdIsNormal(xid) ||
! TransactionIdFollowsOrEquals(xid, xmax) ||
! proc->inVacuum)
continue;
if (TransactionIdPrecedes(xid, xmin))
Index: src/backend/storage/lmgr/proc.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.178
diff -c -p -r1.178 proc.c
*** src/backend/storage/lmgr/proc.c 23 Jul 2006 23:08:46 -0000 1.178
--- src/backend/storage/lmgr/proc.c 28 Jul 2006 21:59:42 -0000
*************** InitProcess(void)
*** 257,262 ****
--- 257,263 ----
/* databaseId and roleId will be filled in later */
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
+ MyProc->inVacuum = false;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
*************** InitDummyProcess(void)
*** 388,393 ****
--- 389,395 ----
MyProc->xmin = InvalidTransactionId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
+ MyProc->inVacuum = false;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
Index: src/include/storage/proc.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/proc.h,v
retrieving revision 1.89
diff -c -p -r1.89 proc.h
*** src/include/storage/proc.h 13 Jul 2006 16:49:20 -0000 1.89
--- src/include/storage/proc.h 28 Jul 2006 21:59:42 -0000
*************** struct PGPROC
*** 66,78 ****
* this proc */
TransactionId xmin; /* minimal running XID as it was when we were
! * starting our xact: vacuum must not remove
! * tuples deleted by xid >= xmin ! */
int pid; /* This backend's process id, or 0 */
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
/* Info about LWLock the process is currently waiting for, if any. */
bool lwWaiting; /* true if waiting for an LW lock */
bool lwExclusive; /* true if waiting for exclusive access */
--- 66,81 ----
* this proc */
TransactionId xmin; /* minimal running XID as it was when we were
! * starting our xact, excluding LAZY VACUUM:
! * vacuum must not remove tuples deleted by
! * xid >= xmin ! */
int pid; /* This backend's process id, or 0 */
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
+ bool inVacuum; /* true if current xact is a LAZY VACUUM */
+
/* Info about LWLock the process is currently waiting for, if any. */
bool lwWaiting; /* true if waiting for an LW lock */
bool lwExclusive; /* true if waiting for exclusive access */
Index: src/include/storage/procarray.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/procarray.h,v
retrieving revision 1.9
diff -c -p -r1.9 procarray.h
*** src/include/storage/procarray.h 19 Jun 2006 01:51:22 -0000 1.9
--- src/include/storage/procarray.h 28 Jul 2006 21:59:42 -0000
*************** extern void ProcArrayRemove(PGPROC *proc
*** 24,30 ****
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs);
extern PGPROC *BackendPidGetProc(int pid);
extern int BackendXidGetPid(TransactionId xid);
--- 24,30 ----
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);
extern PGPROC *BackendPidGetProc(int pid);
extern int BackendXidGetPid(TransactionId xid);
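For readers who want to see the effect of the new ignoreVacuum argument outside the backend, here is a minimal standalone sketch of the loop in GetOldestXmin. It is not the patched code itself: MockProc, xid_is_normal, xid_precedes and mock_oldest_xmin are hypothetical names made up for this illustration, the proc array is a plain C array, no locking is done, and the caller passes its own xid as the starting value instead of calling into the transaction machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* XIDs below 3 are reserved (invalid, bootstrap, frozen). */
#define FirstNormalTransactionId ((TransactionId) 3)

/* Hypothetical, stripped-down stand-in for PGPROC (illustration only). */
typedef struct MockProc
{
    TransactionId xid;        /* 0 if no transaction is running */
    TransactionId xmin;       /* 0 if no snapshot has been taken yet */
    uint32_t      databaseId;
    bool          inVacuum;   /* true if current xact is a lazy VACUUM */
} MockProc;

static bool
xid_is_normal(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}

/* Wraparound-aware "a is older than b" for two normal XIDs. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Sketch of the GetOldestXmin loop with the ignoreVacuum test added.
 * Assumption: myXid stands in for the caller's own transaction id.
 */
static TransactionId
mock_oldest_xmin(const MockProc *procs, size_t nprocs,
                 TransactionId myXid, uint32_t myDatabaseId,
                 bool allDbs, bool ignoreVacuum)
{
    TransactionId result = myXid;
    size_t        i;

    assert(xid_is_normal(myXid));

    for (i = 0; i < nprocs; i++)
    {
        const MockProc *proc = &procs[i];

        /* The new test: lazy vacuums do not hold back other vacuums. */
        if (ignoreVacuum && proc->inVacuum)
            continue;

        if (allDbs || proc->databaseId == myDatabaseId)
        {
            TransactionId xid = proc->xid;

            if (xid_is_normal(xid))
            {
                /* First consider the transaction's own xid ... */
                if (xid_precedes(xid, result))
                    result = xid;

                /* ... then its xmin, if it has been set. */
                xid = proc->xmin;
                if (xid_is_normal(xid) && xid_precedes(xid, result))
                    result = xid;
            }
        }
    }
    return result;
}
```

With one long-running lazy VACUUM (xid 100, xmin 90) and one ordinary transaction (xid 150, xmin 140) in the same database, the sketch returns 90 when ignoreVacuum is false and 140 when it is true, which is the whole point of the patch: the small table's vacuum can remove tuples that only the big vacuum was holding back.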
Jim Nasby wrote:
On Jul 28, 2006, at 5:05 PM, Hannu Krosing wrote:
So instead of actually *solving* one problem you suggest *thinking*
about solving the general case?
We have been *thinking* about dead-space-map for at least three
years by now.
No, I just wanted anyone who was actually going to work on this to
think about a more general fix. If the vacuum-only fix has a chance
of getting into core a version before the general case, I'll happily
take what I can get.
Well, the vacuum-only fix has the advantage that the patch has already
been written, tested, discussed, beaten to death, resurrected,
rewritten, and is ready to be committed, while the "general solution" is
not even past the handwaving phase, let alone *thinking*.
And we have only three days before feature freeze, so if you want the
general solution for 8.2 you should start *thinking* really fast :-)
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote:
Here is a patch pursuant to these ideas. The main change is that in
GetSnapshotData, a backend is skipped entirely if inVacuum is found to
be true.
Patch applied.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.