Storing hot members of PGPROC out of the band
Hi All,
While working on some performance issues on HP-UX, I noticed a
significant number of data cache misses when accessing PGPROC members. On
closer inspection, it was quite evident that for a large number of clients
(even a few tens), the loop inside GetSnapshotData causes a data cache miss
for almost every PGPROC, because the PGPROC structure is quite heavy
and no more than one structure fits in a single cache line. So I
experimented with separating the most frequently and closely accessed
members of PGPROC into an out-of-band array. I call it the
PGPROC_MINIMAL structure; it contains xid, xmin and vacuumFlags, amongst
others. Basically, all the members commonly accessed by
GetSnapshotData find a place in this minimal structure.
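
To illustrate the idea outside PostgreSQL, here is a minimal sketch of such a hot/cold split. The type names (HotProc, ColdProc), the field widths, the 64-byte cache line and the plain `<` comparison are all assumptions made for illustration; the real PGPROC_MINIMAL layout is whatever the attached patch defines, and real XID comparisons are wraparound-aware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; real hardware varies. */
#define CACHE_LINE_BYTES 64

typedef uint32_t TransactionId;

/* Hypothetical "hot" part: only the fields the snapshot loop reads. */
typedef struct HotProc
{
    TransactionId xid;
    TransactionId xmin;
    uint8_t       vacuumFlags;
    uint8_t       overflowed;
    uint8_t       inCommit;
    uint8_t       nxids;
} HotProc;

/* Hypothetical "cold" part: stands in for the rest of a heavy struct. */
typedef struct ColdProc
{
    char     rest[512];   /* locks, semaphores, subxid cache, ... */
    HotProc *hot;         /* back-pointer to the matching hot entry */
} ColdProc;

/* How many hot entries fit in one (assumed) cache line. */
static size_t
hot_entries_per_line(void)
{
    return CACHE_LINE_BYTES / sizeof(HotProc);
}

/*
 * Scan only the dense hot array and return the smallest valid xmin
 * (0 means "invalid" here). Simplification: plain integer comparison,
 * not the wraparound-aware comparison a real XID needs.
 */
static TransactionId
min_valid_xmin(const HotProc *hot, size_t n)
{
    TransactionId result = 0;
    size_t        i;

    assert(hot != NULL || n == 0);
    for (i = 0; i < n; i++)
    {
        TransactionId x = hot[i].xmin;

        if (x != 0 && (result == 0 || x < result))
            result = x;
    }
    return result;
}
```

With the hot fields packed like this, several entries share one cache line, whereas a scan over the heavy structures touches at least one line per entry.
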
When the PGPROC array is allocated, we also allocate another array of
PGPROC_MINIMAL structures of the same size. While accessing the
ProcArray, simple pointer arithmetic gets us the corresponding
PGPROC_MINIMAL structure. The only exception is the dummy PGPROC
for a prepared transaction. A first-cut version of the patch is
attached. It looks big, but most of the changes are cosmetic because
of the added indirection. The patch also contains another change: it keeps
the ProcArray sorted by (PGPROC *) to preserve locality of reference
while traversing the array.
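
As a standalone sketch of these two mechanisms (not the patch code itself), the lookup and the sorted insert can be written as below. The names Proc, Hot, ProcPool, proc_get_hot and sorted_insert are hypothetical. Unlike the macro in the patch, the range check goes through uintptr_t, since relational comparison of pointers into different objects is not defined by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct Hot { unsigned xid; } Hot;
typedef struct Proc { char cold[256]; Hot *hot; } Proc;

typedef struct ProcPool
{
    Proc  *all;       /* contiguous array of procs */
    Hot   *all_hot;   /* parallel array, same length and order */
    size_t count;
} ProcPool;

/*
 * Map a proc to its hot entry. Procs inside the pool are resolved from
 * their array index alone, so the cold struct is never dereferenced.
 * Procs outside the pool (the dummy procs of prepared transactions in
 * the patch) fall back to the stored pointer.
 */
static Hot *
proc_get_hot(const ProcPool *pool, Proc *proc)
{
    uintptr_t p  = (uintptr_t) proc;
    uintptr_t lo = (uintptr_t) pool->all;
    uintptr_t hi = (uintptr_t) (pool->all + pool->count);

    if (p >= lo && p < hi)
        return &pool->all_hot[proc - pool->all];
    return proc->hot;
}

/*
 * Insert proc into a pointer array kept sorted by address.
 * The caller must guarantee room for one more element.
 */
static void
sorted_insert(Proc **arr, size_t *n, Proc *proc)
{
    size_t i = 0;

    assert(arr != NULL && n != NULL);
    while (i < *n && (uintptr_t) arr[i] < (uintptr_t) proc)
        i++;
    memmove(&arr[i + 1], &arr[i], (*n - i) * sizeof(Proc *));
    arr[i] = proc;
    (*n)++;
}
```

Keeping the pointer array sorted makes each insert and remove O(n), but a traversal then visits the hot array in increasing address order.
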
I did some tests on a 32-core IA HP-UX box and the results are quite
good. With a scale factor of 100 and the -N option of pgbench (updates on
only the accounts table), the numbers look something like this:
Clients    HEAD            PGPROC-Patched    Gain
1           1098.488663     1059.830369      -3.52%
4           3569.481435     3663.898254       2.65%
32         11627.059228    16419.864056      41.22%
48         11044.501244    15825.132582      43.29%
64         10432.206525    15408.50304       47.70%
80         10210.57835     15170.614435      48.58%
The numbers are quite reproducible, with a couple of percentage points of
variance, so for a single client I sometimes see no degradation at all.
Here are some more numbers from the normal pgbench test (without the -N
option).
Clients    HEAD    PGPROC-Patched    Gain
1           743     771               3.77%
4          1821    2315              27.13%
32         8011    9166              14.42%
48         7282    8959              23.03%
64         6742    8937              32.56%
80         6316    8664              37.18%
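
For reference, the Gain column in both tables is consistent with the relative change of the patched number over HEAD. A small sketch of that arithmetic (the function name gain_percent is made up for illustration):

```c
#include <assert.h>

/*
 * Relative improvement of the patched result over the baseline,
 * in percent. A negative value means the patched run was slower.
 */
static double
gain_percent(double head, double patched)
{
    assert(head > 0.0);
    return (patched - head) / head * 100.0;
}
```

For example, gain_percent(743, 771) comes to roughly 3.77, matching the single-client row of the second table.
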
It's quite possible that the effect of the patch is more evident on the
particular hardware that I am testing on. But the approach nevertheless
seems reasonable. It would be very useful if someone else with access to
a large box could test the effect of the patch.
BTW, since I played with many versions of the patch, the exact numbers
with this version might be a little different from what I posted
above. I will conduct more tests, especially with a larger number of
clients, and see if there is any difference in the improvement.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
Attachments:
pgproc-minimal-v13.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 477982d..b907f72 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -114,6 +114,7 @@ int max_prepared_xacts = 0;
typedef struct GlobalTransactionData
{
PGPROC proc; /* dummy proc */
+ PGPROC_MINIMAL proc_minimal; /* dummy proc_minimal */
BackendId dummyBackendId; /* similar to backend id for backends */
TimestampTz prepared_at; /* time of preparation */
XLogRecPtr prepare_lsn; /* XLOG offset of prepare record */
@@ -223,6 +224,9 @@ TwoPhaseShmemInit(void)
* technique.
*/
gxacts[i].dummyBackendId = MaxBackends + 1 + i;
+
+ /* Initialize minimal proc structure from the global structure */
+ gxacts[i].proc.proc_minimal = &gxacts[i].proc_minimal;
}
}
else
@@ -310,14 +314,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
gxact->proc.waitStatus = STATUS_OK;
/* We set up the gxact's VXID as InvalidBackendId/XID */
gxact->proc.lxid = (LocalTransactionId) xid;
- gxact->proc.xid = xid;
- gxact->proc.xmin = InvalidTransactionId;
+ gxact->proc.proc_minimal = &gxact->proc_minimal;
+ gxact->proc.proc_minimal->xid = xid;
+ gxact->proc.proc_minimal->xmin = InvalidTransactionId;
gxact->proc.pid = 0;
gxact->proc.backendId = InvalidBackendId;
gxact->proc.databaseId = databaseid;
gxact->proc.roleId = owner;
- gxact->proc.inCommit = false;
- gxact->proc.vacuumFlags = 0;
+ gxact->proc.proc_minimal->inCommit = false;
+ gxact->proc.proc_minimal->vacuumFlags = 0;
gxact->proc.lwWaiting = false;
gxact->proc.lwExclusive = false;
gxact->proc.lwWaitLink = NULL;
@@ -326,8 +331,8 @@ MarkAsPreparing(TransactionId xid, const char *gid,
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
SHMQueueInit(&(gxact->proc.myProcLocks[i]));
/* subxid data must be filled later by GXactLoadSubxactData */
- gxact->proc.subxids.overflowed = false;
- gxact->proc.subxids.nxids = 0;
+ gxact->proc.proc_minimal->overflowed = false;
+ gxact->proc.proc_minimal->nxids = 0;
gxact->prepared_at = prepared_at;
/* initialize LSN to 0 (start of WAL) */
@@ -361,14 +366,14 @@ GXactLoadSubxactData(GlobalTransaction gxact, int nsubxacts,
/* We need no extra lock since the GXACT isn't valid yet */
if (nsubxacts > PGPROC_MAX_CACHED_SUBXIDS)
{
- gxact->proc.subxids.overflowed = true;
+ gxact->proc.proc_minimal->overflowed = true;
nsubxacts = PGPROC_MAX_CACHED_SUBXIDS;
}
if (nsubxacts > 0)
{
memcpy(gxact->proc.subxids.xids, children,
nsubxacts * sizeof(TransactionId));
- gxact->proc.subxids.nxids = nsubxacts;
+ gxact->proc.proc_minimal->nxids = nsubxacts;
}
}
@@ -519,7 +524,7 @@ TransactionIdIsPrepared(TransactionId xid)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
- if (gxact->valid && gxact->proc.xid == xid)
+ if (gxact->valid && gxact->proc_minimal.xid == xid)
{
result = true;
break;
@@ -656,7 +661,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
MemSet(values, 0, sizeof(values));
MemSet(nulls, 0, sizeof(nulls));
- values[0] = TransactionIdGetDatum(gxact->proc.xid);
+ values[0] = TransactionIdGetDatum(gxact->proc_minimal.xid);
values[1] = CStringGetTextDatum(gxact->gid);
values[2] = TimestampTzGetDatum(gxact->prepared_at);
values[3] = ObjectIdGetDatum(gxact->owner);
@@ -712,7 +717,7 @@ TwoPhaseGetDummyProc(TransactionId xid)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
- if (gxact->proc.xid == xid)
+ if (gxact->proc_minimal.xid == xid)
{
result = &gxact->proc;
break;
@@ -841,7 +846,7 @@ save_state_data(const void *data, uint32 len)
void
StartPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ TransactionId xid = gxact->proc_minimal.xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
RelFileNode *commitrels;
@@ -913,7 +918,7 @@ StartPrepare(GlobalTransaction gxact)
void
EndPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ TransactionId xid = gxact->proc_minimal.xid;
TwoPhaseFileHeader *hdr;
char path[MAXPGPATH];
XLogRecData *record;
@@ -1021,7 +1026,7 @@ EndPrepare(GlobalTransaction gxact)
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyProc->proc_minimal->inCommit = true;
gxact->prepare_lsn = XLogInsert(RM_XACT_ID, XLOG_XACT_PREPARE,
records.head);
@@ -1069,7 +1074,7 @@ EndPrepare(GlobalTransaction gxact)
* checkpoint starting after this will certainly see the gxact as a
* candidate for fsyncing.
*/
- MyProc->inCommit = false;
+ MyProc->proc_minimal->inCommit = false;
END_CRIT_SECTION();
@@ -1260,7 +1265,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
* try to commit the same GID at once.
*/
gxact = LockGXact(gid, GetUserId());
- xid = gxact->proc.xid;
+ xid = gxact->proc_minimal.xid;
/*
* Read and validate the state file
@@ -1543,7 +1548,7 @@ CheckPointTwoPhase(XLogRecPtr redo_horizon)
if (gxact->valid &&
XLByteLE(gxact->prepare_lsn, redo_horizon))
- xids[nxids++] = gxact->proc.xid;
+ xids[nxids++] = gxact->proc_minimal.xid;
}
LWLockRelease(TwoPhaseStateLock);
@@ -1972,7 +1977,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
START_CRIT_SECTION();
/* See notes in RecordTransactionCommit */
- MyProc->inCommit = true;
+ MyProc->proc_minimal->inCommit = true;
/* Emit the XLOG commit record */
xlrec.xid = xid;
@@ -2037,7 +2042,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
TransactionIdCommitTree(xid, nchildren, children);
/* Checkpoint can proceed now */
- MyProc->inCommit = false;
+ MyProc->proc_minimal->inCommit = false;
END_CRIT_SECTION();
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 61dcfed..0effa1a 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -54,7 +54,7 @@ GetNewTransactionId(bool isSubXact)
if (IsBootstrapProcessingMode())
{
Assert(!isSubXact);
- MyProc->xid = BootstrapTransactionId;
+ MyProc->proc_minimal->xid = BootstrapTransactionId;
return BootstrapTransactionId;
}
@@ -210,18 +210,18 @@ GetNewTransactionId(bool isSubXact)
volatile PGPROC *myproc = MyProc;
if (!isSubXact)
- myproc->xid = xid;
+ myproc->proc_minimal->xid = xid;
else
{
- int nxids = myproc->subxids.nxids;
+ int nxids = myproc->proc_minimal->nxids;
if (nxids < PGPROC_MAX_CACHED_SUBXIDS)
{
myproc->subxids.xids[nxids] = xid;
- myproc->subxids.nxids = nxids + 1;
+ myproc->proc_minimal->nxids = nxids + 1;
}
else
- myproc->subxids.overflowed = true;
+ myproc->proc_minimal->overflowed = true;
}
}
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c151d3b..838bd23 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -981,7 +981,7 @@ RecordTransactionCommit(void)
* bit fuzzy, but it doesn't matter.
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyProc->proc_minimal->inCommit = true;
SetCurrentTransactionStopTimestamp();
@@ -1155,7 +1155,7 @@ RecordTransactionCommit(void)
*/
if (markXidCommitted)
{
- MyProc->inCommit = false;
+ MyProc->proc_minimal->inCommit = false;
END_CRIT_SECTION();
}
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 32985a4..a6ee452 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -223,7 +223,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* OK, let's do it. First let other backends know I'm in ANALYZE.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_ANALYZE;
+ MyProc->proc_minimal->vacuumFlags |= PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
/*
@@ -250,7 +250,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* because the vacuum flag is cleared by the end-of-xact code.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags &= ~PROC_IN_ANALYZE;
+ MyProc->proc_minimal->vacuumFlags &= ~PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f42504c..c85c002 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -893,9 +893,9 @@ vacuum_rel(Oid relid, VacuumStmt *vacstmt, bool do_toast, bool for_wraparound)
* which is probably Not Good.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_VACUUM;
+ MyProc->proc_minimal->vacuumFlags |= PROC_IN_VACUUM;
if (for_wraparound)
- MyProc->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
+ MyProc->proc_minimal->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index dd2d6ee..aa54a98 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -702,7 +702,7 @@ ProcessStandbyHSFeedbackMessage(void)
* safe, and if we're moving it backwards, well, the data is at risk
* already since a VACUUM could have just finished calling GetOldestXmin.)
*/
- MyProc->xmin = reply.xmin;
+ MyProc->proc_minimal->xmin = reply.xmin;
}
/* Main loop of walsender process */
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1a48485..f0e1c5a 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -141,6 +141,29 @@ static void DisplayXidCache(void);
#define xc_slow_answer_inc() ((void) 0)
#endif /* XIDCACHE_DEBUG */
+/*
+ * Get the PGPROC_MINIMAL structure corresponding to the given PGPROC. The
+ * trick is to find the minimal structure without reading any PGPROC member,
+ * because we want to avoid touching the PGPROC unless it's absolutely
+ * necessary. This reduces cache misses and faults. Except for dummy procs for
+ * prepared transactions, all other procs have a corresponding entry in the
+ * allProcs_Minimal array of PGPROC_MINIMAL. So we use pointer arithmetic to
+ * get that.
+ *
+ * XXX A branch misprediction can still end up touching the PGPROC
+ * proc_minimal member, and that would cause a cache miss. We have seen this
+ * with the HP-UX compiler. It might be easier to handle with GCC by using a
+ * likely/unlikely hint, knowing that in almost all cases there is no need to
+ * go to the PGPROC member, or by rearranging the following code. For example,
+ * we can return (PGPROC_MINIMAL **) to avoid accessing the PGPROC structure
+ * unless it's required for a prepared transaction.
+ */
+#define PGProcGetMinimal(proc, procglobal) \
+ (((proc) >= (procglobal)->allProcs && \
+ (proc) < (procglobal)->allProcs + (procglobal)->allProcCount) ? \
+ &(procglobal)->allProcs_Minimal[(proc) - (procglobal)->allProcs] : \
+ (proc)->proc_minimal)
+
/* Primitives for KnownAssignedXids array handling for standby */
static void KnownAssignedXidsCompress(bool force);
static void KnownAssignedXidsAdd(TransactionId from_xid, TransactionId to_xid,
@@ -253,6 +276,7 @@ void
ProcArrayAdd(PGPROC *proc)
{
ProcArrayStruct *arrayP = procArray;
+ int index;
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -269,7 +293,28 @@ ProcArrayAdd(PGPROC *proc)
errmsg("sorry, too many clients already")));
}
- arrayP->procs[arrayP->numProcs] = proc;
+ /*
+ * Keep the procs array sorted by (PGPROC *) so that we can utilize
+ * locality of reference much better. This is useful while traversing the
+ * ProcArray because there is an increased likelihood of finding the next
+ * PGPROC structure in the cache.
+ *
+ * Since adding/removing a proc happens much less often than accesses to
+ * the ProcArray itself, the overhead should be marginal.
+ */
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ /*
+ * If we are the first PGPROC or if we have found our right position in
+ * the array, break
+ */
+ if ((arrayP->procs[index] == NULL) || (arrayP->procs[index] > proc))
+ break;
+ }
+
+ memmove(&arrayP->procs[index + 1], &arrayP->procs[index],
+ (arrayP->numProcs - index) * sizeof (PGPROC *));
+ arrayP->procs[index] = proc;
arrayP->numProcs++;
LWLockRelease(ProcArrayLock);
@@ -318,7 +363,9 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
{
if (arrayP->procs[index] == proc)
{
- arrayP->procs[index] = arrayP->procs[arrayP->numProcs - 1];
+ /* Keep the PGPROC array sorted. See notes above */
+ memmove(&arrayP->procs[index], &arrayP->procs[index + 1],
+ (arrayP->numProcs - index - 1) * sizeof (PGPROC *));
arrayP->procs[arrayP->numProcs - 1] = NULL; /* for debugging */
arrayP->numProcs--;
LWLockRelease(ProcArrayLock);
@@ -361,17 +408,17 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- proc->xid = InvalidTransactionId;
+ proc->proc_minimal->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ proc->proc_minimal->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ proc->proc_minimal->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ proc->proc_minimal->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
/* Clear the subtransaction-XID cache too while holding the lock */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ proc->proc_minimal->nxids = 0;
+ proc->proc_minimal->overflowed = false;
/* Also advance global latestCompletedXid while holding the lock */
if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,
@@ -390,10 +437,10 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
Assert(!TransactionIdIsValid(proc->xid));
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ proc->proc_minimal->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ proc->proc_minimal->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ proc->proc_minimal->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
Assert(proc->subxids.nxids == 0);
@@ -419,18 +466,18 @@ ProcArrayClearTransaction(PGPROC *proc)
* duplicate with the gxact that has already been inserted into the
* ProcArray.
*/
- proc->xid = InvalidTransactionId;
+ proc->proc_minimal->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ proc->proc_minimal->xmin = InvalidTransactionId;
proc->recoveryConflictPending = false;
/* redundant, but just in case */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false;
+ proc->proc_minimal->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ proc->proc_minimal->inCommit = false;
/* Clear the subtransaction-XID cache too */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ proc->proc_minimal->nxids = 0;
+ proc->proc_minimal->overflowed = false;
}
/*
@@ -741,6 +788,10 @@ TransactionIdIsInProgress(TransactionId xid)
TransactionId topxid;
int i,
j;
+#ifdef NOT_USED
+ static PGPROC **procs = NULL;
+ static PGPROC_MINIMAL *txns = NULL;
+#endif
/*
* Don't bother checking a transaction older than RecentXmin; it could not
@@ -811,15 +862,19 @@ TransactionIdIsInProgress(TransactionId xid)
/* No shortcuts, gotta grovel through the array */
for (i = 0; i < arrayP->numProcs; i++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[i];
+ volatile PGPROC_MINIMAL *proc_minimal;
TransactionId pxid;
/* Ignore my own proc --- dealt with it above */
if (proc == MyProc)
continue;
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
/* Fetch xid just once - see GetNewTransactionId */
- pxid = proc->xid;
+ pxid = proc_minimal->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -844,7 +899,7 @@ TransactionIdIsInProgress(TransactionId xid)
/*
* Step 2: check the cached child-Xids arrays
*/
- for (j = proc->subxids.nxids - 1; j >= 0; j--)
+ for (j = proc_minimal->nxids - 1; j >= 0; j--)
{
/* Fetch xid just once - see GetNewTransactionId */
TransactionId cxid = proc->subxids.xids[j];
@@ -864,7 +919,7 @@ TransactionIdIsInProgress(TransactionId xid)
* we hold ProcArrayLock. So we can't miss an Xid that we need to
* worry about.)
*/
- if (proc->subxids.overflowed)
+ if (proc_minimal->overflowed)
xids[nxids++] = pxid;
}
@@ -965,10 +1020,15 @@ TransactionIdIsActive(TransactionId xid)
for (i = 0; i < arrayP->numProcs; i++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[i];
+ volatile PGPROC_MINIMAL *proc_minimal;
+ TransactionId pxid;
+
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = proc_minimal->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -1060,9 +1120,13 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
+
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
- if (ignoreVacuum && (proc->vacuumFlags & PROC_IN_VACUUM))
+ if (ignoreVacuum && (proc_minimal->vacuumFlags & PROC_IN_VACUUM))
continue;
if (allDbs ||
@@ -1070,7 +1134,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
proc->databaseId == 0) /* always include WalSender */
{
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId xid = proc->xid;
+ TransactionId xid = proc_minimal->xid;
/* First consider the transaction's own Xid, if any */
if (TransactionIdIsNormal(xid) &&
@@ -1084,7 +1148,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
* have an Xmin but not (yet) an Xid; conversely, if it has an
* Xid, that could determine some not-yet-set Xmin.
*/
- xid = proc->xmin; /* Fetch just once */
+ xid = proc_minimal->xmin; /* Fetch just once */
if (TransactionIdIsNormal(xid) &&
TransactionIdPrecedes(xid, result))
result = xid;
@@ -1271,21 +1335,35 @@ GetSnapshotData(Snapshot snapshot)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
TransactionId xid;
+ /*
+ * All the information needed by GetSnapshotData is stored out of
+ * band as an array of PGPROC_MINIMAL structures. The only exception
+ * is the dummy PGPROC for a prepared transaction.
+ *
+ * We try to avoid accessing any field of the PGPROC because that
+ * may cause a cache miss and the associated overhead. Instead we
+ * access the relevant information directly by doing pointer
+ * arithmetic.
+ */
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (proc_minimal->vacuumFlags & PROC_IN_VACUUM)
continue;
/* Update globalxmin to be the smallest valid xmin */
- xid = proc->xmin; /* fetch just once */
+ xid = proc_minimal->xmin; /* fetch just once */
if (TransactionIdIsNormal(xid) &&
TransactionIdPrecedes(xid, globalxmin))
globalxmin = xid;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = proc_minimal->xid;
/*
* If the transaction has been assigned an xid < xmax we add it to
@@ -1323,11 +1401,11 @@ GetSnapshotData(Snapshot snapshot)
*/
if (!suboverflowed && proc != MyProc)
{
- if (proc->subxids.overflowed)
+ if (proc_minimal->overflowed)
suboverflowed = true;
else
{
- int nxids = proc->subxids.nxids;
+ int nxids = proc_minimal->nxids;
if (nxids > 0)
{
@@ -1372,9 +1450,8 @@ GetSnapshotData(Snapshot snapshot)
suboverflowed = true;
}
- if (!TransactionIdIsValid(MyProc->xmin))
- MyProc->xmin = TransactionXmin = xmin;
-
+ if (!TransactionIdIsValid(MyProc->proc_minimal->xmin))
+ MyProc->proc_minimal->xmin = TransactionXmin = xmin;
LWLockRelease(ProcArrayLock);
/*
@@ -1436,14 +1513,18 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
TransactionId xid;
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (proc_minimal->vacuumFlags & PROC_IN_VACUUM)
continue;
- xid = proc->xid; /* fetch just once */
+ xid = proc_minimal->xid; /* fetch just once */
if (xid != sourcexid)
continue;
@@ -1459,7 +1540,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
/*
* Likewise, let's just make real sure its xmin does cover us.
*/
- xid = proc->xmin; /* fetch just once */
+ xid = proc_minimal->xmin; /* fetch just once */
if (!TransactionIdIsNormal(xid) ||
!TransactionIdPrecedesOrEquals(xid, xmin))
continue;
@@ -1470,7 +1551,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
* GetSnapshotData first, we'll be overwriting a valid xmin here,
* so we don't check that.)
*/
- MyProc->xmin = TransactionXmin = xmin;
+ MyProc->proc_minimal->xmin = TransactionXmin = xmin;
result = true;
break;
@@ -1562,12 +1643,16 @@ GetRunningTransactionData(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
TransactionId xid;
int nxids;
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = proc_minimal->xid;
/*
* We don't need to store transactions that don't have a TransactionId
@@ -1585,7 +1670,7 @@ GetRunningTransactionData(void)
* Save subtransaction XIDs. Other backends can't add or remove
* entries while we're holding XidGenLock.
*/
- nxids = proc->subxids.nxids;
+ nxids = proc_minimal->nxids;
if (nxids > 0)
{
memcpy(&xids[count], (void *) proc->subxids.xids,
@@ -1593,7 +1678,7 @@ GetRunningTransactionData(void)
count += nxids;
subcount += nxids;
- if (proc->subxids.overflowed)
+ if (proc_minimal->overflowed)
suboverflowed = true;
/*
@@ -1653,11 +1738,15 @@ GetOldestActiveTransactionId(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
TransactionId xid;
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = proc_minimal->xid;
if (!TransactionIdIsNormal(xid))
continue;
@@ -1709,12 +1798,17 @@ GetTransactionsInCommit(TransactionId **xids_p)
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
+ TransactionId pxid;
+
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = proc_minimal->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (proc_minimal->inCommit && TransactionIdIsValid(pxid))
xids[nxids++] = pxid;
}
@@ -1744,12 +1838,17 @@ HaveTransactionsInCommit(TransactionId *xids, int nxids)
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
+ TransactionId pxid;
+
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = proc_minimal->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (proc_minimal->inCommit && TransactionIdIsValid(pxid))
{
int i;
@@ -1833,9 +1932,13 @@ BackendXidGetPid(TransactionId xid)
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
- if (proc->xid == xid)
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
+ if (proc_minimal->xid == xid)
{
result = proc->pid;
break;
@@ -1906,13 +2009,13 @@ GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0,
if (proc == MyProc)
continue;
- if (excludeVacuum & proc->vacuumFlags)
+ if (excludeVacuum & proc->proc_minimal->vacuumFlags)
continue;
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xmin just once - might change on us */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = proc->proc_minimal->xmin;
if (excludeXmin0 && !TransactionIdIsValid(pxmin))
continue;
@@ -2006,7 +2109,7 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
proc->databaseId == dbOid)
{
/* Fetch xmin just once - can't change on us, but good coding */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = proc->proc_minimal->xmin;
/*
* We ignore an invalid pxmin because this means that backend has
@@ -2104,7 +2207,9 @@ MinimumActiveBackends(int min)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
+ volatile PROC_HDR *procglobal = ProcGlobal;
volatile PGPROC *proc = arrayP->procs[index];
+ volatile PGPROC_MINIMAL *proc_minimal;
/*
* Since we're not holding a lock, need to check that the pointer is
@@ -2120,12 +2225,14 @@ MinimumActiveBackends(int min)
if (proc == NULL)
continue;
+ proc_minimal = PGProcGetMinimal(proc, procglobal);
+
if (proc == MyProc)
continue; /* do not count myself */
+ if (proc_minimal->xid == InvalidTransactionId)
+ continue; /* do not count if no XID assigned */
if (proc->pid == 0)
continue; /* do not count prepared xacts */
- if (proc->xid == InvalidTransactionId)
- continue; /* do not count if no XID assigned */
if (proc->waitLock != NULL)
continue; /* do not count if blocked on a lock */
count++;
@@ -2291,7 +2398,7 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
else
{
(*nbackends)++;
- if ((proc->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ if ((proc->proc_minimal->vacuumFlags & PROC_IS_AUTOVACUUM) &&
nautovacs < MAXAUTOVACPIDS)
autovac_pids[nautovacs++] = proc->pid;
}
@@ -2321,8 +2428,8 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
#define XidCacheRemove(i) \
do { \
- MyProc->subxids.xids[i] = MyProc->subxids.xids[MyProc->subxids.nxids - 1]; \
- MyProc->subxids.nxids--; \
+ MyProc->subxids.xids[i] = MyProc->subxids.xids[MyProc->proc_minimal->nxids - 1]; \
+ MyProc->proc_minimal->nxids--; \
} while (0)
/*
@@ -2361,7 +2468,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
{
TransactionId anxid = xids[i];
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyProc->proc_minimal->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], anxid))
{
@@ -2377,11 +2484,11 @@ XidCacheRemoveRunningXids(TransactionId xid,
* error during AbortSubTransaction. So instead of Assert, emit a
* debug warning.
*/
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyProc->proc_minimal->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", anxid);
}
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyProc->proc_minimal->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], xid))
{
@@ -2390,7 +2497,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
}
}
/* Ordinarily we should have found it, unless the cache has overflowed */
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyProc->proc_minimal->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", xid);
/* Also advance global latestCompletedXid while holding the lock */
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 7e7f6af..bb1e654 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -541,7 +541,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
* vacuumFlag bit), but we don't do that here to avoid
* grabbing ProcArrayLock.
*/
- if (proc->vacuumFlags & PROC_IS_AUTOVACUUM)
+ if (proc->proc_minimal->vacuumFlags & PROC_IS_AUTOVACUUM)
blocking_autovacuum_proc = proc;
/* This proc hard-blocks checkProc */
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index ed8344f..3e7a557 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3184,7 +3184,7 @@ GetRunningTransactionLocks(int *nlocks)
PGPROC *proc = proclock->tag.myProc;
LOCK *lock = proclock->tag.myLock;
- accessExclusiveLocks[index].xid = proc->xid;
+ accessExclusiveLocks[index].xid = proc->proc_minimal->xid;
accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;
accessExclusiveLocks[index].relOid = lock->tag.locktag_field2;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index eda3a98..0261de7 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -113,6 +113,9 @@ ProcGlobalShmemSize(void)
/* ProcStructLock */
size = add_size(size, sizeof(slock_t));
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC_MINIMAL)));
+ size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC_MINIMAL)));
+
return size;
}
@@ -157,6 +160,7 @@ void
InitProcGlobal(void)
{
PGPROC *procs;
+ PGPROC_MINIMAL *procs_minimal;
int i,
j;
bool found;
@@ -195,6 +199,22 @@ InitProcGlobal(void)
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of shared memory")));
MemSet(procs, 0, TotalProcs * sizeof(PGPROC));
+
+ /*
+ * Also allocate a separate array of PROC_MINIMAL structures. We keep this
+ * out of band of the main PGPROC array to ensure the very heavily accessed
+ * members of the PGPROC structure are stored contiguously in the memory.
+ * This provides significant performance benefits, especially on a
+ * multiprocessor system by improving cache hit ratio.
+ *
+ * Note: We separate the members needed by GetSnapshotData since that's the
+ * most frequently accessed code path. There is one PGPROC_MINIMAL structure
+ * for every PGPROC structure.
+ */
+ procs_minimal = (PGPROC_MINIMAL *) ShmemAlloc(TotalProcs * sizeof(PGPROC_MINIMAL));
+ MemSet(procs_minimal, 0, TotalProcs * sizeof(PGPROC_MINIMAL));
+ ProcGlobal->allProcs_Minimal = procs_minimal;
+
for (i = 0; i < TotalProcs; i++)
{
/* Common initialization for all PGPROCs, regardless of type. */
@@ -203,6 +223,7 @@ InitProcGlobal(void)
PGSemaphoreCreate(&(procs[i].sem));
InitSharedLatch(&(procs[i].procLatch));
procs[i].backendLock = LWLockAssign();
+ procs[i].proc_minimal = &procs_minimal[i];
/*
* Newly created PGPROCs for normal backends or for autovacuum must
@@ -313,18 +334,18 @@ InitProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyProc->proc_minimal->xid = InvalidTransactionId;
+ MyProc->proc_minimal->xmin = InvalidTransactionId;
MyProc->pid = MyProcPid;
/* backendId, databaseId and roleId will be filled in later */
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyProc->proc_minimal->inCommit = false;
+ MyProc->proc_minimal->vacuumFlags = 0;
/* NB -- autovac launcher intentionally does not set IS_AUTOVACUUM */
if (IsAutoVacuumWorkerProcess())
- MyProc->vacuumFlags |= PROC_IS_AUTOVACUUM;
+ MyProc->proc_minimal->vacuumFlags |= PROC_IS_AUTOVACUUM;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -472,13 +493,13 @@ InitAuxiliaryProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyProc->proc_minimal->xid = InvalidTransactionId;
+ MyProc->proc_minimal->xmin = InvalidTransactionId;
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyProc->proc_minimal->inCommit = false;
+ MyProc->proc_minimal->vacuumFlags = 0;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -1053,8 +1074,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
* wraparound.
*/
if ((autovac != NULL) &&
- (autovac->vacuumFlags & PROC_IS_AUTOVACUUM) &&
- !(autovac->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
+ (autovac->proc_minimal->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ !(autovac->proc_minimal->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
{
int pid = autovac->pid;
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 50fb780..bcfe3f3 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -577,7 +577,7 @@ static void
SnapshotResetXmin(void)
{
if (RegisteredSnapshots == 0 && ActiveSnapshot == NULL)
- MyProc->xmin = InvalidTransactionId;
+ MyProc->proc_minimal->xmin = InvalidTransactionId;
}
/*
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index bc746a3..093a56e 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -21,6 +21,7 @@
/* struct PGPROC is declared in proc.h, but must forward-reference it */
typedef struct PGPROC PGPROC;
+typedef struct PGPROC_MINIMAL PGPROC_MINIMAL;
typedef struct PROC_QUEUE
{
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 6e798b1..a136309 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -35,8 +35,6 @@
struct XidCache
{
- bool overflowed;
- int nxids;
TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS];
};
@@ -57,6 +55,8 @@ struct XidCache
*/
#define FP_LOCK_SLOTS_PER_BACKEND 16
+struct PGPROC_MINIMAL;
+
/*
* Each backend has a PGPROC struct in shared memory. There is also a list of
* currently-unused PGPROC structs that will be reallocated to new backends.
@@ -87,14 +87,7 @@ struct PGPROC
* being executed by this proc, if running;
* else InvalidLocalTransactionId */
- TransactionId xid; /* id of top-level transaction currently being
- * executed by this proc, if running and XID
- * is assigned; else InvalidTransactionId */
-
- TransactionId xmin; /* minimal running XID as it was when we were
- * starting our xact, excluding LAZY VACUUM:
- * vacuum must not remove tuples deleted by
- * xid >= xmin ! */
+ PGPROC_MINIMAL *proc_minimal;
int pid; /* Backend's process ID; 0 if prepared xact */
@@ -103,10 +96,6 @@ struct PGPROC
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
- bool inCommit; /* true if within commit critical section */
-
- uint8 vacuumFlags; /* vacuum-related flags, see above */
-
/*
* While in hot standby mode, shows that a conflict signal has been sent
* for the current transaction. Set/cleared while holding ProcArrayLock,
@@ -161,6 +150,32 @@ struct PGPROC
extern PGDLLIMPORT PGPROC *MyProc;
+/*
+ * A minimal part of the PGPROC. We store these members out of the main PGPROC
+ * structure since they are very heavily accessed members and usually in a loop
+ * for all active PGPROCs. Storing them in a separate array ensures that these
+ * members can be very efficiently accessed with minimum cache misses. On a
+ * large multiprocessor system, this can show a significant performance
+ * improvement.
+ */
+struct PGPROC_MINIMAL
+{
+ TransactionId xid; /* id of top-level transaction currently being
+ * executed by this proc, if running and XID
+ * is assigned; else InvalidTransactionId */
+
+ TransactionId xmin; /* minimal running XID as it was when we were
+ * starting our xact, excluding LAZY VACUUM:
+ * vacuum must not remove tuples deleted by
+ * xid >= xmin ! */
+
+ uint8 vacuumFlags; /* vacuum-related flags, see above */
+ bool overflowed;
+ bool inCommit; /* true if within commit critical section */
+
+ int nxids;
+};
+
/*
* There is one ProcGlobal struct for the whole database cluster.
@@ -169,6 +184,8 @@ typedef struct PROC_HDR
{
/* Array of PGPROC structures (not including dummies for prepared txns) */
PGPROC *allProcs;
+ /* Array of PGPROC_MINIMAL structures (not including dummies for prepared txns) */
+ PGPROC_MINIMAL *allProcs_Minimal;
/* Length of allProcs array */
uint32 allProcCount;
/* Head of list of free PGPROC structures */
On Thu, Nov 3, 2011 at 6:12 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
When PGPROC array is allocated, we also allocate another array of
PGPROC_MINIMAL structures of the same size. While accessing the
ProcArray, a simple pointer mathematic can get us the corresponding
PGPROC_MINIMAL structure. The only exception being the dummy PGPROC
for prepared transaction. A first cut version of the patch is
attached. It looks big, but most of the changes are cosmetic because
of added indirection. The patch also contains another change to keep
the ProcArray sorted by (PGPROC *) to preserve locality of references
while traversing the array.
This is very good.
If you look at your PGPROC_MINIMAL, it's mostly transaction-related
stuff, so I would rename it PGXACT or similar. Not sure why you talk
about pointer math, seems easy enough just to have two arrays
protected by one lock, and have each proc use the same offset in both
arrays.
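
For readers following along, the "same offset in both arrays" idea can be sketched roughly as below. This is only an illustration, not code from either patch: ColdProc, HotXact, MAX_PROCS and the helper functions are hypothetical stand-ins, and the 256-byte padding is just an assumed size for the rarely scanned fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROCS 8					/* hypothetical capacity for this sketch */

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Bulky, rarely scanned per-backend state (hypothetical stand-in for PGPROC). */
typedef struct ColdProc
{
	int		pid;
	char	padding[256];			/* assumed size of the other cold fields */
} ColdProc;

/* Compact, frequently scanned state (stand-in for PGPROC_MINIMAL / PGXACT). */
typedef struct HotXact
{
	TransactionId xid;
	TransactionId xmin;
	uint8_t		vacuumFlags;
	bool		inCommit;
} HotXact;

/* Slot i in one array describes the same backend as slot i in the other. */
static ColdProc cold_procs[MAX_PROCS];
static HotXact hot_xacts[MAX_PROCS];

static ColdProc *
cold_for_slot(int slot)
{
	assert(slot >= 0 && slot < MAX_PROCS);
	return &cold_procs[slot];
}

static HotXact *
hot_for_slot(int slot)
{
	assert(slot >= 0 && slot < MAX_PROCS);
	return &hot_xacts[slot];
}

/* Walks only the dense hot array; the cold array is never touched. */
static int
count_running_xids(int nslots)
{
	int		running = 0;

	for (int i = 0; i < nslots; i++)
	{
		if (hot_xacts[i].xid != InvalidTransactionId)
			running++;
	}
	return running;
}
```

With this layout no pointer subtraction is needed: a backend that knows its slot number can reach either structure directly, and a scan over all backends reads only the small entries.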
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Nov 4, 2011 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
If you look at your PGPROC_MINIMAL, it's mostly transaction-related
stuff, so I would rename it PGXACT or similar.
Yeah, that looks good too. Though I am not sure whether all the fields are
related to transaction state, or whether we will need to add more
fields to the structure in the future. Having a general name might help in
that case.
Not sure why you talk
about pointer math, seems easy enough just to have two arrays
protected by one lock, and have each proc use the same offset in both
arrays.
Right now we store PGPROC pointers in the ProcArray and the pointer
math gets us the index to look into the other array. But we can
actually just store indexes in the ProcArray to avoid that. A non-negative
index could mean an offset into the normal PGPROC array, and a negative
index could be used to get the dummy PGPROC for a prepared transaction.
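
A minimal sketch of such a signed-index scheme follows. It is not taken from the patch: the Proc type, the array sizes and the encode/resolve helpers are all hypothetical, and it assumes that 0 counts as a valid slot in the normal array, so dummy slot k is encoded as -(k + 1).

```c
#include <assert.h>

#define N_NORMAL 4					/* hypothetical number of normal slots */
#define N_DUMMY  2					/* hypothetical number of dummy slots */

typedef struct Proc
{
	int		id;
} Proc;

static Proc normal_procs[N_NORMAL];
static Proc dummy_procs[N_DUMMY];

/* A normal slot is stored as-is, so the encoded value is >= 0. */
static int
encode_normal(int slot)
{
	assert(slot >= 0 && slot < N_NORMAL);
	return slot;
}

/* Dummy slot k is stored as -(k + 1), so dummy slot 0 becomes -1. */
static int
encode_dummy(int slot)
{
	assert(slot >= 0 && slot < N_DUMMY);
	return -(slot + 1);
}

/* Map an encoded index back to the structure it refers to. */
static Proc *
resolve(int index)
{
	if (index >= 0)
	{
		assert(index < N_NORMAL);
		return &normal_procs[index];
	}
	assert(-(index + 1) < N_DUMMY);
	return &dummy_procs[-(index + 1)];
}
```

The cost of this encoding is a branch on every lookup; putting the dummy entries at the tail of a single array avoids that branch at the price of allocating them together.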
Thanks,
Pavan
On 04.11.2011 00:43, Simon Riggs wrote:
On Thu, Nov 3, 2011 at 6:12 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
When PGPROC array is allocated, we also allocate another array of
PGPROC_MINIMAL structures of the same size. While accessing the
ProcArray, a simple pointer mathematic can get us the corresponding
PGPROC_MINIMAL structure. The only exception being the dummy PGPROC
for prepared transaction. A first cut version of the patch is
attached. It looks big, but most of the changes are cosmetic because
of added indirection. The patch also contains another change to keep
the ProcArray sorted by (PGPROC *) to preserve locality of references
while traversing the array.
This is very good.
If you look at your PGPROC_MINIMAL, it's mostly transaction-related
stuff, so I would rename it PGXACT or similar. Not sure why you talk
about pointer math, seems easy enough just to have two arrays
protected by one lock, and have each proc use the same offset in both
arrays.
Agreed, that seems cleaner. The PGPROCs for prepared transactions are
currently allocated separately, so they're not in the same
array as all other PGPROCs, but that could be changed. Here's a WIP
patch for that. I kept the PGPROC_MINIMAL nomenclature for now, but I
agree with Simon's suggestion to rename it.
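
As a rough, standalone illustration of what keeping the ProcArray as a sorted list of integer slot numbers involves, here is a simplified sketch. The names slots, nslots, sorted_add and sorted_remove are hypothetical, and the locking and error reporting of the real code are left out.

```c
#include <assert.h>
#include <string.h>

#define MAX_SLOTS 8					/* hypothetical capacity for this sketch */

static int	slots[MAX_SLOTS];		/* kept in ascending order */
static int	nslots = 0;

/* Insert procno at its sorted position, shifting the tail up by one. */
static void
sorted_add(int procno)
{
	int		i;

	assert(nslots < MAX_SLOTS);
	for (i = 0; i < nslots; i++)
	{
		if (slots[i] > procno)
			break;
	}
	memmove(&slots[i + 1], &slots[i], (nslots - i) * sizeof(int));
	slots[i] = procno;
	nslots++;
}

/* Remove procno, shifting the tail down; returns 1 if it was present. */
static int
sorted_remove(int procno)
{
	for (int i = 0; i < nslots; i++)
	{
		if (slots[i] == procno)
		{
			memmove(&slots[i], &slots[i + 1],
					(nslots - i - 1) * sizeof(int));
			nslots--;
			return 1;
		}
	}
	return 0;
}
```

Both operations are O(n) in the number of entries, which is acceptable only on the assumption that backends are added and removed far less often than the array is scanned.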
On 03.11.2011 20:12, Pavan Deolasee wrote:
It's quite possible that the effect of the patch is more evident on the
particular hardware that I am testing. But the approach nevertheless
seems reasonable. It would be very useful if someone else with access to
a large box could test the effect of the patch.
I tested this on an 8-core x64 box, but couldn't see any measurable
difference in pgbench performance. I tried with and without -N and -S,
and --unlogged-tables, but no difference.
While looking at GetSnapshotData(), it occurred to me that when a PGPROC
entry does not have its xid set, i.e. xid == InvalidTransactionId, it's
pointless to check the subxid array for that proc. If a transaction has
no xid, none of its subtransactions can have an xid either. That's a
trivial optimization, see attached. I tested this with "pgbench -S -M
prepared -c 500" on the 8-core box, and it seemed to make a 1-2%
difference (without the other patch). So, almost in the noise, but it
also doesn't cost us anything in terms of readability or otherwise.
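
The shape of that optimization, in a much simplified form, might look like the sketch below. This is not the attached patch and not the real GetSnapshotData(): ProcEntry, MAX_SUBXIDS and collect_xids are hypothetical, and xmin/xmax handling and locking are omitted.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define MAX_SUBXIDS 4				/* hypothetical subxid cache size */

typedef struct ProcEntry
{
	TransactionId xid;				/* top-level xid, or invalid if none */
	int			nsubxids;
	TransactionId subxids[MAX_SUBXIDS];
} ProcEntry;

/*
 * Copy every running xid and cached subxid into out[] and return the count.
 * The caller must size out[] for nprocs * (1 + MAX_SUBXIDS) entries.
 */
static int
collect_xids(const ProcEntry *procs, int nprocs, TransactionId *out)
{
	int		count = 0;

	for (int i = 0; i < nprocs; i++)
	{
		const ProcEntry *p = &procs[i];

		/* No top-level xid means no subxids either, so skip the entry. */
		if (p->xid == InvalidTransactionId)
			continue;

		out[count++] = p->xid;
		for (int j = 0; j < p->nsubxids; j++)
			out[count++] = p->subxids[j];
	}
	return count;
}
```

The early continue means the subxid fields of an idle entry are never read at all, which is where any saving would come from.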
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachment: pgproc-minimal-heikki-4.patch (text/x-diff)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 477982d..b72915b 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -113,7 +113,8 @@ int max_prepared_xacts = 0;
typedef struct GlobalTransactionData
{
- PGPROC proc; /* dummy proc */
+ GlobalTransaction next;
+ int pgprocno; /* dummy proc */
BackendId dummyBackendId; /* similar to backend id for backends */
TimestampTz prepared_at; /* time of preparation */
XLogRecPtr prepare_lsn; /* XLOG offset of prepare record */
@@ -207,7 +208,8 @@ TwoPhaseShmemInit(void)
sizeof(GlobalTransaction) * max_prepared_xacts));
for (i = 0; i < max_prepared_xacts; i++)
{
- gxacts[i].proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxacts[i].pgprocno = PreparedXactProcs[i].pgprocno;
+ gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
/*
@@ -243,6 +245,8 @@ MarkAsPreparing(TransactionId xid, const char *gid,
TimestampTz prepared_at, Oid owner, Oid databaseid)
{
GlobalTransaction gxact;
+ PGPROC *proc;
+ PGPROC_MINIMAL *proc_minimal;
int i;
if (strlen(gid) >= GIDSIZE)
@@ -274,7 +278,7 @@ MarkAsPreparing(TransactionId xid, const char *gid,
TwoPhaseState->numPrepXacts--;
TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
/* and put it back in the freelist */
- gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
/* Back up index count too, so we don't miss scanning one */
i--;
@@ -302,32 +306,36 @@ MarkAsPreparing(TransactionId xid, const char *gid,
errhint("Increase max_prepared_transactions (currently %d).",
max_prepared_xacts)));
gxact = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = (GlobalTransaction) gxact->proc.links.next;
+ TwoPhaseState->freeGXacts = (GlobalTransaction) gxact->next;
- /* Initialize it */
- MemSet(&gxact->proc, 0, sizeof(PGPROC));
- SHMQueueElemInit(&(gxact->proc.links));
- gxact->proc.waitStatus = STATUS_OK;
+ proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
+
+ /* Initialize the PGPROC entry */
+ MemSet(proc, 0, sizeof(PGPROC));
+ proc->pgprocno = gxact->pgprocno;
+ SHMQueueElemInit(&(proc->links));
+ proc->waitStatus = STATUS_OK;
/* We set up the gxact's VXID as InvalidBackendId/XID */
- gxact->proc.lxid = (LocalTransactionId) xid;
- gxact->proc.xid = xid;
- gxact->proc.xmin = InvalidTransactionId;
- gxact->proc.pid = 0;
- gxact->proc.backendId = InvalidBackendId;
- gxact->proc.databaseId = databaseid;
- gxact->proc.roleId = owner;
- gxact->proc.inCommit = false;
- gxact->proc.vacuumFlags = 0;
- gxact->proc.lwWaiting = false;
- gxact->proc.lwExclusive = false;
- gxact->proc.lwWaitLink = NULL;
- gxact->proc.waitLock = NULL;
- gxact->proc.waitProcLock = NULL;
+ proc->lxid = (LocalTransactionId) xid;
+ proc_minimal->xid = xid;
+ proc_minimal->xmin = InvalidTransactionId;
+ proc_minimal->inCommit = false;
+ proc_minimal->vacuumFlags = 0;
+ proc->pid = 0;
+ proc->backendId = InvalidBackendId;
+ proc->databaseId = databaseid;
+ proc->roleId = owner;
+ proc->lwWaiting = false;
+ proc->lwExclusive = false;
+ proc->lwWaitLink = NULL;
+ proc->waitLock = NULL;
+ proc->waitProcLock = NULL;
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
- SHMQueueInit(&(gxact->proc.myProcLocks[i]));
+ SHMQueueInit(&(proc->myProcLocks[i]));
/* subxid data must be filled later by GXactLoadSubxactData */
- gxact->proc.subxids.overflowed = false;
- gxact->proc.subxids.nxids = 0;
+ proc_minimal->overflowed = false;
+ proc_minimal->nxids = 0;
gxact->prepared_at = prepared_at;
/* initialize LSN to 0 (start of WAL) */
@@ -358,17 +366,19 @@ static void
GXactLoadSubxactData(GlobalTransaction gxact, int nsubxacts,
TransactionId *children)
{
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
/* We need no extra lock since the GXACT isn't valid yet */
if (nsubxacts > PGPROC_MAX_CACHED_SUBXIDS)
{
- gxact->proc.subxids.overflowed = true;
+ proc_minimal->overflowed = true;
nsubxacts = PGPROC_MAX_CACHED_SUBXIDS;
}
if (nsubxacts > 0)
{
- memcpy(gxact->proc.subxids.xids, children,
+ memcpy(proc->subxids.xids, children,
nsubxacts * sizeof(TransactionId));
- gxact->proc.subxids.nxids = nsubxacts;
+ proc_minimal->nxids = nsubxacts;
}
}
@@ -389,7 +399,7 @@ MarkAsPrepared(GlobalTransaction gxact)
* Put it into the global ProcArray so TransactionIdIsInProgress considers
* the XID as still running.
*/
- ProcArrayAdd(&gxact->proc);
+ ProcArrayAdd(&ProcGlobal->allProcs[gxact->pgprocno]);
}
/*
@@ -406,6 +416,7 @@ LockGXact(const char *gid, Oid user)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
/* Ignore not-yet-valid GIDs */
if (!gxact->valid)
@@ -436,7 +447,7 @@ LockGXact(const char *gid, Oid user)
* there may be some other issues as well. Hence disallow until
* someone gets motivated to make it work.
*/
- if (MyDatabaseId != gxact->proc.databaseId)
+ if (MyDatabaseId != proc->databaseId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("prepared transaction belongs to another database"),
@@ -483,7 +494,7 @@ RemoveGXact(GlobalTransaction gxact)
TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
/* and put it back in the freelist */
- gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
LWLockRelease(TwoPhaseStateLock);
@@ -518,8 +529,9 @@ TransactionIdIsPrepared(TransactionId xid)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
- if (gxact->valid && gxact->proc.xid == xid)
+ if (gxact->valid && proc_minimal->xid == xid)
{
result = true;
break;
@@ -642,6 +654,8 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
while (status->array != NULL && status->currIdx < status->ngxacts)
{
GlobalTransaction gxact = &status->array[status->currIdx++];
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
Datum values[5];
bool nulls[5];
HeapTuple tuple;
@@ -656,11 +670,11 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
MemSet(values, 0, sizeof(values));
MemSet(nulls, 0, sizeof(nulls));
- values[0] = TransactionIdGetDatum(gxact->proc.xid);
+ values[0] = TransactionIdGetDatum(proc_minimal->xid);
values[1] = CStringGetTextDatum(gxact->gid);
values[2] = TimestampTzGetDatum(gxact->prepared_at);
values[3] = ObjectIdGetDatum(gxact->owner);
- values[4] = ObjectIdGetDatum(gxact->proc.databaseId);
+ values[4] = ObjectIdGetDatum(proc->databaseId);
tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
result = HeapTupleGetDatum(tuple);
@@ -711,10 +725,11 @@ TwoPhaseGetDummyProc(TransactionId xid)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
- if (gxact->proc.xid == xid)
+ if (proc_minimal->xid == xid)
{
- result = &gxact->proc;
+ result = &ProcGlobal->allProcs[gxact->pgprocno];
break;
}
}
@@ -841,7 +856,9 @@ save_state_data(const void *data, uint32 len)
void
StartPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
+ TransactionId xid = proc_minimal->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
RelFileNode *commitrels;
@@ -865,7 +882,7 @@ StartPrepare(GlobalTransaction gxact)
hdr.magic = TWOPHASE_MAGIC;
hdr.total_len = 0; /* EndPrepare will fill this in */
hdr.xid = xid;
- hdr.database = gxact->proc.databaseId;
+ hdr.database = proc->databaseId;
hdr.prepared_at = gxact->prepared_at;
hdr.owner = gxact->owner;
hdr.nsubxacts = xactGetCommittedChildren(&children);
@@ -913,7 +930,8 @@ StartPrepare(GlobalTransaction gxact)
void
EndPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
+ TransactionId xid = proc_minimal->xid;
TwoPhaseFileHeader *hdr;
char path[MAXPGPATH];
XLogRecData *record;
@@ -1021,7 +1039,7 @@ EndPrepare(GlobalTransaction gxact)
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyProcMinimal->inCommit = true;
gxact->prepare_lsn = XLogInsert(RM_XACT_ID, XLOG_XACT_PREPARE,
records.head);
@@ -1069,7 +1087,7 @@ EndPrepare(GlobalTransaction gxact)
* checkpoint starting after this will certainly see the gxact as a
* candidate for fsyncing.
*/
- MyProc->inCommit = false;
+ MyProcMinimal->inCommit = false;
END_CRIT_SECTION();
@@ -1242,6 +1260,8 @@ void
FinishPreparedTransaction(const char *gid, bool isCommit)
{
GlobalTransaction gxact;
+ PGPROC *proc;
+ PGPROC_MINIMAL *proc_minimal;
TransactionId xid;
char *buf;
char *bufptr;
@@ -1260,7 +1280,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
* try to commit the same GID at once.
*/
gxact = LockGXact(gid, GetUserId());
- xid = gxact->proc.xid;
+ proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
+ xid = proc_minimal->xid;
/*
* Read and validate the state file
@@ -1309,7 +1331,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
hdr->nsubxacts, children,
hdr->nabortrels, abortrels);
- ProcArrayRemove(&gxact->proc, latestXid);
+ ProcArrayRemove(proc, latestXid);
/*
* In case we fail while running the callbacks, mark the gxact invalid so
@@ -1540,10 +1562,11 @@ CheckPointTwoPhase(XLogRecPtr redo_horizon)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[gxact->pgprocno];
if (gxact->valid &&
XLByteLE(gxact->prepare_lsn, redo_horizon))
- xids[nxids++] = gxact->proc.xid;
+ xids[nxids++] = proc_minimal->xid;
}
LWLockRelease(TwoPhaseStateLock);
@@ -1972,7 +1995,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
START_CRIT_SECTION();
/* See notes in RecordTransactionCommit */
- MyProc->inCommit = true;
+ MyProcMinimal->inCommit = true;
/* Emit the XLOG commit record */
xlrec.xid = xid;
@@ -2037,7 +2060,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
TransactionIdCommitTree(xid, nchildren, children);
/* Checkpoint can proceed now */
- MyProc->inCommit = false;
+ MyProcMinimal->inCommit = false;
END_CRIT_SECTION();
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 61dcfed..7c986aa 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -54,7 +54,7 @@ GetNewTransactionId(bool isSubXact)
if (IsBootstrapProcessingMode())
{
Assert(!isSubXact);
- MyProc->xid = BootstrapTransactionId;
+ MyProcMinimal->xid = BootstrapTransactionId;
return BootstrapTransactionId;
}
@@ -208,20 +208,21 @@ GetNewTransactionId(bool isSubXact)
* TransactionId and int fetch/store are atomic.
*/
volatile PGPROC *myproc = MyProc;
+ volatile PGPROC_MINIMAL *myprocminimal = MyProcMinimal;
if (!isSubXact)
- myproc->xid = xid;
+ myprocminimal->xid = xid;
else
{
- int nxids = myproc->subxids.nxids;
+ int nxids = myprocminimal->nxids;
if (nxids < PGPROC_MAX_CACHED_SUBXIDS)
{
myproc->subxids.xids[nxids] = xid;
- myproc->subxids.nxids = nxids + 1;
+ myprocminimal->nxids = nxids + 1;
}
else
- myproc->subxids.overflowed = true;
+ myprocminimal->overflowed = true;
}
}
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c151d3b..e0f7419 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -981,7 +981,7 @@ RecordTransactionCommit(void)
* bit fuzzy, but it doesn't matter.
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyProcMinimal->inCommit = true;
SetCurrentTransactionStopTimestamp();
@@ -1155,7 +1155,7 @@ RecordTransactionCommit(void)
*/
if (markXidCommitted)
{
- MyProc->inCommit = false;
+ MyProcMinimal->inCommit = false;
END_CRIT_SECTION();
}
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 32985a4..bde51bb 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -223,7 +223,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* OK, let's do it. First let other backends know I'm in ANALYZE.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_ANALYZE;
+ MyProcMinimal->vacuumFlags |= PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
/*
@@ -250,7 +250,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* because the vacuum flag is cleared by the end-of-xact code.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags &= ~PROC_IN_ANALYZE;
+ MyProcMinimal->vacuumFlags &= ~PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f42504c..3542b3a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -893,9 +893,9 @@ vacuum_rel(Oid relid, VacuumStmt *vacstmt, bool do_toast, bool for_wraparound)
* which is probably Not Good.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_VACUUM;
+ MyProcMinimal->vacuumFlags |= PROC_IN_VACUUM;
if (for_wraparound)
- MyProc->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
+ MyProcMinimal->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index dd2d6ee..dc93b42 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -702,7 +702,7 @@ ProcessStandbyHSFeedbackMessage(void)
* safe, and if we're moving it backwards, well, the data is at risk
* already since a VACUUM could have just finished calling GetOldestXmin.)
*/
- MyProc->xmin = reply.xmin;
+ MyProcMinimal->xmin = reply.xmin;
}
/* Main loop of walsender process */
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 56c0bd8..bb8b832 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -192,7 +192,6 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
XLOGShmemInit();
CLOGShmemInit();
SUBTRANSShmemInit();
- TwoPhaseShmemInit();
MultiXactShmemInit();
InitBufferPool();
@@ -213,6 +212,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
InitProcGlobal();
CreateSharedProcArray();
CreateSharedBackendStatus();
+ TwoPhaseShmemInit();
/*
* Set up shared-inval messaging
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1a48485..a4cf965 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -82,14 +82,17 @@ typedef struct ProcArrayStruct
TransactionId lastOverflowedXid;
/*
- * We declare procs[] as 1 entry because C wants a fixed-size array, but
+ * We declare pgprocnos[] as 1 entry because C wants a fixed-size array, but
* actually it is maxProcs entries long.
*/
- PGPROC *procs[1]; /* VARIABLE LENGTH ARRAY */
+ int pgprocnos[1]; /* VARIABLE LENGTH ARRAY */
} ProcArrayStruct;
static ProcArrayStruct *procArray;
+static PGPROC *allProcs;
+static PGPROC_MINIMAL *allProcs_Minimal;
+
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
@@ -169,8 +172,8 @@ ProcArrayShmemSize(void)
/* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, procs);
- size = add_size(size, mul_size(sizeof(PGPROC *), PROCARRAY_MAXPROCS));
+ size = offsetof(ProcArrayStruct, pgprocnos);
+ size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
/*
* During Hot Standby processing we have a data structure called
@@ -211,8 +214,8 @@ CreateSharedProcArray(void)
/* Create or attach to the ProcArray shared structure */
procArray = (ProcArrayStruct *)
ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, procs),
- mul_size(sizeof(PGPROC *),
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int),
PROCARRAY_MAXPROCS)),
&found);
@@ -231,6 +234,9 @@ CreateSharedProcArray(void)
procArray->lastOverflowedXid = InvalidTransactionId;
}
+ allProcs = ProcGlobal->allProcs;
+ allProcs_Minimal = ProcGlobal->allProcs_Minimal;
+
/* Create or attach to the KnownAssignedXids arrays too, if needed */
if (EnableHotStandby)
{
@@ -253,6 +259,7 @@ void
ProcArrayAdd(PGPROC *proc)
{
ProcArrayStruct *arrayP = procArray;
+ int index;
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -269,7 +276,28 @@ ProcArrayAdd(PGPROC *proc)
errmsg("sorry, too many clients already")));
}
- arrayP->procs[arrayP->numProcs] = proc;
+ /*
+ * Keep the procs array sorted by (PGPROC *) so that we can utilize
+ * locality of references much better. This is useful while traversing the
+ * ProcArray because there is an increased likelihood of finding the next
+ * PGPROC structure in the cache.
+ *
+ * Since the occurrence of adding/removing a proc is much lower than the
+ * access to the ProcArray itself, the overhead should be marginal.
+ */
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ /*
+ * If we are the first PGPROC or if we have found our right position in
+ * the array, break
+ */
+ if ((arrayP->pgprocnos[index] == -1) || (arrayP->pgprocnos[index] > proc->pgprocno))
+ break;
+ }
+
+ memmove(&arrayP->pgprocnos[index + 1], &arrayP->pgprocnos[index],
+ (arrayP->numProcs - index) * sizeof (int));
+ arrayP->pgprocnos[index] = proc->pgprocno;
arrayP->numProcs++;
LWLockRelease(ProcArrayLock);
@@ -316,10 +344,12 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
for (index = 0; index < arrayP->numProcs; index++)
{
- if (arrayP->procs[index] == proc)
+ if (arrayP->pgprocnos[index] == proc->pgprocno)
{
- arrayP->procs[index] = arrayP->procs[arrayP->numProcs - 1];
- arrayP->procs[arrayP->numProcs - 1] = NULL; /* for debugging */
+ /* Keep the PGPROC array sorted. See notes above */
+ memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
+ (arrayP->numProcs - index - 1) * sizeof (int));
+ arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
arrayP->numProcs--;
LWLockRelease(ProcArrayLock);
return;
@@ -349,6 +379,8 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
void
ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
{
+ PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[proc->pgprocno];
+
if (TransactionIdIsValid(latestXid))
{
/*
@@ -361,17 +393,17 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- proc->xid = InvalidTransactionId;
+ proc_minimal->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ proc_minimal->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ proc_minimal->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ proc_minimal->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
/* Clear the subtransaction-XID cache too while holding the lock */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ proc_minimal->nxids = 0;
+ proc_minimal->overflowed = false;
/* Also advance global latestCompletedXid while holding the lock */
if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,
@@ -390,10 +422,10 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
Assert(!TransactionIdIsValid(proc->xid));
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ proc_minimal->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ proc_minimal->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ proc_minimal->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
Assert(proc->subxids.nxids == 0);
@@ -413,24 +445,26 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
void
ProcArrayClearTransaction(PGPROC *proc)
{
+ PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[proc->pgprocno];
+
/*
* We can skip locking ProcArrayLock here, because this action does not
* actually change anyone's view of the set of running XIDs: our entry is
* duplicate with the gxact that has already been inserted into the
* ProcArray.
*/
- proc->xid = InvalidTransactionId;
+ proc_minimal->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ proc_minimal->xmin = InvalidTransactionId;
proc->recoveryConflictPending = false;
/* redundant, but just in case */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false;
+ proc_minimal->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ proc_minimal->inCommit = false;
/* Clear the subtransaction-XID cache too */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ proc_minimal->nxids = 0;
+ proc_minimal->overflowed = false;
}
/*
@@ -811,7 +845,9 @@ TransactionIdIsInProgress(TransactionId xid)
/* No shortcuts, gotta grovel through the array */
for (i = 0; i < arrayP->numProcs; i++)
{
- volatile PGPROC *proc = arrayP->procs[i];
+ int pgprocno = arrayP->pgprocnos[i];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
TransactionId pxid;
/* Ignore my own proc --- dealt with it above */
@@ -819,7 +855,7 @@ TransactionIdIsInProgress(TransactionId xid)
continue;
/* Fetch xid just once - see GetNewTransactionId */
- pxid = proc->xid;
+ pxid = proc_minimal->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -844,7 +880,7 @@ TransactionIdIsInProgress(TransactionId xid)
/*
* Step 2: check the cached child-Xids arrays
*/
- for (j = proc->subxids.nxids - 1; j >= 0; j--)
+ for (j = proc_minimal->nxids - 1; j >= 0; j--)
{
/* Fetch xid just once - see GetNewTransactionId */
TransactionId cxid = proc->subxids.xids[j];
@@ -864,7 +900,7 @@ TransactionIdIsInProgress(TransactionId xid)
* we hold ProcArrayLock. So we can't miss an Xid that we need to
* worry about.)
*/
- if (proc->subxids.overflowed)
+ if (proc_minimal->overflowed)
xids[nxids++] = pxid;
}
@@ -965,10 +1001,13 @@ TransactionIdIsActive(TransactionId xid)
for (i = 0; i < arrayP->numProcs; i++)
{
- volatile PGPROC *proc = arrayP->procs[i];
+ int pgprocno = arrayP->pgprocnos[i];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = proc_minimal->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -1060,9 +1099,11 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
- if (ignoreVacuum && (proc->vacuumFlags & PROC_IN_VACUUM))
+ if (ignoreVacuum && (proc_minimal->vacuumFlags & PROC_IN_VACUUM))
continue;
if (allDbs ||
@@ -1070,7 +1111,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
proc->databaseId == 0) /* always include WalSender */
{
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId xid = proc->xid;
+ TransactionId xid = proc_minimal->xid;
/* First consider the transaction's own Xid, if any */
if (TransactionIdIsNormal(xid) &&
@@ -1084,7 +1125,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
* have an Xmin but not (yet) an Xid; conversely, if it has an
* Xid, that could determine some not-yet-set Xmin.
*/
- xid = proc->xmin; /* Fetch just once */
+ xid = proc_minimal->xmin; /* Fetch just once */
if (TransactionIdIsNormal(xid) &&
TransactionIdPrecedes(xid, result))
result = xid;
@@ -1200,6 +1241,8 @@ GetSnapshotData(Snapshot snapshot)
int count = 0;
int subcount = 0;
bool suboverflowed = false;
+ static TransactionId *xmins = NULL;
+ int numProcs;
Assert(snapshot != NULL);
@@ -1235,6 +1278,15 @@ GetSnapshotData(Snapshot snapshot)
errmsg("out of memory")));
}
+ if (xmins == NULL)
+ {
+ xmins = malloc(procArray->maxProcs * sizeof(TransactionId));
+ if (xmins == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+ }
+
/*
* It is sufficient to get shared lock on ProcArrayLock, even if we are
* going to set MyProc->xmin.
@@ -1261,6 +1313,8 @@ GetSnapshotData(Snapshot snapshot)
if (!snapshot->takenDuringRecovery)
{
+ int *pgprocnos = arrayP->pgprocnos;
+
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
* to gather all active xids, find the lowest xmin, and try to record
@@ -1269,23 +1323,25 @@ GetSnapshotData(Snapshot snapshot)
* prepared transaction xids are held in KnownAssignedXids, so these
* will be seen without needing to loop through procs here.
*/
- for (index = 0; index < arrayP->numProcs; index++)
+ numProcs = arrayP->numProcs;
+ for (index = 0; index < numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = pgprocnos[index];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
TransactionId xid;
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (proc_minimal->vacuumFlags & PROC_IN_VACUUM)
+ {
+ xmins[index] = InvalidTransactionId;
continue;
+ }
/* Update globalxmin to be the smallest valid xmin */
- xid = proc->xmin; /* fetch just once */
- if (TransactionIdIsNormal(xid) &&
- TransactionIdPrecedes(xid, globalxmin))
- globalxmin = xid;
+ xmins[index] = proc_minimal->xmin; /* fetch just once */
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = proc_minimal->xid;
/*
* If the transaction has been assigned an xid < xmax we add it to
@@ -1300,7 +1356,7 @@ GetSnapshotData(Snapshot snapshot)
{
if (TransactionIdFollowsOrEquals(xid, xmax))
continue;
- if (proc != MyProc)
+ if (proc_minimal != MyProcMinimal)
snapshot->xip[count++] = xid;
if (TransactionIdPrecedes(xid, xmin))
xmin = xid;
@@ -1321,16 +1377,17 @@ GetSnapshotData(Snapshot snapshot)
*
* Again, our own XIDs are not included in the snapshot.
*/
- if (!suboverflowed && proc != MyProc)
+ if (!suboverflowed && proc_minimal != MyProcMinimal)
{
- if (proc->subxids.overflowed)
+ if (proc_minimal->overflowed)
suboverflowed = true;
else
{
- int nxids = proc->subxids.nxids;
+ int nxids = proc_minimal->nxids;
if (nxids > 0)
{
+ volatile PGPROC *proc = &allProcs[pgprocno];
memcpy(snapshot->subxip + subcount,
(void *) proc->subxids.xids,
nxids * sizeof(TransactionId));
@@ -1342,6 +1399,7 @@ GetSnapshotData(Snapshot snapshot)
}
else
{
+ numProcs = 0;
/*
* We're in hot standby, so get XIDs from KnownAssignedXids.
*
@@ -1372,9 +1430,8 @@ GetSnapshotData(Snapshot snapshot)
suboverflowed = true;
}
- if (!TransactionIdIsValid(MyProc->xmin))
- MyProc->xmin = TransactionXmin = xmin;
-
+ if (!TransactionIdIsValid(MyProcMinimal->xmin))
+ MyProcMinimal->xmin = TransactionXmin = xmin;
LWLockRelease(ProcArrayLock);
/*
@@ -1382,6 +1439,14 @@ GetSnapshotData(Snapshot snapshot)
* different way of computing it than GetOldestXmin uses, but should give
* the same result.
*/
+ for (index = 0; index < numProcs; index++)
+ {
+ TransactionId xid = xmins[index];
+ if (TransactionIdIsNormal(xid) &&
+ TransactionIdPrecedes(xid, globalxmin))
+ globalxmin = xid;
+ }
+
if (TransactionIdPrecedes(xmin, globalxmin))
globalxmin = xmin;
@@ -1436,14 +1501,16 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
TransactionId xid;
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (proc_minimal->vacuumFlags & PROC_IN_VACUUM)
continue;
- xid = proc->xid; /* fetch just once */
+ xid = proc_minimal->xid; /* fetch just once */
if (xid != sourcexid)
continue;
@@ -1459,7 +1526,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
/*
* Likewise, let's just make real sure its xmin does cover us.
*/
- xid = proc->xmin; /* fetch just once */
+ xid = proc_minimal->xmin; /* fetch just once */
if (!TransactionIdIsNormal(xid) ||
!TransactionIdPrecedesOrEquals(xid, xmin))
continue;
@@ -1470,7 +1537,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
* GetSnapshotData first, we'll be overwriting a valid xmin here,
* so we don't check that.)
*/
- MyProc->xmin = TransactionXmin = xmin;
+ MyProcMinimal->xmin = TransactionXmin = xmin;
result = true;
break;
@@ -1562,12 +1629,14 @@ GetRunningTransactionData(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
TransactionId xid;
int nxids;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = proc_minimal->xid;
/*
* We don't need to store transactions that don't have a TransactionId
@@ -1585,7 +1654,7 @@ GetRunningTransactionData(void)
* Save subtransaction XIDs. Other backends can't add or remove
* entries while we're holding XidGenLock.
*/
- nxids = proc->subxids.nxids;
+ nxids = proc_minimal->nxids;
if (nxids > 0)
{
memcpy(&xids[count], (void *) proc->subxids.xids,
@@ -1593,7 +1662,7 @@ GetRunningTransactionData(void)
count += nxids;
subcount += nxids;
- if (proc->subxids.overflowed)
+ if (proc_minimal->overflowed)
suboverflowed = true;
/*
@@ -1653,11 +1722,12 @@ GetOldestActiveTransactionId(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
TransactionId xid;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = proc_minimal->xid;
if (!TransactionIdIsNormal(xid))
continue;
@@ -1709,12 +1779,14 @@ GetTransactionsInCommit(TransactionId **xids_p)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = proc_minimal->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (proc_minimal->inCommit && TransactionIdIsValid(pxid))
xids[nxids++] = pxid;
}
@@ -1744,12 +1816,14 @@ HaveTransactionsInCommit(TransactionId *xids, int nxids)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = proc_minimal->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (proc_minimal->inCommit && TransactionIdIsValid(pxid))
{
int i;
@@ -1792,7 +1866,7 @@ BackendPidGetProc(int pid)
for (index = 0; index < arrayP->numProcs; index++)
{
- PGPROC *proc = arrayP->procs[index];
+ PGPROC *proc = &allProcs[arrayP->pgprocnos[index]];
if (proc->pid == pid)
{
@@ -1833,9 +1907,11 @@ BackendXidGetPid(TransactionId xid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
- if (proc->xid == xid)
+ if (proc_minimal->xid == xid)
{
result = proc->pid;
break;
@@ -1901,18 +1977,20 @@ GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0,
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
if (proc == MyProc)
continue;
- if (excludeVacuum & proc->vacuumFlags)
+ if (excludeVacuum & proc_minimal->vacuumFlags)
continue;
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xmin just once - might change on us */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = proc_minimal->xmin;
if (excludeXmin0 && !TransactionIdIsValid(pxmin))
continue;
@@ -1996,7 +2074,9 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
/* Exclude prepared transactions */
if (proc->pid == 0)
@@ -2006,7 +2086,7 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
proc->databaseId == dbOid)
{
/* Fetch xmin just once - can't change on us, but good coding */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = proc_minimal->xmin;
/*
* We ignore an invalid pxmin because this means that backend has
@@ -2050,8 +2130,9 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
for (index = 0; index < arrayP->numProcs; index++)
{
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
VirtualTransactionId procvxid;
- PGPROC *proc = arrayP->procs[index];
GET_VXID_FROM_PGPROC(procvxid, *proc);
@@ -2104,7 +2185,9 @@ MinimumActiveBackends(int min)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
/*
* Since we're not holding a lock, need to check that the pointer is
@@ -2122,10 +2205,10 @@ MinimumActiveBackends(int min)
if (proc == MyProc)
continue; /* do not count myself */
+ if (proc_minimal->xid == InvalidTransactionId)
+ continue; /* do not count if no XID assigned */
if (proc->pid == 0)
continue; /* do not count prepared xacts */
- if (proc->xid == InvalidTransactionId)
- continue; /* do not count if no XID assigned */
if (proc->waitLock != NULL)
continue; /* do not count if blocked on a lock */
count++;
@@ -2150,7 +2233,8 @@ CountDBBackends(Oid databaseid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (proc->pid == 0)
continue; /* do not count prepared xacts */
@@ -2179,7 +2263,8 @@ CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conflictPending)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (databaseid == InvalidOid || proc->databaseId == databaseid)
{
@@ -2217,7 +2302,8 @@ CountUserBackends(Oid roleid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (proc->pid == 0)
continue; /* do not count prepared xacts */
@@ -2277,7 +2363,9 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGPROC_MINIMAL *proc_minimal = &allProcs_Minimal[pgprocno];
if (proc->databaseId != databaseId)
continue;
@@ -2291,7 +2379,7 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
else
{
(*nbackends)++;
- if ((proc->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ if ((proc_minimal->vacuumFlags & PROC_IS_AUTOVACUUM) &&
nautovacs < MAXAUTOVACPIDS)
autovac_pids[nautovacs++] = proc->pid;
}
@@ -2321,8 +2409,8 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
#define XidCacheRemove(i) \
do { \
- MyProc->subxids.xids[i] = MyProc->subxids.xids[MyProc->subxids.nxids - 1]; \
- MyProc->subxids.nxids--; \
+ MyProc->subxids.xids[i] = MyProc->subxids.xids[MyProcMinimal->nxids - 1]; \
+ MyProcMinimal->nxids--; \
} while (0)
/*
@@ -2361,7 +2449,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
{
TransactionId anxid = xids[i];
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyProcMinimal->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], anxid))
{
@@ -2377,11 +2465,11 @@ XidCacheRemoveRunningXids(TransactionId xid,
* error during AbortSubTransaction. So instead of Assert, emit a
* debug warning.
*/
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyProcMinimal->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", anxid);
}
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyProcMinimal->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], xid))
{
@@ -2390,7 +2478,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
}
}
/* Ordinarily we should have found it, unless the cache has overflowed */
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyProcMinimal->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", xid);
/* Also advance global latestCompletedXid while holding the lock */
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 7e7f6af..4fd7bd7 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -450,6 +450,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
int *nSoftEdges) /* output argument */
{
PGPROC *proc;
+ PGPROC_MINIMAL *proc_minimal;
LOCK *lock;
PROCLOCK *proclock;
SHM_QUEUE *procLocks;
@@ -516,6 +517,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
while (proclock)
{
proc = proclock->tag.myProc;
+ proc_minimal = &ProcGlobal->allProcs_Minimal[proc->pgprocno];
/* A proc never blocks itself */
if (proc != checkProc)
@@ -541,7 +543,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
* vacuumFlag bit), but we don't do that here to avoid
* grabbing ProcArrayLock.
*/
- if (proc->vacuumFlags & PROC_IS_AUTOVACUUM)
+ if (proc_minimal->vacuumFlags & PROC_IS_AUTOVACUUM)
blocking_autovacuum_proc = proc;
/* This proc hard-blocks checkProc */
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index ed8344f..af45f9a 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3182,9 +3182,10 @@ GetRunningTransactionLocks(int *nlocks)
proclock->tag.myLock->tag.locktag_type == LOCKTAG_RELATION)
{
PGPROC *proc = proclock->tag.myProc;
+ PGPROC_MINIMAL *proc_minimal = &ProcGlobal->allProcs_Minimal[proc->pgprocno];
LOCK *lock = proclock->tag.myLock;
- accessExclusiveLocks[index].xid = proc->xid;
+ accessExclusiveLocks[index].xid = proc_minimal->xid;
accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;
accessExclusiveLocks[index].relOid = lock->tag.locktag_field2;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index eda3a98..289569a 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -36,6 +36,7 @@
#include <sys/time.h>
#include "access/transam.h"
+#include "access/twophase.h"
#include "access/xact.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -57,6 +58,7 @@ bool log_lock_waits = false;
/* Pointer to this process's PGPROC struct, if any */
PGPROC *MyProc = NULL;
+PGPROC_MINIMAL *MyProcMinimal = NULL;
/*
* This spinlock protects the freelist of recycled PGPROC structures.
@@ -70,6 +72,7 @@ NON_EXEC_STATIC slock_t *ProcStructLock = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
+PGPROC *PreparedXactProcs = NULL;
/* If we are waiting for a lock, this points to the associated LOCALLOCK */
static LOCALLOCK *lockAwaited = NULL;
@@ -106,13 +109,19 @@ ProcGlobalShmemSize(void)
/* ProcGlobal */
size = add_size(size, sizeof(PROC_HDR));
- /* AuxiliaryProcs */
- size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
/* MyProcs, including autovacuum workers and launcher */
size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
+ /* AuxiliaryProcs */
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
+ /* Prepared xacts */
+ size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGPROC)));
/* ProcStructLock */
size = add_size(size, sizeof(slock_t));
+ size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC_MINIMAL)));
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC_MINIMAL)));
+ size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGPROC_MINIMAL)));
+
return size;
}
@@ -157,10 +166,11 @@ void
InitProcGlobal(void)
{
PGPROC *procs;
+ PGPROC_MINIMAL *procs_minimal;
int i,
j;
bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS;
+ uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Create the ProcGlobal shared structure */
ProcGlobal = (PROC_HDR *)
@@ -195,14 +205,38 @@ InitProcGlobal(void)
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of shared memory")));
MemSet(procs, 0, TotalProcs * sizeof(PGPROC));
+
+ /*
+ * Also allocate a separate array of PGPROC_MINIMAL structures. We keep this
+ * out of band of the main PGPROC array to ensure the very heavily accessed
+ * members of the PGPROC structure are stored contiguously in the memory.
+ * This provides significant performance benefits, especially on a
+ * multiprocessor system by improving cache hit ratio.
+ *
+ * Note: We separate the members needed by GetSnapshotData since that's the
+ * most frequently accessed code path. There is one PGPROC_MINIMAL structure
+ * for every PGPROC structure.
+ */
+ procs_minimal = (PGPROC_MINIMAL *) ShmemAlloc(TotalProcs * sizeof(PGPROC_MINIMAL));
+ MemSet(procs_minimal, 0, TotalProcs * sizeof(PGPROC_MINIMAL));
+ ProcGlobal->allProcs_Minimal = procs_minimal;
+
for (i = 0; i < TotalProcs; i++)
{
/* Common initialization for all PGPROCs, regardless of type. */
- /* Set up per-PGPROC semaphore, latch, and backendLock */
- PGSemaphoreCreate(&(procs[i].sem));
- InitSharedLatch(&(procs[i].procLatch));
- procs[i].backendLock = LWLockAssign();
+ /*
+ * Set up per-PGPROC semaphore, latch, and backendLock. Prepared
+ * xact dummy PGPROCs don't need these though - they're never
+ * associated with a real process
+ */
+ if (i < MaxBackends + NUM_AUXILIARY_PROCS)
+ {
+ PGSemaphoreCreate(&(procs[i].sem));
+ InitSharedLatch(&(procs[i].procLatch));
+ procs[i].backendLock = LWLockAssign();
+ }
+ procs[i].pgprocno = i;
/*
* Newly created PGPROCs for normal backends or for autovacuum must
@@ -234,6 +268,7 @@ InitProcGlobal(void)
* auxiliary proceses.
*/
AuxiliaryProcs = &procs[MaxBackends];
+ PreparedXactProcs = &procs[MaxBackends + NUM_AUXILIARY_PROCS];
/* Create ProcStructLock spinlock, too */
ProcStructLock = (slock_t *) ShmemAlloc(sizeof(slock_t));
@@ -296,6 +331,7 @@ InitProcess(void)
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
errmsg("sorry, too many clients already")));
}
+ MyProcMinimal = &ProcGlobal->allProcs_Minimal[MyProc->pgprocno];
/*
* Now that we have a PGPROC, mark ourselves as an active postmaster
@@ -313,18 +349,18 @@ InitProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyProcMinimal->xid = InvalidTransactionId;
+ MyProcMinimal->xmin = InvalidTransactionId;
MyProc->pid = MyProcPid;
/* backendId, databaseId and roleId will be filled in later */
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyProcMinimal->inCommit = false;
+ MyProcMinimal->vacuumFlags = 0;
/* NB -- autovac launcher intentionally does not set IS_AUTOVACUUM */
if (IsAutoVacuumWorkerProcess())
- MyProc->vacuumFlags |= PROC_IS_AUTOVACUUM;
+ MyProcMinimal->vacuumFlags |= PROC_IS_AUTOVACUUM;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -462,6 +498,7 @@ InitAuxiliaryProcess(void)
((volatile PGPROC *) auxproc)->pid = MyProcPid;
MyProc = auxproc;
+ MyProcMinimal = &ProcGlobal->allProcs_Minimal[auxproc->pgprocno];
SpinLockRelease(ProcStructLock);
@@ -472,13 +509,13 @@ InitAuxiliaryProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyProcMinimal->xid = InvalidTransactionId;
+ MyProcMinimal->xmin = InvalidTransactionId;
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyProcMinimal->inCommit = false;
+ MyProcMinimal->vacuumFlags = 0;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -1045,6 +1082,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
if (deadlock_state == DS_BLOCKED_BY_AUTOVACUUM && allow_autovacuum_cancel)
{
PGPROC *autovac = GetBlockingAutoVacuumPgproc();
+ PGPROC_MINIMAL *autovac_minimal = &ProcGlobal->allProcs_Minimal[autovac->pgprocno];
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -1053,8 +1091,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
* wraparound.
*/
if ((autovac != NULL) &&
- (autovac->vacuumFlags & PROC_IS_AUTOVACUUM) &&
- !(autovac->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
+ (autovac_minimal->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ !(autovac_minimal->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
{
int pid = autovac->pid;
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 50fb780..1f4f5b4 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -577,7 +577,7 @@ static void
SnapshotResetXmin(void)
{
if (RegisteredSnapshots == 0 && ActiveSnapshot == NULL)
- MyProc->xmin = InvalidTransactionId;
+ MyProcMinimal->xmin = InvalidTransactionId;
}
/*
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 6e798b1..40651db 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -35,8 +35,6 @@
struct XidCache
{
- bool overflowed;
- int nxids;
TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS];
};
@@ -86,27 +84,14 @@ struct PGPROC
LocalTransactionId lxid; /* local id of top-level transaction currently
* being executed by this proc, if running;
* else InvalidLocalTransactionId */
-
- TransactionId xid; /* id of top-level transaction currently being
- * executed by this proc, if running and XID
- * is assigned; else InvalidTransactionId */
-
- TransactionId xmin; /* minimal running XID as it was when we were
- * starting our xact, excluding LAZY VACUUM:
- * vacuum must not remove tuples deleted by
- * xid >= xmin ! */
-
int pid; /* Backend's process ID; 0 if prepared xact */
+ int pgprocno;
/* These fields are zero while a backend is still starting up: */
BackendId backendId; /* This backend's backend ID (if assigned) */
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
- bool inCommit; /* true if within commit critical section */
-
- uint8 vacuumFlags; /* vacuum-related flags, see above */
-
/*
* While in hot standby mode, shows that a conflict signal has been sent
* for the current transaction. Set/cleared while holding ProcArrayLock,
@@ -160,7 +145,35 @@ struct PGPROC
extern PGDLLIMPORT PGPROC *MyProc;
+extern PGDLLIMPORT struct PGPROC_MINIMAL *MyProcMinimal;
+
+/*
+ * A minimal part of the PGPROC. We store these members out of the main PGPROC
+ * structure since they are very heavily accessed members and usually in a loop
+ * for all active PGPROCs. Storing them in a separate array ensures that these
+ * members can be very efficiently accessed with minimum cache misses. On a
+ * large multiprocessor system, this can show a significant performance
+ * improvement.
+ */
+struct PGPROC_MINIMAL
+{
+ TransactionId xid; /* id of top-level transaction currently being
+ * executed by this proc, if running and XID
+ * is assigned; else InvalidTransactionId */
+ TransactionId xmin; /* minimal running XID as it was when we were
+ * starting our xact, excluding LAZY VACUUM:
+ * vacuum must not remove tuples deleted by
+ * xid >= xmin ! */
+
+ uint8 vacuumFlags; /* vacuum-related flags, see above */
+ bool overflowed;
+ bool inCommit; /* true if within commit critical section */
+
+ uint8 nxids;
+};
+
+typedef struct PGPROC_MINIMAL PGPROC_MINIMAL;
/*
* There is one ProcGlobal struct for the whole database cluster.
@@ -169,6 +182,8 @@ typedef struct PROC_HDR
{
/* Array of PGPROC structures (not including dummies for prepared txns) */
PGPROC *allProcs;
+ /* Array of PGPROC_MINIMAL structures (not including dummies for prepared txns) */
+ PGPROC_MINIMAL *allProcs_Minimal;
/* Length of allProcs array */
uint32 allProcCount;
/* Head of list of free PGPROC structures */
@@ -186,6 +201,8 @@ typedef struct PROC_HDR
extern PROC_HDR *ProcGlobal;
+extern PGPROC *PreparedXactProcs;
+
/*
* We set aside some extra PGPROC structures for auxiliary processes,
* ie things that aren't full-fledged backends but need shmem access.
Attachment: subxids-must-have-parent-1.patch (text/x-diff)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1a48485..4bd783f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1304,8 +1304,8 @@ GetSnapshotData(Snapshot snapshot)
snapshot->xip[count++] = xid;
if (TransactionIdPrecedes(xid, xmin))
xmin = xid;
- }
+ /* XXX indent */
/*
* Save subtransaction XIDs if possible (if we've already
* overflowed, there's no point). Note that the subxact XIDs must
@@ -1338,6 +1338,7 @@ GetSnapshotData(Snapshot snapshot)
}
}
}
+ }
}
}
else
On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> While looking at GetSnapshotData(), it occurred to me that when a PGPROC
> entry does not have its xid set, ie. xid == InvalidTransactionId, it's
> pointless to check the subxid array for that proc. If a transaction has no
> xid, none of its subtransactions can have an xid either. That's a trivial
> optimization, see attached. I tested this with "pgbench -S -M prepared -c
> 500" on the 8-core box, and it seemed to make a 1-2% difference (without the
> other patch). So, almost in the noise, but it also doesn't cost us anything
> in terms of readability or otherwise.
Oh, that's a good idea. +1 for doing that now, and then we can keep
working on the rest of it.
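
For readers following along, the optimization quoted above can be sketched roughly as below. This is only an illustration under simplified assumptions: `DemoProc`, `MAX_CACHED_SUBXIDS` and `collect_subxids` are hypothetical names, not the real PGPROC layout or the code in the attached patch, and the real loop in GetSnapshotData() does more (for example the xmax check).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define MAX_CACHED_SUBXIDS 4	/* illustrative size, not the real constant */

/* Simplified stand-in for one proc entry (assumption, not the real PGPROC). */
typedef struct DemoProc
{
	TransactionId xid;			/* top-level XID, or InvalidTransactionId */
	int			nxids;			/* number of cached subtransaction XIDs */
	TransactionId subxids[MAX_CACHED_SUBXIDS];
} DemoProc;

/*
 * Copy cached subtransaction XIDs of all procs into out[].  A proc without a
 * top-level XID is skipped before its subxid cache is looked at, because none
 * of its subtransactions can have an XID either.  Returns the number copied.
 */
int
collect_subxids(const DemoProc *procs, int nprocs, TransactionId *out)
{
	int			count = 0;

	for (int i = 0; i < nprocs; i++)
	{
		const DemoProc *p = &procs[i];

		if (p->xid == InvalidTransactionId)
			continue;			/* no xid, so nothing to find in subxids */

		assert(p->nxids >= 0 && p->nxids <= MAX_CACHED_SUBXIDS);
		if (p->nxids > 0)
		{
			memcpy(out + count, p->subxids,
				   p->nxids * sizeof(TransactionId));
			count += p->nxids;
		}
	}
	return count;
}
```

The saving is that an idle proc costs one comparison on its xid and never touches the (colder) subxid cache.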
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Agreed, that seems more clean. The PGPROCs for prepared transactions are
> currently allocated separately, so they're not currently in the same array
> as all other PGPROCs, but that could be changed. Here's a WIP patch for
> that. I kept the PGPROC_MINIMAL nomenclature for now, but I agree with
> Simon's suggestion to rename it.
All right, I did that in the attached version, using Simon's suggested
name. I also fixed up various comments (especially in
InitProcGlobal), fixed the --enable-cassert build (which was busted),
and added code to save/restore PreparedXactProcs in EXEC_BACKEND mode
(which I assume is necessary, though the regression tests failed to
fail without it).
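
As a rough picture of the layout being discussed (one array holding backends, then auxiliary processes, then prepared-transaction dummies, with a parallel compact array indexed by the same pgprocno), here is a minimal sketch. All `Demo*` and `demo_*` names are hypothetical, and plain calloc() stands in for shared-memory allocation; it is not the code from the attached patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Large, rarely scanned per-proc state (stand-in for PGPROC). */
typedef struct DemoProc
{
	int			pgprocno;		/* index of this entry in both arrays */
	int			pid;
	char		cold[256];		/* placeholder for the bulky members */
} DemoProc;

/* Small, frequently scanned per-proc state (stand-in for the compact struct). */
typedef struct DemoXact
{
	TransactionId xid;
	TransactionId xmin;
} DemoXact;

typedef struct DemoProcGlobal
{
	DemoProc   *allProcs;		/* backends | auxiliary | prepared xacts */
	DemoXact   *allXacts;		/* same order, same length */
	int			nbackends;
	int			naux;
	int			nprepared;
} DemoProcGlobal;

/* Allocate both arrays and number the entries; returns 0 on success. */
int
demo_init(DemoProcGlobal *g, int nbackends, int naux, int nprepared)
{
	int			total = nbackends + naux + nprepared;

	assert(total > 0);
	g->allProcs = calloc((size_t) total, sizeof(DemoProc));
	g->allXacts = calloc((size_t) total, sizeof(DemoXact));
	if (g->allProcs == NULL || g->allXacts == NULL)
	{
		free(g->allProcs);
		free(g->allXacts);
		return -1;
	}
	g->nbackends = nbackends;
	g->naux = naux;
	g->nprepared = nprepared;
	for (int i = 0; i < total; i++)
		g->allProcs[i].pgprocno = i;
	return 0;
}

/* The i'th prepared-xact dummy lives after the backends and aux procs. */
DemoProc *
demo_prepared_proc(DemoProcGlobal *g, int i)
{
	assert(i >= 0 && i < g->nprepared);
	return &g->allProcs[g->nbackends + g->naux + i];
}

/* Find the compact entry that belongs to a given proc. */
DemoXact *
demo_xact_for(DemoProcGlobal *g, const DemoProc *proc)
{
	return &g->allXacts[proc->pgprocno];
}

void
demo_free(DemoProcGlobal *g)
{
	free(g->allProcs);
	free(g->allXacts);
}
```

With everything in one array, a prepared-transaction dummy needs no special case: its compact entry is found through pgprocno like any other proc's.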
I'm wondering about the changes to how globalxmin is computed in
GetSnapshotData(). That seems like it could hurt performance in the
single-client case, and especially in the case where there is one
active connection and lots of idle connections, and I'm wondering how
thoroughly we've tested that particular bit apart from these other
changes.
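
To make the question concrete, the two ways of computing globalxmin differ roughly as in the sketch below. It is a simplified illustration, not the patch itself: the function names, the `start` parameter and the `scratch` buffer are assumptions, and the XID comparison is a reduced version of the usual modulo-2^32 test. It says nothing about which variant is faster; it only shows that the deferred form adds one store per proc inside the locked loop plus a second pass afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define FIRST_NORMAL_XID ((TransactionId) 3)

static bool
xid_is_normal(TransactionId x)
{
	return x >= FIRST_NORMAL_XID;
}

/* Simplified wraparound-aware comparison for normal XIDs. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) < 0;
}

/* One pass: fold each proc's xmin into the minimum while scanning. */
TransactionId
globalxmin_inline(const TransactionId *proc_xmins, int n, TransactionId start)
{
	TransactionId result = start;

	for (int i = 0; i < n; i++)
	{
		TransactionId x = proc_xmins[i];	/* fetch just once */

		if (xid_is_normal(x) && xid_precedes(x, result))
			result = x;
	}
	return result;
}

/*
 * Two passes: the first loop only copies each xmin into scratch[] (this is
 * the part that would run while the lock is held); the minimum is computed
 * in a second loop, which would run after the lock has been released.
 */
TransactionId
globalxmin_deferred(const TransactionId *proc_xmins, int n,
					TransactionId start, TransactionId *scratch)
{
	TransactionId result = start;

	for (int i = 0; i < n; i++)
		scratch[i] = proc_xmins[i];

	/* lock would be released here */

	for (int i = 0; i < n; i++)
	{
		TransactionId x = scratch[i];

		if (xid_is_normal(x) && xid_precedes(x, result))
			result = x;
	}
	return result;
}
```

Both functions return the same value; the trade-off is less work under the lock against an extra pass over n entries, which is the cost that would show up with one active connection and many idle ones.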
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
pgxact.patch (application/octet-stream)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 477982d..d2fecb1 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -113,7 +113,8 @@ int max_prepared_xacts = 0;
typedef struct GlobalTransactionData
{
- PGPROC proc; /* dummy proc */
+ GlobalTransaction next;
+ int pgprocno; /* dummy proc */
BackendId dummyBackendId; /* similar to backend id for backends */
TimestampTz prepared_at; /* time of preparation */
XLogRecPtr prepare_lsn; /* XLOG offset of prepare record */
@@ -207,7 +208,8 @@ TwoPhaseShmemInit(void)
sizeof(GlobalTransaction) * max_prepared_xacts));
for (i = 0; i < max_prepared_xacts; i++)
{
- gxacts[i].proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxacts[i].pgprocno = PreparedXactProcs[i].pgprocno;
+ gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
/*
@@ -243,6 +245,8 @@ MarkAsPreparing(TransactionId xid, const char *gid,
TimestampTz prepared_at, Oid owner, Oid databaseid)
{
GlobalTransaction gxact;
+ PGPROC *proc;
+ PGXACT *pgxact;
int i;
if (strlen(gid) >= GIDSIZE)
@@ -274,7 +278,7 @@ MarkAsPreparing(TransactionId xid, const char *gid,
TwoPhaseState->numPrepXacts--;
TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
/* and put it back in the freelist */
- gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
/* Back up index count too, so we don't miss scanning one */
i--;
@@ -302,32 +306,36 @@ MarkAsPreparing(TransactionId xid, const char *gid,
errhint("Increase max_prepared_transactions (currently %d).",
max_prepared_xacts)));
gxact = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = (GlobalTransaction) gxact->proc.links.next;
+ TwoPhaseState->freeGXacts = (GlobalTransaction) gxact->next;
- /* Initialize it */
- MemSet(&gxact->proc, 0, sizeof(PGPROC));
- SHMQueueElemInit(&(gxact->proc.links));
- gxact->proc.waitStatus = STATUS_OK;
+ proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+
+ /* Initialize the PGPROC entry */
+ MemSet(proc, 0, sizeof(PGPROC));
+ proc->pgprocno = gxact->pgprocno;
+ SHMQueueElemInit(&(proc->links));
+ proc->waitStatus = STATUS_OK;
/* We set up the gxact's VXID as InvalidBackendId/XID */
- gxact->proc.lxid = (LocalTransactionId) xid;
- gxact->proc.xid = xid;
- gxact->proc.xmin = InvalidTransactionId;
- gxact->proc.pid = 0;
- gxact->proc.backendId = InvalidBackendId;
- gxact->proc.databaseId = databaseid;
- gxact->proc.roleId = owner;
- gxact->proc.inCommit = false;
- gxact->proc.vacuumFlags = 0;
- gxact->proc.lwWaiting = false;
- gxact->proc.lwExclusive = false;
- gxact->proc.lwWaitLink = NULL;
- gxact->proc.waitLock = NULL;
- gxact->proc.waitProcLock = NULL;
+ proc->lxid = (LocalTransactionId) xid;
+ pgxact->xid = xid;
+ pgxact->xmin = InvalidTransactionId;
+ pgxact->inCommit = false;
+ pgxact->vacuumFlags = 0;
+ proc->pid = 0;
+ proc->backendId = InvalidBackendId;
+ proc->databaseId = databaseid;
+ proc->roleId = owner;
+ proc->lwWaiting = false;
+ proc->lwExclusive = false;
+ proc->lwWaitLink = NULL;
+ proc->waitLock = NULL;
+ proc->waitProcLock = NULL;
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
- SHMQueueInit(&(gxact->proc.myProcLocks[i]));
+ SHMQueueInit(&(proc->myProcLocks[i]));
/* subxid data must be filled later by GXactLoadSubxactData */
- gxact->proc.subxids.overflowed = false;
- gxact->proc.subxids.nxids = 0;
+ pgxact->overflowed = false;
+ pgxact->nxids = 0;
gxact->prepared_at = prepared_at;
/* initialize LSN to 0 (start of WAL) */
@@ -358,17 +366,19 @@ static void
GXactLoadSubxactData(GlobalTransaction gxact, int nsubxacts,
TransactionId *children)
{
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
/* We need no extra lock since the GXACT isn't valid yet */
if (nsubxacts > PGPROC_MAX_CACHED_SUBXIDS)
{
- gxact->proc.subxids.overflowed = true;
+ pgxact->overflowed = true;
nsubxacts = PGPROC_MAX_CACHED_SUBXIDS;
}
if (nsubxacts > 0)
{
- memcpy(gxact->proc.subxids.xids, children,
+ memcpy(proc->subxids.xids, children,
nsubxacts * sizeof(TransactionId));
- gxact->proc.subxids.nxids = nsubxacts;
+ pgxact->nxids = nsubxacts;
}
}
@@ -389,7 +399,7 @@ MarkAsPrepared(GlobalTransaction gxact)
* Put it into the global ProcArray so TransactionIdIsInProgress considers
* the XID as still running.
*/
- ProcArrayAdd(&gxact->proc);
+ ProcArrayAdd(&ProcGlobal->allProcs[gxact->pgprocno]);
}
/*
@@ -406,6 +416,7 @@ LockGXact(const char *gid, Oid user)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
/* Ignore not-yet-valid GIDs */
if (!gxact->valid)
@@ -436,7 +447,7 @@ LockGXact(const char *gid, Oid user)
* there may be some other issues as well. Hence disallow until
* someone gets motivated to make it work.
*/
- if (MyDatabaseId != gxact->proc.databaseId)
+ if (MyDatabaseId != proc->databaseId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("prepared transaction belongs to another database"),
@@ -483,7 +494,7 @@ RemoveGXact(GlobalTransaction gxact)
TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
/* and put it back in the freelist */
- gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
LWLockRelease(TwoPhaseStateLock);
@@ -518,8 +529,9 @@ TransactionIdIsPrepared(TransactionId xid)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
- if (gxact->valid && gxact->proc.xid == xid)
+ if (gxact->valid && pgxact->xid == xid)
{
result = true;
break;
@@ -642,6 +654,8 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
while (status->array != NULL && status->currIdx < status->ngxacts)
{
GlobalTransaction gxact = &status->array[status->currIdx++];
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
Datum values[5];
bool nulls[5];
HeapTuple tuple;
@@ -656,11 +670,11 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
MemSet(values, 0, sizeof(values));
MemSet(nulls, 0, sizeof(nulls));
- values[0] = TransactionIdGetDatum(gxact->proc.xid);
+ values[0] = TransactionIdGetDatum(pgxact->xid);
values[1] = CStringGetTextDatum(gxact->gid);
values[2] = TimestampTzGetDatum(gxact->prepared_at);
values[3] = ObjectIdGetDatum(gxact->owner);
- values[4] = ObjectIdGetDatum(gxact->proc.databaseId);
+ values[4] = ObjectIdGetDatum(proc->databaseId);
tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
result = HeapTupleGetDatum(tuple);
@@ -711,10 +725,11 @@ TwoPhaseGetDummyProc(TransactionId xid)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
- if (gxact->proc.xid == xid)
+ if (pgxact->xid == xid)
{
- result = &gxact->proc;
+ result = &ProcGlobal->allProcs[gxact->pgprocno];
break;
}
}
@@ -841,7 +856,9 @@ save_state_data(const void *data, uint32 len)
void
StartPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+ TransactionId xid = pgxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
RelFileNode *commitrels;
@@ -865,7 +882,7 @@ StartPrepare(GlobalTransaction gxact)
hdr.magic = TWOPHASE_MAGIC;
hdr.total_len = 0; /* EndPrepare will fill this in */
hdr.xid = xid;
- hdr.database = gxact->proc.databaseId;
+ hdr.database = proc->databaseId;
hdr.prepared_at = gxact->prepared_at;
hdr.owner = gxact->owner;
hdr.nsubxacts = xactGetCommittedChildren(&children);
@@ -913,7 +930,8 @@ StartPrepare(GlobalTransaction gxact)
void
EndPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+ TransactionId xid = pgxact->xid;
TwoPhaseFileHeader *hdr;
char path[MAXPGPATH];
XLogRecData *record;
@@ -1021,7 +1039,7 @@ EndPrepare(GlobalTransaction gxact)
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyPgXact->inCommit = true;
gxact->prepare_lsn = XLogInsert(RM_XACT_ID, XLOG_XACT_PREPARE,
records.head);
@@ -1069,7 +1087,7 @@ EndPrepare(GlobalTransaction gxact)
* checkpoint starting after this will certainly see the gxact as a
* candidate for fsyncing.
*/
- MyProc->inCommit = false;
+ MyPgXact->inCommit = false;
END_CRIT_SECTION();
@@ -1242,6 +1260,8 @@ void
FinishPreparedTransaction(const char *gid, bool isCommit)
{
GlobalTransaction gxact;
+ PGPROC *proc;
+ PGXACT *pgxact;
TransactionId xid;
char *buf;
char *bufptr;
@@ -1260,7 +1280,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
* try to commit the same GID at once.
*/
gxact = LockGXact(gid, GetUserId());
- xid = gxact->proc.xid;
+ proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+ xid = pgxact->xid;
/*
* Read and validate the state file
@@ -1309,7 +1331,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
hdr->nsubxacts, children,
hdr->nabortrels, abortrels);
- ProcArrayRemove(&gxact->proc, latestXid);
+ ProcArrayRemove(proc, latestXid);
/*
* In case we fail while running the callbacks, mark the gxact invalid so
@@ -1540,10 +1562,11 @@ CheckPointTwoPhase(XLogRecPtr redo_horizon)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
if (gxact->valid &&
XLByteLE(gxact->prepare_lsn, redo_horizon))
- xids[nxids++] = gxact->proc.xid;
+ xids[nxids++] = pgxact->xid;
}
LWLockRelease(TwoPhaseStateLock);
@@ -1972,7 +1995,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
START_CRIT_SECTION();
/* See notes in RecordTransactionCommit */
- MyProc->inCommit = true;
+ MyPgXact->inCommit = true;
/* Emit the XLOG commit record */
xlrec.xid = xid;
@@ -2037,7 +2060,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
TransactionIdCommitTree(xid, nchildren, children);
/* Checkpoint can proceed now */
- MyProc->inCommit = false;
+ MyPgXact->inCommit = false;
END_CRIT_SECTION();
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 61dcfed..443e5e4 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -54,7 +54,7 @@ GetNewTransactionId(bool isSubXact)
if (IsBootstrapProcessingMode())
{
Assert(!isSubXact);
- MyProc->xid = BootstrapTransactionId;
+ MyPgXact->xid = BootstrapTransactionId;
return BootstrapTransactionId;
}
@@ -208,20 +208,21 @@ GetNewTransactionId(bool isSubXact)
* TransactionId and int fetch/store are atomic.
*/
volatile PGPROC *myproc = MyProc;
+ volatile PGXACT *mypgxact = MyPgXact;
if (!isSubXact)
- myproc->xid = xid;
+ mypgxact->xid = xid;
else
{
- int nxids = myproc->subxids.nxids;
+ int nxids = mypgxact->nxids;
if (nxids < PGPROC_MAX_CACHED_SUBXIDS)
{
myproc->subxids.xids[nxids] = xid;
- myproc->subxids.nxids = nxids + 1;
+ mypgxact->nxids = nxids + 1;
}
else
- myproc->subxids.overflowed = true;
+ mypgxact->overflowed = true;
}
}
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c151d3b..c383011 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -981,7 +981,7 @@ RecordTransactionCommit(void)
* bit fuzzy, but it doesn't matter.
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyPgXact->inCommit = true;
SetCurrentTransactionStopTimestamp();
@@ -1155,7 +1155,7 @@ RecordTransactionCommit(void)
*/
if (markXidCommitted)
{
- MyProc->inCommit = false;
+ MyPgXact->inCommit = false;
END_CRIT_SECTION();
}
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 32985a4..3143246 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -223,7 +223,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* OK, let's do it. First let other backends know I'm in ANALYZE.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_ANALYZE;
+ MyPgXact->vacuumFlags |= PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
/*
@@ -250,7 +250,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* because the vacuum flag is cleared by the end-of-xact code.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags &= ~PROC_IN_ANALYZE;
+ MyPgXact->vacuumFlags &= ~PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f42504c..89e190d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -893,9 +893,9 @@ vacuum_rel(Oid relid, VacuumStmt *vacstmt, bool do_toast, bool for_wraparound)
* which is probably Not Good.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_VACUUM;
+ MyPgXact->vacuumFlags |= PROC_IN_VACUUM;
if (for_wraparound)
- MyProc->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
+ MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6758083..963189d 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -430,6 +430,7 @@ typedef struct
slock_t *ProcStructLock;
PROC_HDR *ProcGlobal;
PGPROC *AuxiliaryProcs;
+ PGPROC *PreparedXactProcs;
PMSignalData *PMSignalState;
InheritableSocket pgStatSock;
pid_t PostmasterPid;
@@ -4724,6 +4725,7 @@ save_backend_variables(BackendParameters *param, Port *port,
param->ProcStructLock = ProcStructLock;
param->ProcGlobal = ProcGlobal;
param->AuxiliaryProcs = AuxiliaryProcs;
+ param->PreparedXactProcs = PreparedXactProcs;
param->PMSignalState = PMSignalState;
if (!write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid))
return false;
@@ -4947,6 +4949,7 @@ restore_backend_variables(BackendParameters *param, Port *port)
ProcStructLock = param->ProcStructLock;
ProcGlobal = param->ProcGlobal;
AuxiliaryProcs = param->AuxiliaryProcs;
+ PreparedXactProcs = param->PreparedXactProcs;
PMSignalState = param->PMSignalState;
read_inheritable_socket(&pgStatSock, &param->pgStatSock);
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index dd2d6ee..ea86520 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -702,7 +702,7 @@ ProcessStandbyHSFeedbackMessage(void)
* safe, and if we're moving it backwards, well, the data is at risk
* already since a VACUUM could have just finished calling GetOldestXmin.)
*/
- MyProc->xmin = reply.xmin;
+ MyPgXact->xmin = reply.xmin;
}
/* Main loop of walsender process */
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 56c0bd8..bb8b832 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -192,7 +192,6 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
XLOGShmemInit();
CLOGShmemInit();
SUBTRANSShmemInit();
- TwoPhaseShmemInit();
MultiXactShmemInit();
InitBufferPool();
@@ -213,6 +212,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
InitProcGlobal();
CreateSharedProcArray();
CreateSharedBackendStatus();
+ TwoPhaseShmemInit();
/*
* Set up shared-inval messaging
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1a48485..6986c57 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -82,14 +82,17 @@ typedef struct ProcArrayStruct
TransactionId lastOverflowedXid;
/*
- * We declare procs[] as 1 entry because C wants a fixed-size array, but
+ * We declare pgprocnos[] as 1 entry because C wants a fixed-size array, but
* actually it is maxProcs entries long.
*/
- PGPROC *procs[1]; /* VARIABLE LENGTH ARRAY */
+ int pgprocnos[1]; /* VARIABLE LENGTH ARRAY */
} ProcArrayStruct;
static ProcArrayStruct *procArray;
+static PGPROC *allProcs;
+static PGXACT *allPgXact;
+
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
@@ -169,8 +172,8 @@ ProcArrayShmemSize(void)
/* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, procs);
- size = add_size(size, mul_size(sizeof(PGPROC *), PROCARRAY_MAXPROCS));
+ size = offsetof(ProcArrayStruct, pgprocnos);
+ size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
/*
* During Hot Standby processing we have a data structure called
@@ -211,8 +214,8 @@ CreateSharedProcArray(void)
/* Create or attach to the ProcArray shared structure */
procArray = (ProcArrayStruct *)
ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, procs),
- mul_size(sizeof(PGPROC *),
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int),
PROCARRAY_MAXPROCS)),
&found);
@@ -231,6 +234,9 @@ CreateSharedProcArray(void)
procArray->lastOverflowedXid = InvalidTransactionId;
}
+ allProcs = ProcGlobal->allProcs;
+ allPgXact = ProcGlobal->allPgXact;
+
/* Create or attach to the KnownAssignedXids arrays too, if needed */
if (EnableHotStandby)
{
@@ -253,6 +259,7 @@ void
ProcArrayAdd(PGPROC *proc)
{
ProcArrayStruct *arrayP = procArray;
+ int index;
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -269,7 +276,28 @@ ProcArrayAdd(PGPROC *proc)
errmsg("sorry, too many clients already")));
}
- arrayP->procs[arrayP->numProcs] = proc;
+ /*
+ * Keep the procs array sorted by (PGPROC *) so that we can make better
+ * use of locality of reference. This is useful while traversing the
+ * ProcArray because there is an increased likelihood of finding the next
+ * PGPROC structure in the cache.
+ *
+ * Since procs are added and removed far less often than the ProcArray
+ * is scanned, the overhead should be marginal.
+ */
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ /*
+ * If we are the first PGPROC or if we have found our right position in
+ * the array, break
+ */
+ if ((arrayP->pgprocnos[index] == -1) || (arrayP->pgprocnos[index] > proc->pgprocno))
+ break;
+ }
+
+ memmove(&arrayP->pgprocnos[index + 1], &arrayP->pgprocnos[index],
+ (arrayP->numProcs - index) * sizeof (int));
+ arrayP->pgprocnos[index] = proc->pgprocno;
arrayP->numProcs++;
LWLockRelease(ProcArrayLock);
@@ -301,7 +329,7 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
if (TransactionIdIsValid(latestXid))
{
- Assert(TransactionIdIsValid(proc->xid));
+ Assert(TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
/* Advance global latestCompletedXid while holding the lock */
if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,
@@ -311,15 +339,17 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
else
{
/* Shouldn't be trying to remove a live transaction here */
- Assert(!TransactionIdIsValid(proc->xid));
+ Assert(!TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
}
for (index = 0; index < arrayP->numProcs; index++)
{
- if (arrayP->procs[index] == proc)
+ if (arrayP->pgprocnos[index] == proc->pgprocno)
{
- arrayP->procs[index] = arrayP->procs[arrayP->numProcs - 1];
- arrayP->procs[arrayP->numProcs - 1] = NULL; /* for debugging */
+ /* Keep the PGPROC array sorted. See notes above */
+ memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
+ (arrayP->numProcs - index - 1) * sizeof (int));
+ arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
arrayP->numProcs--;
LWLockRelease(ProcArrayLock);
return;
@@ -349,29 +379,31 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
void
ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
{
+ PGXACT *pgxact = &allPgXact[proc->pgprocno];
+
if (TransactionIdIsValid(latestXid))
{
/*
- * We must lock ProcArrayLock while clearing proc->xid, so that we do
- * not exit the set of "running" transactions while someone else is
- * taking a snapshot. See discussion in
+ * We must lock ProcArrayLock while clearing our advertised XID, so
+ * that we do not exit the set of "running" transactions while someone
+ * else is taking a snapshot. See discussion in
* src/backend/access/transam/README.
*/
- Assert(TransactionIdIsValid(proc->xid));
+ Assert(TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- proc->xid = InvalidTransactionId;
+ pgxact->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ pgxact->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ pgxact->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
/* Clear the subtransaction-XID cache too while holding the lock */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ pgxact->nxids = 0;
+ pgxact->overflowed = false;
/* Also advance global latestCompletedXid while holding the lock */
if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,
@@ -387,17 +419,17 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
* anyone else's calculation of a snapshot. We might change their
* estimate of global xmin, but that's OK.
*/
- Assert(!TransactionIdIsValid(proc->xid));
+ Assert(!TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ pgxact->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ pgxact->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
- Assert(proc->subxids.nxids == 0);
- Assert(proc->subxids.overflowed == false);
+ Assert(pgxact->nxids == 0);
+ Assert(pgxact->overflowed == false);
}
}
@@ -413,24 +445,26 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
void
ProcArrayClearTransaction(PGPROC *proc)
{
+ PGXACT *pgxact = &allPgXact[proc->pgprocno];
+
/*
* We can skip locking ProcArrayLock here, because this action does not
* actually change anyone's view of the set of running XIDs: our entry is
* duplicate with the gxact that has already been inserted into the
* ProcArray.
*/
- proc->xid = InvalidTransactionId;
+ pgxact->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ pgxact->xmin = InvalidTransactionId;
proc->recoveryConflictPending = false;
/* redundant, but just in case */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false;
+ pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ pgxact->inCommit = false;
/* Clear the subtransaction-XID cache too */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ pgxact->nxids = 0;
+ pgxact->overflowed = false;
}
/*
@@ -811,15 +845,17 @@ TransactionIdIsInProgress(TransactionId xid)
/* No shortcuts, gotta grovel through the array */
for (i = 0; i < arrayP->numProcs; i++)
{
- volatile PGPROC *proc = arrayP->procs[i];
- TransactionId pxid;
+ int pgprocno = arrayP->pgprocnos[i];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Ignore my own proc --- dealt with it above */
if (proc == MyProc)
continue;
/* Fetch xid just once - see GetNewTransactionId */
- pxid = proc->xid;
+ pxid = pgxact->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -844,7 +880,7 @@ TransactionIdIsInProgress(TransactionId xid)
/*
* Step 2: check the cached child-Xids arrays
*/
- for (j = proc->subxids.nxids - 1; j >= 0; j--)
+ for (j = pgxact->nxids - 1; j >= 0; j--)
{
/* Fetch xid just once - see GetNewTransactionId */
TransactionId cxid = proc->subxids.xids[j];
@@ -864,7 +900,7 @@ TransactionIdIsInProgress(TransactionId xid)
* we hold ProcArrayLock. So we can't miss an Xid that we need to
* worry about.)
*/
- if (proc->subxids.overflowed)
+ if (pgxact->overflowed)
xids[nxids++] = pxid;
}
@@ -965,10 +1001,13 @@ TransactionIdIsActive(TransactionId xid)
for (i = 0; i < arrayP->numProcs; i++)
{
- volatile PGPROC *proc = arrayP->procs[i];
+ int pgprocno = arrayP->pgprocnos[i];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = pgxact->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -1060,9 +1099,11 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
- if (ignoreVacuum && (proc->vacuumFlags & PROC_IN_VACUUM))
+ if (ignoreVacuum && (pgxact->vacuumFlags & PROC_IN_VACUUM))
continue;
if (allDbs ||
@@ -1070,7 +1111,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
proc->databaseId == 0) /* always include WalSender */
{
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId xid = proc->xid;
+ TransactionId xid = pgxact->xid;
/* First consider the transaction's own Xid, if any */
if (TransactionIdIsNormal(xid) &&
@@ -1084,7 +1125,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
* have an Xmin but not (yet) an Xid; conversely, if it has an
* Xid, that could determine some not-yet-set Xmin.
*/
- xid = proc->xmin; /* Fetch just once */
+ xid = pgxact->xmin; /* Fetch just once */
if (TransactionIdIsNormal(xid) &&
TransactionIdPrecedes(xid, result))
result = xid;
@@ -1200,6 +1241,8 @@ GetSnapshotData(Snapshot snapshot)
int count = 0;
int subcount = 0;
bool suboverflowed = false;
+ static TransactionId *xmins = NULL;
+ int numProcs;
Assert(snapshot != NULL);
@@ -1235,6 +1278,15 @@ GetSnapshotData(Snapshot snapshot)
errmsg("out of memory")));
}
+ if (xmins == NULL)
+ {
+ xmins = malloc(procArray->maxProcs * sizeof(TransactionId));
+ if (xmins == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+ }
+
/*
* It is sufficient to get shared lock on ProcArrayLock, even if we are
* going to set MyProc->xmin.
@@ -1261,31 +1313,32 @@ GetSnapshotData(Snapshot snapshot)
if (!snapshot->takenDuringRecovery)
{
+ int *pgprocnos = arrayP->pgprocnos;
+
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
* to gather all active xids, find the lowest xmin, and try to record
- * subxids. During recovery no xids will be assigned, so all normal
- * backends can be ignored, nor are there any VACUUMs running. All
- * prepared transaction xids are held in KnownAssignedXids, so these
- * will be seen without needing to loop through procs here.
+ * subxids.
*/
- for (index = 0; index < arrayP->numProcs; index++)
+ numProcs = arrayP->numProcs;
+ for (index = 0; index < numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
- TransactionId xid;
+ int pgprocno = pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId xid;
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (pgxact->vacuumFlags & PROC_IN_VACUUM)
+ {
+ xmins[index] = InvalidTransactionId;
continue;
+ }
/* Update globalxmin to be the smallest valid xmin */
- xid = proc->xmin; /* fetch just once */
- if (TransactionIdIsNormal(xid) &&
- TransactionIdPrecedes(xid, globalxmin))
- globalxmin = xid;
+ xmins[index] = pgxact->xmin; /* fetch just once */
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = pgxact->xid;
/*
* If the transaction has been assigned an xid < xmax we add it to
@@ -1300,7 +1353,7 @@ GetSnapshotData(Snapshot snapshot)
{
if (TransactionIdFollowsOrEquals(xid, xmax))
continue;
- if (proc != MyProc)
+ if (pgxact != MyPgXact)
snapshot->xip[count++] = xid;
if (TransactionIdPrecedes(xid, xmin))
xmin = xid;
@@ -1321,16 +1374,17 @@ GetSnapshotData(Snapshot snapshot)
*
* Again, our own XIDs are not included in the snapshot.
*/
- if (!suboverflowed && proc != MyProc)
+ if (!suboverflowed && pgxact != MyPgXact)
{
- if (proc->subxids.overflowed)
+ if (pgxact->overflowed)
suboverflowed = true;
else
{
- int nxids = proc->subxids.nxids;
+ int nxids = pgxact->nxids;
if (nxids > 0)
{
+ volatile PGPROC *proc = &allProcs[pgprocno];
memcpy(snapshot->subxip + subcount,
(void *) proc->subxids.xids,
nxids * sizeof(TransactionId));
@@ -1342,6 +1396,7 @@ GetSnapshotData(Snapshot snapshot)
}
else
{
+ numProcs = 0;
/*
* We're in hot standby, so get XIDs from KnownAssignedXids.
*
@@ -1372,9 +1427,8 @@ GetSnapshotData(Snapshot snapshot)
suboverflowed = true;
}
- if (!TransactionIdIsValid(MyProc->xmin))
- MyProc->xmin = TransactionXmin = xmin;
-
+ if (!TransactionIdIsValid(MyPgXact->xmin))
+ MyPgXact->xmin = TransactionXmin = xmin;
LWLockRelease(ProcArrayLock);
/*
@@ -1382,6 +1436,14 @@ GetSnapshotData(Snapshot snapshot)
* different way of computing it than GetOldestXmin uses, but should give
* the same result.
*/
+ for (index = 0; index < numProcs; index++)
+ {
+ TransactionId xid = xmins[index];
+ if (TransactionIdIsNormal(xid) &&
+ TransactionIdPrecedes(xid, globalxmin))
+ globalxmin = xid;
+ }
+
if (TransactionIdPrecedes(xmin, globalxmin))
globalxmin = xmin;
@@ -1436,14 +1498,16 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
- TransactionId xid;
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId xid;
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (pgxact->vacuumFlags & PROC_IN_VACUUM)
continue;
- xid = proc->xid; /* fetch just once */
+ xid = pgxact->xid; /* fetch just once */
if (xid != sourcexid)
continue;
@@ -1459,7 +1523,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
/*
* Likewise, let's just make real sure its xmin does cover us.
*/
- xid = proc->xmin; /* fetch just once */
+ xid = pgxact->xmin; /* fetch just once */
if (!TransactionIdIsNormal(xid) ||
!TransactionIdPrecedesOrEquals(xid, xmin))
continue;
@@ -1470,7 +1534,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
* GetSnapshotData first, we'll be overwriting a valid xmin here,
* so we don't check that.)
*/
- MyProc->xmin = TransactionXmin = xmin;
+ MyPgXact->xmin = TransactionXmin = xmin;
result = true;
break;
@@ -1562,12 +1626,14 @@ GetRunningTransactionData(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
TransactionId xid;
int nxids;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = pgxact->xid;
/*
* We don't need to store transactions that don't have a TransactionId
@@ -1585,7 +1651,7 @@ GetRunningTransactionData(void)
* Save subtransaction XIDs. Other backends can't add or remove
* entries while we're holding XidGenLock.
*/
- nxids = proc->subxids.nxids;
+ nxids = pgxact->nxids;
if (nxids > 0)
{
memcpy(&xids[count], (void *) proc->subxids.xids,
@@ -1593,7 +1659,7 @@ GetRunningTransactionData(void)
count += nxids;
subcount += nxids;
- if (proc->subxids.overflowed)
+ if (pgxact->overflowed)
suboverflowed = true;
/*
@@ -1653,11 +1719,12 @@ GetOldestActiveTransactionId(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
TransactionId xid;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = pgxact->xid;
if (!TransactionIdIsNormal(xid))
continue;
@@ -1709,12 +1776,14 @@ GetTransactionsInCommit(TransactionId **xids_p)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = pgxact->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (pgxact->inCommit && TransactionIdIsValid(pxid))
xids[nxids++] = pxid;
}
@@ -1744,12 +1813,14 @@ HaveTransactionsInCommit(TransactionId *xids, int nxids)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = pgxact->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (pgxact->inCommit && TransactionIdIsValid(pxid))
{
int i;
@@ -1792,7 +1863,7 @@ BackendPidGetProc(int pid)
for (index = 0; index < arrayP->numProcs; index++)
{
- PGPROC *proc = arrayP->procs[index];
+ PGPROC *proc = &allProcs[arrayP->pgprocnos[index]];
if (proc->pid == pid)
{
@@ -1833,9 +1904,11 @@ BackendXidGetPid(TransactionId xid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
- if (proc->xid == xid)
+ if (pgxact->xid == xid)
{
result = proc->pid;
break;
@@ -1901,18 +1974,20 @@ GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0,
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
if (proc == MyProc)
continue;
- if (excludeVacuum & proc->vacuumFlags)
+ if (excludeVacuum & pgxact->vacuumFlags)
continue;
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xmin just once - might change on us */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = pgxact->xmin;
if (excludeXmin0 && !TransactionIdIsValid(pxmin))
continue;
@@ -1996,7 +2071,9 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
/* Exclude prepared transactions */
if (proc->pid == 0)
@@ -2006,7 +2083,7 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
proc->databaseId == dbOid)
{
/* Fetch xmin just once - can't change on us, but good coding */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = pgxact->xmin;
/*
* We ignore an invalid pxmin because this means that backend has
@@ -2050,8 +2127,9 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
for (index = 0; index < arrayP->numProcs; index++)
{
- VirtualTransactionId procvxid;
- PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
GET_VXID_FROM_PGPROC(procvxid, *proc);
@@ -2104,7 +2182,9 @@ MinimumActiveBackends(int min)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
/*
* Since we're not holding a lock, need to check that the pointer is
@@ -2122,10 +2202,10 @@ MinimumActiveBackends(int min)
if (proc == MyProc)
continue; /* do not count myself */
+ if (pgxact->xid == InvalidTransactionId)
+ continue; /* do not count if no XID assigned */
if (proc->pid == 0)
continue; /* do not count prepared xacts */
- if (proc->xid == InvalidTransactionId)
- continue; /* do not count if no XID assigned */
if (proc->waitLock != NULL)
continue; /* do not count if blocked on a lock */
count++;
@@ -2150,7 +2230,8 @@ CountDBBackends(Oid databaseid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (proc->pid == 0)
continue; /* do not count prepared xacts */
@@ -2179,7 +2260,8 @@ CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conflictPending)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (databaseid == InvalidOid || proc->databaseId == databaseid)
{
@@ -2217,7 +2299,8 @@ CountUserBackends(Oid roleid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (proc->pid == 0)
continue; /* do not count prepared xacts */
@@ -2277,7 +2360,9 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
if (proc->databaseId != databaseId)
continue;
@@ -2291,7 +2376,7 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
else
{
(*nbackends)++;
- if ((proc->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ if ((pgxact->vacuumFlags & PROC_IS_AUTOVACUUM) &&
nautovacs < MAXAUTOVACPIDS)
autovac_pids[nautovacs++] = proc->pid;
}
@@ -2321,8 +2406,8 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
#define XidCacheRemove(i) \
do { \
- MyProc->subxids.xids[i] = MyProc->subxids.xids[MyProc->subxids.nxids - 1]; \
- MyProc->subxids.nxids--; \
+ MyProc->subxids.xids[i] = MyProc->subxids.xids[MyPgXact->nxids - 1]; \
+ MyPgXact->nxids--; \
} while (0)
/*
@@ -2361,7 +2446,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
{
TransactionId anxid = xids[i];
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyPgXact->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], anxid))
{
@@ -2377,11 +2462,11 @@ XidCacheRemoveRunningXids(TransactionId xid,
* error during AbortSubTransaction. So instead of Assert, emit a
* debug warning.
*/
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyPgXact->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", anxid);
}
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyPgXact->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], xid))
{
@@ -2390,7 +2475,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
}
}
/* Ordinarily we should have found it, unless the cache has overflowed */
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyPgXact->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", xid);
/* Also advance global latestCompletedXid while holding the lock */
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 7e7f6af..63326b8 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -450,6 +450,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
int *nSoftEdges) /* output argument */
{
PGPROC *proc;
+ PGXACT *pgxact;
LOCK *lock;
PROCLOCK *proclock;
SHM_QUEUE *procLocks;
@@ -516,6 +517,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
while (proclock)
{
proc = proclock->tag.myProc;
+ pgxact = &ProcGlobal->allPgXact[proc->pgprocno];
/* A proc never blocks itself */
if (proc != checkProc)
@@ -541,7 +543,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
* vacuumFlag bit), but we don't do that here to avoid
* grabbing ProcArrayLock.
*/
- if (proc->vacuumFlags & PROC_IS_AUTOVACUUM)
+ if (pgxact->vacuumFlags & PROC_IS_AUTOVACUUM)
blocking_autovacuum_proc = proc;
/* This proc hard-blocks checkProc */
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 905502f..3ba4671 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3188,9 +3188,10 @@ GetRunningTransactionLocks(int *nlocks)
proclock->tag.myLock->tag.locktag_type == LOCKTAG_RELATION)
{
PGPROC *proc = proclock->tag.myProc;
+ PGXACT *pgxact = &ProcGlobal->allPgXact[proc->pgprocno];
LOCK *lock = proclock->tag.myLock;
- accessExclusiveLocks[index].xid = proc->xid;
+ accessExclusiveLocks[index].xid = pgxact->xid;
accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;
accessExclusiveLocks[index].relOid = lock->tag.locktag_field2;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index eda3a98..bcbc802 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -36,6 +36,7 @@
#include <sys/time.h>
#include "access/transam.h"
+#include "access/twophase.h"
#include "access/xact.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -57,6 +58,7 @@ bool log_lock_waits = false;
/* Pointer to this process's PGPROC struct, if any */
PGPROC *MyProc = NULL;
+PGXACT *MyPgXact = NULL;
/*
* This spinlock protects the freelist of recycled PGPROC structures.
@@ -70,6 +72,7 @@ NON_EXEC_STATIC slock_t *ProcStructLock = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
+PGPROC *PreparedXactProcs = NULL;
/* If we are waiting for a lock, this points to the associated LOCALLOCK */
static LOCALLOCK *lockAwaited = NULL;
@@ -106,13 +109,19 @@ ProcGlobalShmemSize(void)
/* ProcGlobal */
size = add_size(size, sizeof(PROC_HDR));
- /* AuxiliaryProcs */
- size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
/* MyProcs, including autovacuum workers and launcher */
size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
+ /* AuxiliaryProcs */
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
+ /* Prepared xacts */
+ size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGPROC)));
/* ProcStructLock */
size = add_size(size, sizeof(slock_t));
+ size = add_size(size, mul_size(MaxBackends, sizeof(PGXACT)));
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGXACT)));
+ size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGXACT)));
+
return size;
}
@@ -157,10 +166,11 @@ void
InitProcGlobal(void)
{
PGPROC *procs;
+ PGXACT *pgxacts;
int i,
j;
bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS;
+ uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Create the ProcGlobal shared structure */
ProcGlobal = (PROC_HDR *)
@@ -182,10 +192,11 @@ InitProcGlobal(void)
* those used for 2PC, which are embedded within a GlobalTransactionData
* struct).
*
- * There are three separate consumers of PGPROC structures: (1) normal
- * backends, (2) autovacuum workers and the autovacuum launcher, and (3)
- * auxiliary processes. Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
+ * There are four separate consumers of PGPROC structures: (1) normal
+ * backends, (2) autovacuum workers and the autovacuum launcher, (3)
+ * auxiliary processes, and (4) prepared transactions. Each PGPROC
+ * structure is dedicated to exactly one of these purposes, and they do
+ * not move between groups.
*/
procs = (PGPROC *) ShmemAlloc(TotalProcs * sizeof(PGPROC));
ProcGlobal->allProcs = procs;
@@ -195,21 +206,43 @@ InitProcGlobal(void)
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of shared memory")));
MemSet(procs, 0, TotalProcs * sizeof(PGPROC));
+
+ /*
+ * Also allocate a separate array of PGXACT structures. This is separate
+ * from the main PGPROC array so that the most heavily accessed data is
+ * stored contiguously in memory in as few cache lines as possible. This
+ * provides significant performance benefits, especially on a
+ * multiprocessor system. There is one PGXACT structure for every PGPROC
+ * structure.
+ */
+ pgxacts = (PGXACT *) ShmemAlloc(TotalProcs * sizeof(PGXACT));
+ MemSet(pgxacts, 0, TotalProcs * sizeof(PGXACT));
+ ProcGlobal->allPgXact = pgxacts;
+
for (i = 0; i < TotalProcs; i++)
{
/* Common initialization for all PGPROCs, regardless of type. */
- /* Set up per-PGPROC semaphore, latch, and backendLock */
- PGSemaphoreCreate(&(procs[i].sem));
- InitSharedLatch(&(procs[i].procLatch));
- procs[i].backendLock = LWLockAssign();
+ /*
+ * Set up per-PGPROC semaphore, latch, and backendLock. Prepared
+ * xact dummy PGPROCs don't need these though - they're never
+ * associated with a real process.
+ */
+ if (i < MaxBackends + NUM_AUXILIARY_PROCS)
+ {
+ PGSemaphoreCreate(&(procs[i].sem));
+ InitSharedLatch(&(procs[i].procLatch));
+ procs[i].backendLock = LWLockAssign();
+ }
+ procs[i].pgprocno = i;
/*
* Newly created PGPROCs for normal backends or for autovacuum must
* be queued up on the appropriate free list. Because there can only
* ever be a small, fixed number of auxiliary processes, no free
* list is used in that case; InitAuxiliaryProcess() instead uses a
- * linear search.
+ * linear search. PGPROCs for prepared transactions are added to a
+ * free list by TwoPhaseShmemInit().
*/
if (i < MaxConnections)
{
@@ -230,10 +263,11 @@ InitProcGlobal(void)
}
/*
- * Save a pointer to the block of PGPROC structures reserved for
- * auxiliary proceses.
+ * Save pointers to the blocks of PGPROC structures reserved for
+ * auxiliary processes and prepared transactions.
*/
AuxiliaryProcs = &procs[MaxBackends];
+ PreparedXactProcs = &procs[MaxBackends + NUM_AUXILIARY_PROCS];
/* Create ProcStructLock spinlock, too */
ProcStructLock = (slock_t *) ShmemAlloc(sizeof(slock_t));
@@ -296,6 +330,7 @@ InitProcess(void)
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
errmsg("sorry, too many clients already")));
}
+ MyPgXact = &ProcGlobal->allPgXact[MyProc->pgprocno];
/*
* Now that we have a PGPROC, mark ourselves as an active postmaster
@@ -313,18 +348,18 @@ InitProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyPgXact->xid = InvalidTransactionId;
+ MyPgXact->xmin = InvalidTransactionId;
MyProc->pid = MyProcPid;
/* backendId, databaseId and roleId will be filled in later */
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyPgXact->inCommit = false;
+ MyPgXact->vacuumFlags = 0;
/* NB -- autovac launcher intentionally does not set IS_AUTOVACUUM */
if (IsAutoVacuumWorkerProcess())
- MyProc->vacuumFlags |= PROC_IS_AUTOVACUUM;
+ MyPgXact->vacuumFlags |= PROC_IS_AUTOVACUUM;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -462,6 +497,7 @@ InitAuxiliaryProcess(void)
((volatile PGPROC *) auxproc)->pid = MyProcPid;
MyProc = auxproc;
+ MyPgXact = &ProcGlobal->allPgXact[auxproc->pgprocno];
SpinLockRelease(ProcStructLock);
@@ -472,13 +508,13 @@ InitAuxiliaryProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyPgXact->xid = InvalidTransactionId;
+ MyPgXact->xmin = InvalidTransactionId;
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyPgXact->inCommit = false;
+ MyPgXact->vacuumFlags = 0;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -1045,6 +1081,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
if (deadlock_state == DS_BLOCKED_BY_AUTOVACUUM && allow_autovacuum_cancel)
{
PGPROC *autovac = GetBlockingAutoVacuumPgproc();
+ PGXACT *autovac_pgxact = &ProcGlobal->allPgXact[autovac->pgprocno];
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -1053,8 +1090,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
* wraparound.
*/
if ((autovac != NULL) &&
- (autovac->vacuumFlags & PROC_IS_AUTOVACUUM) &&
- !(autovac->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
+ (autovac_pgxact->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ !(autovac_pgxact->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
{
int pid = autovac->pid;
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 50fb780..814cd23 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -577,7 +577,7 @@ static void
SnapshotResetXmin(void)
{
if (RegisteredSnapshots == 0 && ActiveSnapshot == NULL)
- MyProc->xmin = InvalidTransactionId;
+ MyPgXact->xmin = InvalidTransactionId;
}
/*
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 6e798b1..c7cddc7 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -35,8 +35,6 @@
struct XidCache
{
- bool overflowed;
- int nxids;
TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS];
};
@@ -86,27 +84,14 @@ struct PGPROC
LocalTransactionId lxid; /* local id of top-level transaction currently
* being executed by this proc, if running;
* else InvalidLocalTransactionId */
-
- TransactionId xid; /* id of top-level transaction currently being
- * executed by this proc, if running and XID
- * is assigned; else InvalidTransactionId */
-
- TransactionId xmin; /* minimal running XID as it was when we were
- * starting our xact, excluding LAZY VACUUM:
- * vacuum must not remove tuples deleted by
- * xid >= xmin ! */
-
int pid; /* Backend's process ID; 0 if prepared xact */
+ int pgprocno; /* index of this PGPROC in ProcGlobal->allProcs */
/* These fields are zero while a backend is still starting up: */
BackendId backendId; /* This backend's backend ID (if assigned) */
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
- bool inCommit; /* true if within commit critical section */
-
- uint8 vacuumFlags; /* vacuum-related flags, see above */
-
/*
* While in hot standby mode, shows that a conflict signal has been sent
* for the current transaction. Set/cleared while holding ProcArrayLock,
@@ -160,7 +145,33 @@ struct PGPROC
extern PGDLLIMPORT PGPROC *MyProc;
+extern PGDLLIMPORT struct PGXACT *MyPgXact;
+
+/*
+ * Prior to PostgreSQL 9.2, the fields below were stored as part of the
+ * PGPROC. However, benchmarking revealed that packing these particular
+ * members into a separate array as tightly as possible sped up GetSnapshotData
+ * considerably on systems with many CPU cores, by reducing the number of
+ * cache lines needing to be fetched. Thus, think very carefully before adding
+ * anything else here.
+ */
+typedef struct PGXACT
+{
+ TransactionId xid; /* id of top-level transaction currently being
+ * executed by this proc, if running and XID
+ * is assigned; else InvalidTransactionId */
+
+ TransactionId xmin; /* minimal running XID as it was when we were
+ * starting our xact, excluding LAZY VACUUM:
+ * vacuum must not remove tuples deleted by
+ * xid >= xmin ! */
+
+ uint8 vacuumFlags; /* vacuum-related flags, see above */
+ bool overflowed;
+ bool inCommit; /* true if within commit critical section */
+ uint8 nxids;
+} PGXACT;
/*
* There is one ProcGlobal struct for the whole database cluster.
@@ -169,6 +180,8 @@ typedef struct PROC_HDR
{
/* Array of PGPROC structures (not including dummies for prepared txns) */
PGPROC *allProcs;
+ /* Array of PGXACT structures (not including dummies for prepared txns) */
+ PGXACT *allPgXact;
/* Length of allProcs array */
uint32 allProcCount;
/* Head of list of free PGPROC structures */
@@ -186,6 +199,8 @@ typedef struct PROC_HDR
extern PROC_HDR *ProcGlobal;
+extern PGPROC *PreparedXactProcs;
+
/*
* We set aside some extra PGPROC structures for auxiliary processes,
* ie things that aren't full-fledged backends but need shmem access.
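For readers skimming the diff, here is a minimal standalone sketch of the layout the patch moves to: a bulky per-process array and a compact array of hot fields, both indexed by the same slot number. This is not PostgreSQL code. The names DemoProc, DemoXact, DemoGlobal and the demo_* functions are hypothetical, the padding size is arbitrary, and the plain `<` comparison ignores XID wraparound, which the real code handles with TransactionIdPrecedes().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t DemoXid;               /* stand-in for TransactionId */
#define DEMO_INVALID_XID ((DemoXid) 0)

/* Hot fields only: small enough that several entries share a cache line. */
typedef struct DemoXact
{
    DemoXid     xid;
    DemoXid     xmin;
} DemoXact;

/* Cold, bulky per-process state; it remembers its own slot number. */
typedef struct DemoProc
{
    int         pgprocno;       /* index into both arrays */
    int         pid;
    char        padding[512];   /* stands in for the rest of a large struct */
} DemoProc;

typedef struct DemoGlobal
{
    DemoProc   *allProcs;
    DemoXact   *allPgXact;
    int         count;
} DemoGlobal;

/* Allocate both arrays with the same length; returns 0 on success. */
static int
demo_init(DemoGlobal *g, int count)
{
    g->allProcs = calloc((size_t) count, sizeof(DemoProc));
    g->allPgXact = calloc((size_t) count, sizeof(DemoXact));
    if (g->allProcs == NULL || g->allPgXact == NULL)
    {
        free(g->allProcs);
        free(g->allPgXact);
        return -1;
    }
    g->count = count;
    for (int i = 0; i < count; i++)
        g->allProcs[i].pgprocno = i;
    return 0;
}

/* Map a heavy entry to its compact companion through the shared index. */
static DemoXact *
demo_xact_for(DemoGlobal *g, const DemoProc *proc)
{
    assert(proc->pgprocno >= 0 && proc->pgprocno < g->count);
    return &g->allPgXact[proc->pgprocno];
}

/*
 * Scan only the compact array and return the smallest valid xmin,
 * or DEMO_INVALID_XID if no entry has one set.
 */
static DemoXid
demo_oldest_xmin(const DemoGlobal *g)
{
    DemoXid     result = DEMO_INVALID_XID;

    for (int i = 0; i < g->count; i++)
    {
        DemoXid     xmin = g->allPgXact[i].xmin;

        if (xmin == DEMO_INVALID_XID)
            continue;
        if (result == DEMO_INVALID_XID || xmin < result)
            result = xmin;
    }
    return result;
}
```

The scan in demo_oldest_xmin() never touches the 500-odd bytes of cold state per entry, which is the effect the patch is after in GetSnapshotData.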
On Tue, Nov 22, 2011 at 12:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm wondering about the changes to how globalxmin is computed in
> GetSnapshotData(). That seems like it could hurt performance in the
> single-client case, and especially in the case where there is one
> active connection and lots of idle connections, and I'm wondering how
> thoroughly we've tested that particular bit apart from these other
> changes.
I separated this part out - see attached. pgxact-v2.patch is the same
as what I posted on Tuesday with those changes factored out;
recentglobalxmin.patch applies over top of it, and contains just the
changes related to how RecentGlobalXmin is recomputed. I tried a
pgbench -S test, with five minute runs, scale factor 100,
shared_buffers = 8GB, on Nate Boley's 32-core box. I tried this on
unpatched master, with just pgxact-v2.patch, and with both patches.
At 1, 8, and 32 clients, pgxact-v2 is faster than master, but pgxact-v2 +
recentglobalxmin is slower than just pgxact-v2. When I get up to 80
clients, the recentglobalxmin portion becomes a win.
Just to make sure that I wasn't measuring random noise, I did seven
runs with each configuration rather than my usual three. The table
below shows the lowest, median, and highest tps results for each
tested configuration.
== 1 client ==
master: low 4312.806032 median 4343.971972 high 4395.870212
pgxact-v2: low 4391.273988 median 4408.283875 high 4448.315804
pgxact-v2+recentglobalxmin: low 4328.697157 median 4360.581749 high 4399.763953
== 8 clients ==
master: low 33295.56753 median 33356.682726 high 33639.368929
pgxact-v2: low 33719.004950 median 33927.195872 high 34106.679176
pgxact-v2+recentglobalxmin: low 33565.708794 median 33694.140297 high 33795.072927
== 32 clients ==
master: low 210514.460228 median 213168.532917 high 214169.174032
pgxact-v2: low 216230.082361 median 221678.274469 high 231465.732256
pgxact-v2+recentglobalxmin: low 212202.985015 median 219390.075074 high 223353.421472
== 80 clients ==
master: low 208139.915780 median 209867.352113 high 214868.151835
pgxact-v2: low 217003.716877 median 218360.946541 high 222095.321825
pgxact-v2+recentglobalxmin: low 219390.617005 median 220912.168056 high 221779.348531
I'm going to run some more tests, but my thought is that we should
probably leave the recentglobalxmin changes out for the time being,
pending further study and consideration of other alternatives.
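A minimal sketch of how low/median/high figures like the ones above can be derived from a set of runs, and how two configurations can be compared. The RunSummary type and the summarize_runs() and percent_gain() helpers are hypothetical names for illustration, not part of pgbench or of either patch.

```c
#include <stdlib.h>

typedef struct RunSummary
{
    double      low;
    double      median;
    double      high;
} RunSummary;

/* qsort comparator for doubles, ascending. */
static int
cmp_double(const void *a, const void *b)
{
    double      x = *(const double *) a;
    double      y = *(const double *) b;

    return (x > y) - (x < y);
}

/*
 * Summarize n tps results (n must be at least 1).  Sorts tps[] in place.
 * For an even n the median is the mean of the two middle values.
 */
static RunSummary
summarize_runs(double *tps, size_t n)
{
    RunSummary  s;

    qsort(tps, n, sizeof(double), cmp_double);
    s.low = tps[0];
    s.high = tps[n - 1];
    s.median = (n % 2 == 1)
        ? tps[n / 2]
        : (tps[n / 2 - 1] + tps[n / 2]) / 2.0;
    return s;
}

/* Relative change of candidate over baseline, in percent. */
static double
percent_gain(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}
```

For example, percent_gain(213168.532917, 221678.274469) compares the 32-client medians of master and pgxact-v2 from the table above.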
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
pgxact-v2.patch (application/octet-stream)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 477982d..d2fecb1 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -113,7 +113,8 @@ int max_prepared_xacts = 0;
typedef struct GlobalTransactionData
{
- PGPROC proc; /* dummy proc */
+ GlobalTransaction next;
+ int pgprocno; /* dummy proc */
BackendId dummyBackendId; /* similar to backend id for backends */
TimestampTz prepared_at; /* time of preparation */
XLogRecPtr prepare_lsn; /* XLOG offset of prepare record */
@@ -207,7 +208,8 @@ TwoPhaseShmemInit(void)
sizeof(GlobalTransaction) * max_prepared_xacts));
for (i = 0; i < max_prepared_xacts; i++)
{
- gxacts[i].proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxacts[i].pgprocno = PreparedXactProcs[i].pgprocno;
+ gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
/*
@@ -243,6 +245,8 @@ MarkAsPreparing(TransactionId xid, const char *gid,
TimestampTz prepared_at, Oid owner, Oid databaseid)
{
GlobalTransaction gxact;
+ PGPROC *proc;
+ PGXACT *pgxact;
int i;
if (strlen(gid) >= GIDSIZE)
@@ -274,7 +278,7 @@ MarkAsPreparing(TransactionId xid, const char *gid,
TwoPhaseState->numPrepXacts--;
TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
/* and put it back in the freelist */
- gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
/* Back up index count too, so we don't miss scanning one */
i--;
@@ -302,32 +306,36 @@ MarkAsPreparing(TransactionId xid, const char *gid,
errhint("Increase max_prepared_transactions (currently %d).",
max_prepared_xacts)));
gxact = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = (GlobalTransaction) gxact->proc.links.next;
+ TwoPhaseState->freeGXacts = (GlobalTransaction) gxact->next;
- /* Initialize it */
- MemSet(&gxact->proc, 0, sizeof(PGPROC));
- SHMQueueElemInit(&(gxact->proc.links));
- gxact->proc.waitStatus = STATUS_OK;
+ proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+
+ /* Initialize the PGPROC entry */
+ MemSet(proc, 0, sizeof(PGPROC));
+ proc->pgprocno = gxact->pgprocno;
+ SHMQueueElemInit(&(proc->links));
+ proc->waitStatus = STATUS_OK;
/* We set up the gxact's VXID as InvalidBackendId/XID */
- gxact->proc.lxid = (LocalTransactionId) xid;
- gxact->proc.xid = xid;
- gxact->proc.xmin = InvalidTransactionId;
- gxact->proc.pid = 0;
- gxact->proc.backendId = InvalidBackendId;
- gxact->proc.databaseId = databaseid;
- gxact->proc.roleId = owner;
- gxact->proc.inCommit = false;
- gxact->proc.vacuumFlags = 0;
- gxact->proc.lwWaiting = false;
- gxact->proc.lwExclusive = false;
- gxact->proc.lwWaitLink = NULL;
- gxact->proc.waitLock = NULL;
- gxact->proc.waitProcLock = NULL;
+ proc->lxid = (LocalTransactionId) xid;
+ pgxact->xid = xid;
+ pgxact->xmin = InvalidTransactionId;
+ pgxact->inCommit = false;
+ pgxact->vacuumFlags = 0;
+ proc->pid = 0;
+ proc->backendId = InvalidBackendId;
+ proc->databaseId = databaseid;
+ proc->roleId = owner;
+ proc->lwWaiting = false;
+ proc->lwExclusive = false;
+ proc->lwWaitLink = NULL;
+ proc->waitLock = NULL;
+ proc->waitProcLock = NULL;
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
- SHMQueueInit(&(gxact->proc.myProcLocks[i]));
+ SHMQueueInit(&(proc->myProcLocks[i]));
/* subxid data must be filled later by GXactLoadSubxactData */
- gxact->proc.subxids.overflowed = false;
- gxact->proc.subxids.nxids = 0;
+ pgxact->overflowed = false;
+ pgxact->nxids = 0;
gxact->prepared_at = prepared_at;
/* initialize LSN to 0 (start of WAL) */
@@ -358,17 +366,19 @@ static void
GXactLoadSubxactData(GlobalTransaction gxact, int nsubxacts,
TransactionId *children)
{
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
/* We need no extra lock since the GXACT isn't valid yet */
if (nsubxacts > PGPROC_MAX_CACHED_SUBXIDS)
{
- gxact->proc.subxids.overflowed = true;
+ pgxact->overflowed = true;
nsubxacts = PGPROC_MAX_CACHED_SUBXIDS;
}
if (nsubxacts > 0)
{
- memcpy(gxact->proc.subxids.xids, children,
+ memcpy(proc->subxids.xids, children,
nsubxacts * sizeof(TransactionId));
- gxact->proc.subxids.nxids = nsubxacts;
+ pgxact->nxids = nsubxacts;
}
}
@@ -389,7 +399,7 @@ MarkAsPrepared(GlobalTransaction gxact)
* Put it into the global ProcArray so TransactionIdIsInProgress considers
* the XID as still running.
*/
- ProcArrayAdd(&gxact->proc);
+ ProcArrayAdd(&ProcGlobal->allProcs[gxact->pgprocno]);
}
/*
@@ -406,6 +416,7 @@ LockGXact(const char *gid, Oid user)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
/* Ignore not-yet-valid GIDs */
if (!gxact->valid)
@@ -436,7 +447,7 @@ LockGXact(const char *gid, Oid user)
* there may be some other issues as well. Hence disallow until
* someone gets motivated to make it work.
*/
- if (MyDatabaseId != gxact->proc.databaseId)
+ if (MyDatabaseId != proc->databaseId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("prepared transaction belongs to another database"),
@@ -483,7 +494,7 @@ RemoveGXact(GlobalTransaction gxact)
TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
/* and put it back in the freelist */
- gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
+ gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
LWLockRelease(TwoPhaseStateLock);
@@ -518,8 +529,9 @@ TransactionIdIsPrepared(TransactionId xid)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
- if (gxact->valid && gxact->proc.xid == xid)
+ if (gxact->valid && pgxact->xid == xid)
{
result = true;
break;
@@ -642,6 +654,8 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
while (status->array != NULL && status->currIdx < status->ngxacts)
{
GlobalTransaction gxact = &status->array[status->currIdx++];
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
Datum values[5];
bool nulls[5];
HeapTuple tuple;
@@ -656,11 +670,11 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
MemSet(values, 0, sizeof(values));
MemSet(nulls, 0, sizeof(nulls));
- values[0] = TransactionIdGetDatum(gxact->proc.xid);
+ values[0] = TransactionIdGetDatum(pgxact->xid);
values[1] = CStringGetTextDatum(gxact->gid);
values[2] = TimestampTzGetDatum(gxact->prepared_at);
values[3] = ObjectIdGetDatum(gxact->owner);
- values[4] = ObjectIdGetDatum(gxact->proc.databaseId);
+ values[4] = ObjectIdGetDatum(proc->databaseId);
tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
result = HeapTupleGetDatum(tuple);
@@ -711,10 +725,11 @@ TwoPhaseGetDummyProc(TransactionId xid)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
- if (gxact->proc.xid == xid)
+ if (pgxact->xid == xid)
{
- result = &gxact->proc;
+ result = &ProcGlobal->allProcs[gxact->pgprocno];
break;
}
}
@@ -841,7 +856,9 @@ save_state_data(const void *data, uint32 len)
void
StartPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+ TransactionId xid = pgxact->xid;
TwoPhaseFileHeader hdr;
TransactionId *children;
RelFileNode *commitrels;
@@ -865,7 +882,7 @@ StartPrepare(GlobalTransaction gxact)
hdr.magic = TWOPHASE_MAGIC;
hdr.total_len = 0; /* EndPrepare will fill this in */
hdr.xid = xid;
- hdr.database = gxact->proc.databaseId;
+ hdr.database = proc->databaseId;
hdr.prepared_at = gxact->prepared_at;
hdr.owner = gxact->owner;
hdr.nsubxacts = xactGetCommittedChildren(&children);
@@ -913,7 +930,8 @@ StartPrepare(GlobalTransaction gxact)
void
EndPrepare(GlobalTransaction gxact)
{
- TransactionId xid = gxact->proc.xid;
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+ TransactionId xid = pgxact->xid;
TwoPhaseFileHeader *hdr;
char path[MAXPGPATH];
XLogRecData *record;
@@ -1021,7 +1039,7 @@ EndPrepare(GlobalTransaction gxact)
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyPgXact->inCommit = true;
gxact->prepare_lsn = XLogInsert(RM_XACT_ID, XLOG_XACT_PREPARE,
records.head);
@@ -1069,7 +1087,7 @@ EndPrepare(GlobalTransaction gxact)
* checkpoint starting after this will certainly see the gxact as a
* candidate for fsyncing.
*/
- MyProc->inCommit = false;
+ MyPgXact->inCommit = false;
END_CRIT_SECTION();
@@ -1242,6 +1260,8 @@ void
FinishPreparedTransaction(const char *gid, bool isCommit)
{
GlobalTransaction gxact;
+ PGPROC *proc;
+ PGXACT *pgxact;
TransactionId xid;
char *buf;
char *bufptr;
@@ -1260,7 +1280,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
* try to commit the same GID at once.
*/
gxact = LockGXact(gid, GetUserId());
- xid = gxact->proc.xid;
+ proc = &ProcGlobal->allProcs[gxact->pgprocno];
+ pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+ xid = pgxact->xid;
/*
* Read and validate the state file
@@ -1309,7 +1331,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
hdr->nsubxacts, children,
hdr->nabortrels, abortrels);
- ProcArrayRemove(&gxact->proc, latestXid);
+ ProcArrayRemove(proc, latestXid);
/*
* In case we fail while running the callbacks, mark the gxact invalid so
@@ -1540,10 +1562,11 @@ CheckPointTwoPhase(XLogRecPtr redo_horizon)
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+ PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
if (gxact->valid &&
XLByteLE(gxact->prepare_lsn, redo_horizon))
- xids[nxids++] = gxact->proc.xid;
+ xids[nxids++] = pgxact->xid;
}
LWLockRelease(TwoPhaseStateLock);
@@ -1972,7 +1995,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
START_CRIT_SECTION();
/* See notes in RecordTransactionCommit */
- MyProc->inCommit = true;
+ MyPgXact->inCommit = true;
/* Emit the XLOG commit record */
xlrec.xid = xid;
@@ -2037,7 +2060,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
TransactionIdCommitTree(xid, nchildren, children);
/* Checkpoint can proceed now */
- MyProc->inCommit = false;
+ MyPgXact->inCommit = false;
END_CRIT_SECTION();
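The twophase.c hunks above all follow one pattern: the embedded `gxact->proc` is gone, and both the PGPROC and its PGXACT are found by indexing two parallel arrays with `gxact->pgprocno`. A minimal sketch of that lookup, assuming simplified stand-in types (`ColdProc`, `HotXact`, `ProcHeader`, `hot_entry` and `lowest_xmin` are hypothetical names, not part of the patch, and the xmin comparison ignores XID wraparound):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidXid ((TransactionId) 0)

/* Hypothetical stand-in for PGPROC: large, rarely scanned. */
typedef struct ColdProc
{
    int      pgprocno;      /* index of this entry in both arrays */
    int      pid;
    uint32_t databaseId;
    char     rest[512];     /* stands in for the many other members */
} ColdProc;

/* Hypothetical stand-in for PGXACT: small, scanned on every snapshot. */
typedef struct HotXact
{
    TransactionId xid;
    TransactionId xmin;
    uint8_t       vacuumFlags;
} HotXact;

/* Hypothetical stand-in for PROC_HDR: owns the two parallel arrays. */
typedef struct ProcHeader
{
    ColdProc *allProcs;
    HotXact  *allPgXact;
} ProcHeader;

/* Map a cold entry to its hot counterpart via the shared index. */
static HotXact *
hot_entry(const ProcHeader *hdr, const ColdProc *proc)
{
    assert(proc->pgprocno >= 0);
    return &hdr->allPgXact[proc->pgprocno];
}

/* Scan only the hot array; the cold structs are never touched. */
static TransactionId
lowest_xmin(const ProcHeader *hdr, const int *pgprocnos, int numProcs)
{
    TransactionId result = InvalidXid;

    for (int i = 0; i < numProcs; i++)
    {
        TransactionId xmin = hdr->allPgXact[pgprocnos[i]].xmin;

        /* Plain integer comparison: a simplification of the real XID ordering. */
        if (xmin != InvalidXid && (result == InvalidXid || xmin < result))
            result = xmin;
    }
    return result;
}
```

Because the hot struct is a few bytes rather than several hundred, many entries share one cache line during such a scan, which is the effect the patch is after.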
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 61dcfed..443e5e4 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -54,7 +54,7 @@ GetNewTransactionId(bool isSubXact)
if (IsBootstrapProcessingMode())
{
Assert(!isSubXact);
- MyProc->xid = BootstrapTransactionId;
+ MyPgXact->xid = BootstrapTransactionId;
return BootstrapTransactionId;
}
@@ -208,20 +208,21 @@ GetNewTransactionId(bool isSubXact)
* TransactionId and int fetch/store are atomic.
*/
volatile PGPROC *myproc = MyProc;
+ volatile PGXACT *mypgxact = MyPgXact;
if (!isSubXact)
- myproc->xid = xid;
+ mypgxact->xid = xid;
else
{
- int nxids = myproc->subxids.nxids;
+ int nxids = mypgxact->nxids;
if (nxids < PGPROC_MAX_CACHED_SUBXIDS)
{
myproc->subxids.xids[nxids] = xid;
- myproc->subxids.nxids = nxids + 1;
+ mypgxact->nxids = nxids + 1;
}
else
- myproc->subxids.overflowed = true;
+ mypgxact->overflowed = true;
}
}
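The varsup.c hunk above splits the subtransaction cache in two: the counter and the overflow flag move to PGXACT, while the XID array itself stays in PGPROC. A single-threaded sketch of that split, assuming hypothetical names (`HotXact`, `ColdProc`, `cache_subxid`, `MAX_CACHED_SUBXIDS`) and making no claim about the memory-ordering guarantees the real code relies on:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical small limit; the patch uses PGPROC_MAX_CACHED_SUBXIDS. */
#define MAX_CACHED_SUBXIDS 4

typedef uint32_t TransactionId;

/* Hot part: only the bookkeeping that snapshot code reads for every entry. */
typedef struct HotXact
{
    uint8_t nxids;
    bool    overflowed;
} HotXact;

/* Cold part: the XID array, read only when nxids > 0. */
typedef struct ColdProc
{
    TransactionId subxids[MAX_CACHED_SUBXIDS];
} ColdProc;

/* Record a subtransaction XID, or mark the cache overflowed when it is full. */
static void
cache_subxid(ColdProc *proc, HotXact *xact, TransactionId xid)
{
    int nxids = xact->nxids;

    assert(nxids <= MAX_CACHED_SUBXIDS);

    if (nxids < MAX_CACHED_SUBXIDS)
    {
        proc->subxids[nxids] = xid;     /* store the value first */
        xact->nxids = nxids + 1;        /* then bump the visible count */
    }
    else
        xact->overflowed = true;
}
```

A reader that sees `nxids == 0` and `overflowed == false` in the hot entry never has to load the cold struct at all.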
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c151d3b..c383011 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -981,7 +981,7 @@ RecordTransactionCommit(void)
* bit fuzzy, but it doesn't matter.
*/
START_CRIT_SECTION();
- MyProc->inCommit = true;
+ MyPgXact->inCommit = true;
SetCurrentTransactionStopTimestamp();
@@ -1155,7 +1155,7 @@ RecordTransactionCommit(void)
*/
if (markXidCommitted)
{
- MyProc->inCommit = false;
+ MyPgXact->inCommit = false;
END_CRIT_SECTION();
}
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 32985a4..3143246 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -223,7 +223,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* OK, let's do it. First let other backends know I'm in ANALYZE.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_ANALYZE;
+ MyPgXact->vacuumFlags |= PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
/*
@@ -250,7 +250,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy)
* because the vacuum flag is cleared by the end-of-xact code.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags &= ~PROC_IN_ANALYZE;
+ MyPgXact->vacuumFlags &= ~PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f42504c..89e190d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -893,9 +893,9 @@ vacuum_rel(Oid relid, VacuumStmt *vacstmt, bool do_toast, bool for_wraparound)
* which is probably Not Good.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->vacuumFlags |= PROC_IN_VACUUM;
+ MyPgXact->vacuumFlags |= PROC_IN_VACUUM;
if (for_wraparound)
- MyProc->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
+ MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND;
LWLockRelease(ProcArrayLock);
}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6758083..963189d 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -430,6 +430,7 @@ typedef struct
slock_t *ProcStructLock;
PROC_HDR *ProcGlobal;
PGPROC *AuxiliaryProcs;
+ PGPROC *PreparedXactProcs;
PMSignalData *PMSignalState;
InheritableSocket pgStatSock;
pid_t PostmasterPid;
@@ -4724,6 +4725,7 @@ save_backend_variables(BackendParameters *param, Port *port,
param->ProcStructLock = ProcStructLock;
param->ProcGlobal = ProcGlobal;
param->AuxiliaryProcs = AuxiliaryProcs;
+ param->PreparedXactProcs = PreparedXactProcs;
param->PMSignalState = PMSignalState;
if (!write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid))
return false;
@@ -4947,6 +4949,7 @@ restore_backend_variables(BackendParameters *param, Port *port)
ProcStructLock = param->ProcStructLock;
ProcGlobal = param->ProcGlobal;
AuxiliaryProcs = param->AuxiliaryProcs;
+ PreparedXactProcs = param->PreparedXactProcs;
PMSignalState = param->PMSignalState;
read_inheritable_socket(&pgStatSock, &param->pgStatSock);
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index dd2d6ee..ea86520 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -702,7 +702,7 @@ ProcessStandbyHSFeedbackMessage(void)
* safe, and if we're moving it backwards, well, the data is at risk
* already since a VACUUM could have just finished calling GetOldestXmin.)
*/
- MyProc->xmin = reply.xmin;
+ MyPgXact->xmin = reply.xmin;
}
/* Main loop of walsender process */
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 56c0bd8..bb8b832 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -192,7 +192,6 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
XLOGShmemInit();
CLOGShmemInit();
SUBTRANSShmemInit();
- TwoPhaseShmemInit();
MultiXactShmemInit();
InitBufferPool();
@@ -213,6 +212,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
InitProcGlobal();
CreateSharedProcArray();
CreateSharedBackendStatus();
+ TwoPhaseShmemInit();
/*
* Set up shared-inval messaging
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1a48485..19ff524 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -82,14 +82,17 @@ typedef struct ProcArrayStruct
TransactionId lastOverflowedXid;
/*
- * We declare procs[] as 1 entry because C wants a fixed-size array, but
+ * We declare pgprocnos[] as 1 entry because C wants a fixed-size array, but
* actually it is maxProcs entries long.
*/
- PGPROC *procs[1]; /* VARIABLE LENGTH ARRAY */
+ int pgprocnos[1]; /* VARIABLE LENGTH ARRAY */
} ProcArrayStruct;
static ProcArrayStruct *procArray;
+static PGPROC *allProcs;
+static PGXACT *allPgXact;
+
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
@@ -169,8 +172,8 @@ ProcArrayShmemSize(void)
/* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, procs);
- size = add_size(size, mul_size(sizeof(PGPROC *), PROCARRAY_MAXPROCS));
+ size = offsetof(ProcArrayStruct, pgprocnos);
+ size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
/*
* During Hot Standby processing we have a data structure called
@@ -211,8 +214,8 @@ CreateSharedProcArray(void)
/* Create or attach to the ProcArray shared structure */
procArray = (ProcArrayStruct *)
ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, procs),
- mul_size(sizeof(PGPROC *),
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int),
PROCARRAY_MAXPROCS)),
&found);
@@ -231,6 +234,9 @@ CreateSharedProcArray(void)
procArray->lastOverflowedXid = InvalidTransactionId;
}
+ allProcs = ProcGlobal->allProcs;
+ allPgXact = ProcGlobal->allPgXact;
+
/* Create or attach to the KnownAssignedXids arrays too, if needed */
if (EnableHotStandby)
{
@@ -253,6 +259,7 @@ void
ProcArrayAdd(PGPROC *proc)
{
ProcArrayStruct *arrayP = procArray;
+ int index;
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -269,7 +276,28 @@ ProcArrayAdd(PGPROC *proc)
errmsg("sorry, too many clients already")));
}
- arrayP->procs[arrayP->numProcs] = proc;
+ /*
+ * Keep the procs array sorted by (PGPROC *) so that we can utilize
+ * locality of references much better. This is useful while traversing the
+ * ProcArray because there is an increased likelihood of finding the next
+ * PGPROC structure in the cache.
+ *
+ * Since the occurrence of adding/removing a proc is much lower than the
+ * access to the ProcArray itself, the overhead should be marginal.
+ */
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ /*
+ * If we are the first PGPROC or if we have found our right position in
+ * the array, break
+ */
+ if ((arrayP->pgprocnos[index] == -1) || (arrayP->pgprocnos[index] > proc->pgprocno))
+ break;
+ }
+
+ memmove(&arrayP->pgprocnos[index + 1], &arrayP->pgprocnos[index],
+ (arrayP->numProcs - index) * sizeof (int));
+ arrayP->pgprocnos[index] = proc->pgprocno;
arrayP->numProcs++;
LWLockRelease(ProcArrayLock);
@@ -301,7 +329,7 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
if (TransactionIdIsValid(latestXid))
{
- Assert(TransactionIdIsValid(proc->xid));
+ Assert(TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
/* Advance global latestCompletedXid while holding the lock */
if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,
@@ -311,15 +339,17 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
else
{
/* Shouldn't be trying to remove a live transaction here */
- Assert(!TransactionIdIsValid(proc->xid));
+ Assert(!TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
}
for (index = 0; index < arrayP->numProcs; index++)
{
- if (arrayP->procs[index] == proc)
+ if (arrayP->pgprocnos[index] == proc->pgprocno)
{
- arrayP->procs[index] = arrayP->procs[arrayP->numProcs - 1];
- arrayP->procs[arrayP->numProcs - 1] = NULL; /* for debugging */
+ /* Keep the PGPROC array sorted. See notes above */
+ memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
+ (arrayP->numProcs - index - 1) * sizeof (int));
+ arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
arrayP->numProcs--;
LWLockRelease(ProcArrayLock);
return;
@@ -349,29 +379,31 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
void
ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
{
+ PGXACT *pgxact = &allPgXact[proc->pgprocno];
+
if (TransactionIdIsValid(latestXid))
{
/*
- * We must lock ProcArrayLock while clearing proc->xid, so that we do
- * not exit the set of "running" transactions while someone else is
- * taking a snapshot. See discussion in
+ * We must lock ProcArrayLock while clearing our advertised XID, so
+ * that we do not exit the set of "running" transactions while someone
+ * else is taking a snapshot. See discussion in
* src/backend/access/transam/README.
*/
- Assert(TransactionIdIsValid(proc->xid));
+ Assert(TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- proc->xid = InvalidTransactionId;
+ pgxact->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ pgxact->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ pgxact->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
/* Clear the subtransaction-XID cache too while holding the lock */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ pgxact->nxids = 0;
+ pgxact->overflowed = false;
/* Also advance global latestCompletedXid while holding the lock */
if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,
@@ -387,17 +419,17 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
* anyone else's calculation of a snapshot. We might change their
* estimate of global xmin, but that's OK.
*/
- Assert(!TransactionIdIsValid(proc->xid));
+ Assert(!TransactionIdIsValid(allPgXact[proc->pgprocno].xid));
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ pgxact->xmin = InvalidTransactionId;
/* must be cleared with xid/xmin: */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false; /* be sure this is cleared in abort */
+ pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ pgxact->inCommit = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
- Assert(proc->subxids.nxids == 0);
- Assert(proc->subxids.overflowed == false);
+ Assert(pgxact->nxids == 0);
+ Assert(pgxact->overflowed == false);
}
}
@@ -413,24 +445,26 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
void
ProcArrayClearTransaction(PGPROC *proc)
{
+ PGXACT *pgxact = &allPgXact[proc->pgprocno];
+
/*
* We can skip locking ProcArrayLock here, because this action does not
* actually change anyone's view of the set of running XIDs: our entry is
* duplicate with the gxact that has already been inserted into the
* ProcArray.
*/
- proc->xid = InvalidTransactionId;
+ pgxact->xid = InvalidTransactionId;
proc->lxid = InvalidLocalTransactionId;
- proc->xmin = InvalidTransactionId;
+ pgxact->xmin = InvalidTransactionId;
proc->recoveryConflictPending = false;
/* redundant, but just in case */
- proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
- proc->inCommit = false;
+ pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
+ pgxact->inCommit = false;
/* Clear the subtransaction-XID cache too */
- proc->subxids.nxids = 0;
- proc->subxids.overflowed = false;
+ pgxact->nxids = 0;
+ pgxact->overflowed = false;
}
/*
@@ -811,15 +845,17 @@ TransactionIdIsInProgress(TransactionId xid)
/* No shortcuts, gotta grovel through the array */
for (i = 0; i < arrayP->numProcs; i++)
{
- volatile PGPROC *proc = arrayP->procs[i];
- TransactionId pxid;
+ int pgprocno = arrayP->pgprocnos[i];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Ignore my own proc --- dealt with it above */
if (proc == MyProc)
continue;
/* Fetch xid just once - see GetNewTransactionId */
- pxid = proc->xid;
+ pxid = pgxact->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -844,7 +880,7 @@ TransactionIdIsInProgress(TransactionId xid)
/*
* Step 2: check the cached child-Xids arrays
*/
- for (j = proc->subxids.nxids - 1; j >= 0; j--)
+ for (j = pgxact->nxids - 1; j >= 0; j--)
{
/* Fetch xid just once - see GetNewTransactionId */
TransactionId cxid = proc->subxids.xids[j];
@@ -864,7 +900,7 @@ TransactionIdIsInProgress(TransactionId xid)
* we hold ProcArrayLock. So we can't miss an Xid that we need to
* worry about.)
*/
- if (proc->subxids.overflowed)
+ if (pgxact->overflowed)
xids[nxids++] = pxid;
}
@@ -965,10 +1001,13 @@ TransactionIdIsActive(TransactionId xid)
for (i = 0; i < arrayP->numProcs; i++)
{
- volatile PGPROC *proc = arrayP->procs[i];
+ int pgprocno = arrayP->pgprocnos[i];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = pgxact->xid;
if (!TransactionIdIsValid(pxid))
continue;
@@ -1060,9 +1099,11 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
- if (ignoreVacuum && (proc->vacuumFlags & PROC_IN_VACUUM))
+ if (ignoreVacuum && (pgxact->vacuumFlags & PROC_IN_VACUUM))
continue;
if (allDbs ||
@@ -1070,7 +1111,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
proc->databaseId == 0) /* always include WalSender */
{
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId xid = proc->xid;
+ TransactionId xid = pgxact->xid;
/* First consider the transaction's own Xid, if any */
if (TransactionIdIsNormal(xid) &&
@@ -1084,7 +1125,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
* have an Xmin but not (yet) an Xid; conversely, if it has an
* Xid, that could determine some not-yet-set Xmin.
*/
- xid = proc->xmin; /* Fetch just once */
+ xid = pgxact->xmin; /* Fetch just once */
if (TransactionIdIsNormal(xid) &&
TransactionIdPrecedes(xid, result))
result = xid;
@@ -1261,31 +1302,33 @@ GetSnapshotData(Snapshot snapshot)
if (!snapshot->takenDuringRecovery)
{
+ int *pgprocnos = arrayP->pgprocnos;
+ int numProcs;
+
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
* to gather all active xids, find the lowest xmin, and try to record
- * subxids. During recovery no xids will be assigned, so all normal
- * backends can be ignored, nor are there any VACUUMs running. All
- * prepared transaction xids are held in KnownAssignedXids, so these
- * will be seen without needing to loop through procs here.
+ * subxids.
*/
- for (index = 0; index < arrayP->numProcs; index++)
+ numProcs = arrayP->numProcs;
+ for (index = 0; index < numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
- TransactionId xid;
+ int pgprocno = pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId xid;
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (pgxact->vacuumFlags & PROC_IN_VACUUM)
continue;
/* Update globalxmin to be the smallest valid xmin */
- xid = proc->xmin; /* fetch just once */
+ xid = pgxact->xmin; /* fetch just once */
if (TransactionIdIsNormal(xid) &&
TransactionIdPrecedes(xid, globalxmin))
- globalxmin = xid;
+ globalxmin = xid;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = pgxact->xid;
/*
* If the transaction has been assigned an xid < xmax we add it to
@@ -1300,7 +1343,7 @@ GetSnapshotData(Snapshot snapshot)
{
if (TransactionIdFollowsOrEquals(xid, xmax))
continue;
- if (proc != MyProc)
+ if (pgxact != MyPgXact)
snapshot->xip[count++] = xid;
if (TransactionIdPrecedes(xid, xmin))
xmin = xid;
@@ -1321,16 +1364,17 @@ GetSnapshotData(Snapshot snapshot)
*
* Again, our own XIDs are not included in the snapshot.
*/
- if (!suboverflowed && proc != MyProc)
+ if (!suboverflowed && pgxact != MyPgXact)
{
- if (proc->subxids.overflowed)
+ if (pgxact->overflowed)
suboverflowed = true;
else
{
- int nxids = proc->subxids.nxids;
+ int nxids = pgxact->nxids;
if (nxids > 0)
{
+ volatile PGPROC *proc = &allProcs[pgprocno];
memcpy(snapshot->subxip + subcount,
(void *) proc->subxids.xids,
nxids * sizeof(TransactionId));
@@ -1372,9 +1416,8 @@ GetSnapshotData(Snapshot snapshot)
suboverflowed = true;
}
- if (!TransactionIdIsValid(MyProc->xmin))
- MyProc->xmin = TransactionXmin = xmin;
-
+ if (!TransactionIdIsValid(MyPgXact->xmin))
+ MyPgXact->xmin = TransactionXmin = xmin;
LWLockRelease(ProcArrayLock);
/*
@@ -1436,14 +1479,16 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
- TransactionId xid;
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId xid;
/* Ignore procs running LAZY VACUUM */
- if (proc->vacuumFlags & PROC_IN_VACUUM)
+ if (pgxact->vacuumFlags & PROC_IN_VACUUM)
continue;
- xid = proc->xid; /* fetch just once */
+ xid = pgxact->xid; /* fetch just once */
if (xid != sourcexid)
continue;
@@ -1459,7 +1504,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
/*
* Likewise, let's just make real sure its xmin does cover us.
*/
- xid = proc->xmin; /* fetch just once */
+ xid = pgxact->xmin; /* fetch just once */
if (!TransactionIdIsNormal(xid) ||
!TransactionIdPrecedesOrEquals(xid, xmin))
continue;
@@ -1470,7 +1515,7 @@ ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
* GetSnapshotData first, we'll be overwriting a valid xmin here,
* so we don't check that.)
*/
- MyProc->xmin = TransactionXmin = xmin;
+ MyPgXact->xmin = TransactionXmin = xmin;
result = true;
break;
@@ -1562,12 +1607,14 @@ GetRunningTransactionData(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
TransactionId xid;
int nxids;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = pgxact->xid;
/*
* We don't need to store transactions that don't have a TransactionId
@@ -1585,7 +1632,7 @@ GetRunningTransactionData(void)
* Save subtransaction XIDs. Other backends can't add or remove
* entries while we're holding XidGenLock.
*/
- nxids = proc->subxids.nxids;
+ nxids = pgxact->nxids;
if (nxids > 0)
{
memcpy(&xids[count], (void *) proc->subxids.xids,
@@ -1593,7 +1640,7 @@ GetRunningTransactionData(void)
count += nxids;
subcount += nxids;
- if (proc->subxids.overflowed)
+ if (pgxact->overflowed)
suboverflowed = true;
/*
@@ -1653,11 +1700,12 @@ GetOldestActiveTransactionId(void)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
TransactionId xid;
/* Fetch xid just once - see GetNewTransactionId */
- xid = proc->xid;
+ xid = pgxact->xid;
if (!TransactionIdIsNormal(xid))
continue;
@@ -1709,12 +1757,14 @@ GetTransactionsInCommit(TransactionId **xids_p)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = pgxact->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (pgxact->inCommit && TransactionIdIsValid(pxid))
xids[nxids++] = pxid;
}
@@ -1744,12 +1794,14 @@ HaveTransactionsInCommit(TransactionId *xids, int nxids)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+ TransactionId pxid;
/* Fetch xid just once - see GetNewTransactionId */
- TransactionId pxid = proc->xid;
+ pxid = pgxact->xid;
- if (proc->inCommit && TransactionIdIsValid(pxid))
+ if (pgxact->inCommit && TransactionIdIsValid(pxid))
{
int i;
@@ -1792,7 +1844,7 @@ BackendPidGetProc(int pid)
for (index = 0; index < arrayP->numProcs; index++)
{
- PGPROC *proc = arrayP->procs[index];
+ PGPROC *proc = &allProcs[arrayP->pgprocnos[index]];
if (proc->pid == pid)
{
@@ -1833,9 +1885,11 @@ BackendXidGetPid(TransactionId xid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
- if (proc->xid == xid)
+ if (pgxact->xid == xid)
{
result = proc->pid;
break;
@@ -1901,18 +1955,20 @@ GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0,
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
if (proc == MyProc)
continue;
- if (excludeVacuum & proc->vacuumFlags)
+ if (excludeVacuum & pgxact->vacuumFlags)
continue;
if (allDbs || proc->databaseId == MyDatabaseId)
{
/* Fetch xmin just once - might change on us */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = pgxact->xmin;
if (excludeXmin0 && !TransactionIdIsValid(pxmin))
continue;
@@ -1996,7 +2052,9 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
/* Exclude prepared transactions */
if (proc->pid == 0)
@@ -2006,7 +2064,7 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid)
proc->databaseId == dbOid)
{
/* Fetch xmin just once - can't change on us, but good coding */
- TransactionId pxmin = proc->xmin;
+ TransactionId pxmin = pgxact->xmin;
/*
* We ignore an invalid pxmin because this means that backend has
@@ -2050,8 +2108,9 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
for (index = 0; index < arrayP->numProcs; index++)
{
- VirtualTransactionId procvxid;
- PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
GET_VXID_FROM_PGPROC(procvxid, *proc);
@@ -2104,7 +2163,9 @@ MinimumActiveBackends(int min)
*/
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
/*
* Since we're not holding a lock, need to check that the pointer is
@@ -2122,10 +2183,10 @@ MinimumActiveBackends(int min)
if (proc == MyProc)
continue; /* do not count myself */
+ if (pgxact->xid == InvalidTransactionId)
+ continue; /* do not count if no XID assigned */
if (proc->pid == 0)
continue; /* do not count prepared xacts */
- if (proc->xid == InvalidTransactionId)
- continue; /* do not count if no XID assigned */
if (proc->waitLock != NULL)
continue; /* do not count if blocked on a lock */
count++;
@@ -2150,7 +2211,8 @@ CountDBBackends(Oid databaseid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (proc->pid == 0)
continue; /* do not count prepared xacts */
@@ -2179,7 +2241,8 @@ CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conflictPending)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (databaseid == InvalidOid || proc->databaseId == databaseid)
{
@@ -2217,7 +2280,8 @@ CountUserBackends(Oid roleid)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
if (proc->pid == 0)
continue; /* do not count prepared xacts */
@@ -2277,7 +2341,9 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
for (index = 0; index < arrayP->numProcs; index++)
{
- volatile PGPROC *proc = arrayP->procs[index];
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
if (proc->databaseId != databaseId)
continue;
@@ -2291,7 +2357,7 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
else
{
(*nbackends)++;
- if ((proc->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ if ((pgxact->vacuumFlags & PROC_IS_AUTOVACUUM) &&
nautovacs < MAXAUTOVACPIDS)
autovac_pids[nautovacs++] = proc->pid;
}
@@ -2321,8 +2387,8 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
#define XidCacheRemove(i) \
do { \
- MyProc->subxids.xids[i] = MyProc->subxids.xids[MyProc->subxids.nxids - 1]; \
- MyProc->subxids.nxids--; \
+ MyProc->subxids.xids[i] = MyProc->subxids.xids[MyPgXact->nxids - 1]; \
+ MyPgXact->nxids--; \
} while (0)
/*
@@ -2361,7 +2427,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
{
TransactionId anxid = xids[i];
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyPgXact->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], anxid))
{
@@ -2377,11 +2443,11 @@ XidCacheRemoveRunningXids(TransactionId xid,
* error during AbortSubTransaction. So instead of Assert, emit a
* debug warning.
*/
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyPgXact->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", anxid);
}
- for (j = MyProc->subxids.nxids - 1; j >= 0; j--)
+ for (j = MyPgXact->nxids - 1; j >= 0; j--)
{
if (TransactionIdEquals(MyProc->subxids.xids[j], xid))
{
@@ -2390,7 +2456,7 @@ XidCacheRemoveRunningXids(TransactionId xid,
}
}
/* Ordinarily we should have found it, unless the cache has overflowed */
- if (j < 0 && !MyProc->subxids.overflowed)
+ if (j < 0 && !MyPgXact->overflowed)
elog(WARNING, "did not find subXID %u in MyProc", xid);
/* Also advance global latestCompletedXid while holding the lock */
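The ProcArrayAdd and ProcArrayRemove hunks earlier in this file keep `pgprocnos[]` in ascending order by shifting the tail with `memmove`. A standalone sketch of that technique, assuming a hypothetical fixed-capacity `SlotArray` in place of ProcArrayStruct and with all locking omitted:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for ProcArrayStruct: a sorted array of slot numbers. */
typedef struct SlotArray
{
    int numProcs;       /* entries currently in use */
    int maxProcs;       /* capacity of pgprocnos[] */
    int pgprocnos[8];   /* fixed size only to keep the sketch small */
} SlotArray;

/* Insert pgprocno in ascending order; returns 0 on success, -1 if full. */
static int
slot_array_add(SlotArray *a, int pgprocno)
{
    int index;

    if (a->numProcs >= a->maxProcs)
        return -1;

    /* Find the first entry larger than the new one. */
    for (index = 0; index < a->numProcs; index++)
    {
        if (a->pgprocnos[index] > pgprocno)
            break;
    }
    assert(index <= a->numProcs);

    /* Shift the tail up by one slot, then drop the new entry in place. */
    memmove(&a->pgprocnos[index + 1], &a->pgprocnos[index],
            (a->numProcs - index) * sizeof(int));
    a->pgprocnos[index] = pgprocno;
    a->numProcs++;
    return 0;
}

/* Remove pgprocno, closing the gap; returns 0 on success, -1 if absent. */
static int
slot_array_remove(SlotArray *a, int pgprocno)
{
    for (int index = 0; index < a->numProcs; index++)
    {
        if (a->pgprocnos[index] == pgprocno)
        {
            /* Shift the tail down by one slot to keep the order intact. */
            memmove(&a->pgprocnos[index], &a->pgprocnos[index + 1],
                    (a->numProcs - index - 1) * sizeof(int));
            a->pgprocnos[a->numProcs - 1] = -1;     /* for debugging */
            a->numProcs--;
            return 0;
        }
    }
    return -1;
}
```

Both operations are linear in the number of entries, which is the trade-off the patch comment accepts: adds and removes are rare compared with traversals of the array.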
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 7e7f6af..63326b8 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -450,6 +450,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
int *nSoftEdges) /* output argument */
{
PGPROC *proc;
+ PGXACT *pgxact;
LOCK *lock;
PROCLOCK *proclock;
SHM_QUEUE *procLocks;
@@ -516,6 +517,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
while (proclock)
{
proc = proclock->tag.myProc;
+ pgxact = &ProcGlobal->allPgXact[proc->pgprocno];
/* A proc never blocks itself */
if (proc != checkProc)
@@ -541,7 +543,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
* vacuumFlag bit), but we don't do that here to avoid
* grabbing ProcArrayLock.
*/
- if (proc->vacuumFlags & PROC_IS_AUTOVACUUM)
+ if (pgxact->vacuumFlags & PROC_IS_AUTOVACUUM)
blocking_autovacuum_proc = proc;
/* This proc hard-blocks checkProc */
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 905502f..3ba4671 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3188,9 +3188,10 @@ GetRunningTransactionLocks(int *nlocks)
proclock->tag.myLock->tag.locktag_type == LOCKTAG_RELATION)
{
PGPROC *proc = proclock->tag.myProc;
+ PGXACT *pgxact = &ProcGlobal->allPgXact[proc->pgprocno];
LOCK *lock = proclock->tag.myLock;
- accessExclusiveLocks[index].xid = proc->xid;
+ accessExclusiveLocks[index].xid = pgxact->xid;
accessExclusiveLocks[index].dbOid = lock->tag.locktag_field1;
accessExclusiveLocks[index].relOid = lock->tag.locktag_field2;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index eda3a98..bcbc802 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -36,6 +36,7 @@
#include <sys/time.h>
#include "access/transam.h"
+#include "access/twophase.h"
#include "access/xact.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
@@ -57,6 +58,7 @@ bool log_lock_waits = false;
/* Pointer to this process's PGPROC struct, if any */
PGPROC *MyProc = NULL;
+PGXACT *MyPgXact = NULL;
/*
* This spinlock protects the freelist of recycled PGPROC structures.
@@ -70,6 +72,7 @@ NON_EXEC_STATIC slock_t *ProcStructLock = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
+PGPROC *PreparedXactProcs = NULL;
/* If we are waiting for a lock, this points to the associated LOCALLOCK */
static LOCALLOCK *lockAwaited = NULL;
@@ -106,13 +109,19 @@ ProcGlobalShmemSize(void)
/* ProcGlobal */
size = add_size(size, sizeof(PROC_HDR));
- /* AuxiliaryProcs */
- size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
/* MyProcs, including autovacuum workers and launcher */
size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
+ /* AuxiliaryProcs */
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
+ /* Prepared xacts */
+ size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGPROC)));
/* ProcStructLock */
size = add_size(size, sizeof(slock_t));
+ size = add_size(size, mul_size(MaxBackends, sizeof(PGXACT)));
+ size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGXACT)));
+ size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGXACT)));
+
return size;
}
@@ -157,10 +166,11 @@ void
InitProcGlobal(void)
{
PGPROC *procs;
+ PGXACT *pgxacts;
int i,
j;
bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS;
+ uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Create the ProcGlobal shared structure */
ProcGlobal = (PROC_HDR *)
@@ -182,10 +192,11 @@ InitProcGlobal(void)
* those used for 2PC, which are embedded within a GlobalTransactionData
* struct).
*
- * There are three separate consumers of PGPROC structures: (1) normal
- * backends, (2) autovacuum workers and the autovacuum launcher, and (3)
- * auxiliary processes. Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
+ * There are four separate consumers of PGPROC structures: (1) normal
+ * backends, (2) autovacuum workers and the autovacuum launcher, (3)
+ * auxiliary processes, and (4) prepared transactions. Each PGPROC
+ * structure is dedicated to exactly one of these purposes, and they do
+ * not move between groups.
*/
procs = (PGPROC *) ShmemAlloc(TotalProcs * sizeof(PGPROC));
ProcGlobal->allProcs = procs;
@@ -195,21 +206,43 @@ InitProcGlobal(void)
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of shared memory")));
MemSet(procs, 0, TotalProcs * sizeof(PGPROC));
+
+ /*
+ * Also allocate a separate array of PGXACT structures. This is separate
+ * from the main PGPROC array so that the most heavily accessed data is
+ * stored contiguously in memory in as few cache lines as possible. This
+ * provides significant performance benefits, especially on a
+ * multiprocessor system. There is one PGXACT structure for every PGPROC
+ * structure.
+ */
+ pgxacts = (PGXACT *) ShmemAlloc(TotalProcs * sizeof(PGXACT));
+ MemSet(pgxacts, 0, TotalProcs * sizeof(PGXACT));
+ ProcGlobal->allPgXact = pgxacts;
+
for (i = 0; i < TotalProcs; i++)
{
/* Common initialization for all PGPROCs, regardless of type. */
- /* Set up per-PGPROC semaphore, latch, and backendLock */
- PGSemaphoreCreate(&(procs[i].sem));
- InitSharedLatch(&(procs[i].procLatch));
- procs[i].backendLock = LWLockAssign();
+ /*
+ * Set up per-PGPROC semaphore, latch, and backendLock. Prepared
+ * xact dummy PGPROCs don't need these though - they're never
+ * associated with a real process
+ */
+ if (i < MaxBackends + NUM_AUXILIARY_PROCS)
+ {
+ PGSemaphoreCreate(&(procs[i].sem));
+ InitSharedLatch(&(procs[i].procLatch));
+ procs[i].backendLock = LWLockAssign();
+ }
+ procs[i].pgprocno = i;
/*
* Newly created PGPROCs for normal backends or for autovacuum must
* be queued up on the appropriate free list. Because there can only
* ever be a small, fixed number of auxiliary processes, no free
* list is used in that case; InitAuxiliaryProcess() instead uses a
- * linear search.
+ * linear search. PGPROCs for prepared transactions are added to a
+ * free list by TwoPhaseShmemInit().
*/
if (i < MaxConnections)
{
@@ -230,10 +263,11 @@ InitProcGlobal(void)
}
/*
- * Save a pointer to the block of PGPROC structures reserved for
- * auxiliary proceses.
+ * Save pointers to the blocks of PGPROC structures reserved for
+ * auxiliary processes and prepared transactions.
*/
AuxiliaryProcs = &procs[MaxBackends];
+ PreparedXactProcs = &procs[MaxBackends + NUM_AUXILIARY_PROCS];
/* Create ProcStructLock spinlock, too */
ProcStructLock = (slock_t *) ShmemAlloc(sizeof(slock_t));
@@ -296,6 +330,7 @@ InitProcess(void)
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
errmsg("sorry, too many clients already")));
}
+ MyPgXact = &ProcGlobal->allPgXact[MyProc->pgprocno];
/*
* Now that we have a PGPROC, mark ourselves as an active postmaster
@@ -313,18 +348,18 @@ InitProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyPgXact->xid = InvalidTransactionId;
+ MyPgXact->xmin = InvalidTransactionId;
MyProc->pid = MyProcPid;
/* backendId, databaseId and roleId will be filled in later */
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyPgXact->inCommit = false;
+ MyPgXact->vacuumFlags = 0;
/* NB -- autovac launcher intentionally does not set IS_AUTOVACUUM */
if (IsAutoVacuumWorkerProcess())
- MyProc->vacuumFlags |= PROC_IS_AUTOVACUUM;
+ MyPgXact->vacuumFlags |= PROC_IS_AUTOVACUUM;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -462,6 +497,7 @@ InitAuxiliaryProcess(void)
((volatile PGPROC *) auxproc)->pid = MyProcPid;
MyProc = auxproc;
+ MyPgXact = &ProcGlobal->allPgXact[auxproc->pgprocno];
SpinLockRelease(ProcStructLock);
@@ -472,13 +508,13 @@ InitAuxiliaryProcess(void)
SHMQueueElemInit(&(MyProc->links));
MyProc->waitStatus = STATUS_OK;
MyProc->lxid = InvalidLocalTransactionId;
- MyProc->xid = InvalidTransactionId;
- MyProc->xmin = InvalidTransactionId;
+ MyPgXact->xid = InvalidTransactionId;
+ MyPgXact->xmin = InvalidTransactionId;
MyProc->backendId = InvalidBackendId;
MyProc->databaseId = InvalidOid;
MyProc->roleId = InvalidOid;
- MyProc->inCommit = false;
- MyProc->vacuumFlags = 0;
+ MyPgXact->inCommit = false;
+ MyPgXact->vacuumFlags = 0;
MyProc->lwWaiting = false;
MyProc->lwExclusive = false;
MyProc->lwWaitLink = NULL;
@@ -1045,6 +1081,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
if (deadlock_state == DS_BLOCKED_BY_AUTOVACUUM && allow_autovacuum_cancel)
{
PGPROC *autovac = GetBlockingAutoVacuumPgproc();
+ PGXACT *autovac_pgxact = &ProcGlobal->allPgXact[autovac->pgprocno];
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
@@ -1053,8 +1090,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
* wraparound.
*/
if ((autovac != NULL) &&
- (autovac->vacuumFlags & PROC_IS_AUTOVACUUM) &&
- !(autovac->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
+ (autovac_pgxact->vacuumFlags & PROC_IS_AUTOVACUUM) &&
+ !(autovac_pgxact->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
{
int pid = autovac->pid;
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 50fb780..814cd23 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -577,7 +577,7 @@ static void
SnapshotResetXmin(void)
{
if (RegisteredSnapshots == 0 && ActiveSnapshot == NULL)
- MyProc->xmin = InvalidTransactionId;
+ MyPgXact->xmin = InvalidTransactionId;
}
/*
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 6e798b1..c7cddc7 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -35,8 +35,6 @@
struct XidCache
{
- bool overflowed;
- int nxids;
TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS];
};
@@ -86,27 +84,14 @@ struct PGPROC
LocalTransactionId lxid; /* local id of top-level transaction currently
* being executed by this proc, if running;
* else InvalidLocalTransactionId */
-
- TransactionId xid; /* id of top-level transaction currently being
- * executed by this proc, if running and XID
- * is assigned; else InvalidTransactionId */
-
- TransactionId xmin; /* minimal running XID as it was when we were
- * starting our xact, excluding LAZY VACUUM:
- * vacuum must not remove tuples deleted by
- * xid >= xmin ! */
-
int pid; /* Backend's process ID; 0 if prepared xact */
+ int pgprocno;
/* These fields are zero while a backend is still starting up: */
BackendId backendId; /* This backend's backend ID (if assigned) */
Oid databaseId; /* OID of database this backend is using */
Oid roleId; /* OID of role using this backend */
- bool inCommit; /* true if within commit critical section */
-
- uint8 vacuumFlags; /* vacuum-related flags, see above */
-
/*
* While in hot standby mode, shows that a conflict signal has been sent
* for the current transaction. Set/cleared while holding ProcArrayLock,
@@ -160,7 +145,33 @@ struct PGPROC
extern PGDLLIMPORT PGPROC *MyProc;
+extern PGDLLIMPORT struct PGXACT *MyPgXact;
+
+/*
+ * Prior to PostgreSQL 9.2, the fields below were stored as part of the
+ * PGPROC. However, benchmarking revealed that packing these particular
+ * members into a separate array as tightly as possible sped up GetSnapshotData
+ * considerably on systems with many CPU cores, by reducing the number of
+ * cache lines needing to be fetched. Thus, think very carefully before adding
+ * anything else here.
+ */
+typedef struct PGXACT
+{
+ TransactionId xid; /* id of top-level transaction currently being
+ * executed by this proc, if running and XID
+ * is assigned; else InvalidTransactionId */
+
+ TransactionId xmin; /* minimal running XID as it was when we were
+ * starting our xact, excluding LAZY VACUUM:
+ * vacuum must not remove tuples deleted by
+ * xid >= xmin ! */
+
+ uint8 vacuumFlags; /* vacuum-related flags, see above */
+ bool overflowed;
+ bool inCommit; /* true if within commit critical section */
+ uint8 nxids;
+} PGXACT;
/*
* There is one ProcGlobal struct for the whole database cluster.
@@ -169,6 +180,8 @@ typedef struct PROC_HDR
{
/* Array of PGPROC structures (not including dummies for prepared txns) */
PGPROC *allProcs;
+ /* Array of PGXACT structures (not including dummies for prepared txns) */
+ PGXACT *allPgXact;
/* Length of allProcs array */
uint32 allProcCount;
/* Head of list of free PGPROC structures */
@@ -186,6 +199,8 @@ typedef struct PROC_HDR
extern PROC_HDR *ProcGlobal;
+extern PGPROC *PreparedXactProcs;
+
/*
* We set aside some extra PGPROC structures for auxiliary processes,
* ie things that aren't full-fledged backends but need shmem access.
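As a reading aid for the layout argument in the PGXACT comment above, here is a minimal standalone sketch; it is not part of the patch. `PackedXact`, `ASSUMED_CACHE_LINE_BYTES`, `entries_per_cache_line` and `xact_for` are hypothetical names, the 64-byte cache line is an assumption about the hardware, and the bool members are modeled as `uint8_t` so the size is predictable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in mirroring the members of PGXACT in the patch. */
typedef struct PackedXact
{
    TransactionId xid;          /* top-level XID, if assigned */
    TransactionId xmin;         /* xmin advertised by this proc */
    uint8_t       vacuumFlags;  /* vacuum-related flags */
    uint8_t       overflowed;   /* bool in the patch; one byte here */
    uint8_t       inCommit;     /* bool in the patch; one byte here */
    uint8_t       nxids;        /* number of cached subxids */
} PackedXact;

/* Assumption: 64-byte cache lines; the real value depends on the CPU. */
#define ASSUMED_CACHE_LINE_BYTES 64

/* How many whole entries of a given size fit in one assumed cache line. */
static inline size_t
entries_per_cache_line(size_t entry_size)
{
    assert(entry_size > 0);
    return ASSUMED_CACHE_LINE_BYTES / entry_size;
}

/*
 * Index-based lookup in the side array, in the spirit of
 * &ProcGlobal->allPgXact[proc->pgprocno] from the patch.
 */
static inline PackedXact *
xact_for(PackedXact *all, int pgprocno)
{
    assert(all != NULL && pgprocno >= 0);
    return &all[pgprocno];
}
```

Under these assumptions the packed entry is 12 bytes, so five of them share one cache line, whereas a much larger per-process structure would need at least one line each.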
Attachment: recentglobalxmin.patch (application/octet-stream)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 19ff524..6986c57 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1241,6 +1241,8 @@ GetSnapshotData(Snapshot snapshot)
int count = 0;
int subcount = 0;
bool suboverflowed = false;
+ static TransactionId *xmins = NULL;
+ int numProcs;
Assert(snapshot != NULL);
@@ -1276,6 +1278,15 @@ GetSnapshotData(Snapshot snapshot)
errmsg("out of memory")));
}
+ if (xmins == NULL)
+ {
+ xmins = malloc(procArray->maxProcs * sizeof(TransactionId));
+ if (xmins == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+ }
+
/*
* It is sufficient to get shared lock on ProcArrayLock, even if we are
* going to set MyProc->xmin.
@@ -1303,7 +1314,6 @@ GetSnapshotData(Snapshot snapshot)
if (!snapshot->takenDuringRecovery)
{
int *pgprocnos = arrayP->pgprocnos;
- int numProcs;
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
@@ -1319,13 +1329,13 @@ GetSnapshotData(Snapshot snapshot)
/* Ignore procs running LAZY VACUUM */
if (pgxact->vacuumFlags & PROC_IN_VACUUM)
+ {
+ xmins[index] = InvalidTransactionId;
continue;
+ }
/* Update globalxmin to be the smallest valid xmin */
- xid = pgxact->xmin; /* fetch just once */
- if (TransactionIdIsNormal(xid) &&
- TransactionIdPrecedes(xid, globalxmin))
- globalxmin = xid;
+ xmins[index] = pgxact->xmin; /* fetch just once */
/* Fetch xid just once - see GetNewTransactionId */
xid = pgxact->xid;
@@ -1386,6 +1396,7 @@ GetSnapshotData(Snapshot snapshot)
}
else
{
+ numProcs = 0;
/*
* We're in hot standby, so get XIDs from KnownAssignedXids.
*
@@ -1425,6 +1436,14 @@ GetSnapshotData(Snapshot snapshot)
* different way of computing it than GetOldestXmin uses, but should give
* the same result.
*/
+ for (index = 0; index < numProcs; index++)
+ {
+ TransactionId xid = xmins[index];
+ if (TransactionIdIsNormal(xid) &&
+ TransactionIdPrecedes(xid, globalxmin))
+ globalxmin = xid;
+ }
+
if (TransactionIdPrecedes(xmin, globalxmin))
globalxmin = xmin;
On Thu, Nov 24, 2011 at 1:30 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I'm going to run some more tests, but my thought is that we should
probably leave the recentglobalxmin changes out for the time being,
pending further study and consideration of other alternatives.
Agreed
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 24, 2011 at 7:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, Nov 24, 2011 at 1:30 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I'm going to run some more tests, but my thought is that we should
probably leave the recentglobalxmin changes out for the time being,
pending further study and consideration of other alternatives.
Agreed
+1. These are independent patches and should be pursued like that.
BTW, I reviewed the pgxact-v2.patch and I have no objections to that
and it looks good to go in. Thanks Robert for making the necessary
changes and also running the benchmark tests.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
On Thu, Nov 24, 2011 at 11:54 PM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
+1. These are independent patches and should be pursued like that.
BTW, I reviewed the pgxact-v2.patch and I have no objections to that
and it looks good to go in. Thanks Robert for making the necessary
changes and also running the benchmark tests.
I ran some write tests with master, pgxact-v2, and
pgxact-v2+recentglobalxmin. shared_buffers = 8GB,
maintenance_work_mem = 1GB, synchronous_commit = off,
checkpoint_segments = 300, checkpoint_timeout = 15min,
checkpoint_completion_target = 0.9, wal_writer_delay = 20ms. Seven
five-minute runs at scale factor 100 for each configuration.
Here's the executive summary: On the read-only test,
recentglobalxmin.patch was only a win at the very highest concurrency
I tested (80 clients); on the read-write test, it starts to show a
benefit at much lower concurrencies (32 clients, certainly, perhaps
even 8 clients, on unlogged tables). However, pgxact-v2.patch seems
to be a win on both read and write tests and at any concurrency level,
including the single-client case.
== 1 client, unlogged tables ==
master: low 671.861618 median 677.324867 high 765.824313 (but the
second highest was only 679.491822)
pgxact-v2: low 663.901614 median 689.496716 high 696.812065
pgxact-v2+recentglobalxmin: low 665.554342 median 685.401979 high 688.832906
== 8 clients, unlogged tables ==
master: low 4722.011063 median 4758.201239 high 4920.130891
pgxact-v2: low 4684.759859 median 4840.081663 high 4979.036845
pgxact-v2+recentglobalxmin: low 4723.743270 median 4856.513904 high 4997.528399
== 32 clients, unlogged tables ==
master: low 10878.959662 median 10901.523672 high 10934.699151
pgxact-v2: low 17944.914228 median 18060.058996 high 19281.541088
pgxact-v2+recentglobalxmin: low 18894.860512 median 19637.190567 high 19817.089456
== 80 clients, unlogged tables ==
master: low 7872.934292 median 7897.811216 high 7909.410723
pgxact-v2: low 12032.684380 median 12397.316995 high 13279.998414
pgxact-v2+recentglobalxmin: low 16964.227483 median 17801.478747 high 18107.646170
== 1 client, permanent tables ==
master: low 625.502929 median 628.442284 high 677.451660
pgxact-v2: low 636.755782 median 640.083573 high 645.273888
pgxact-v2+recentglobalxmin: low 633.320412 median 636.898945 high 637.886099
== 8 clients, permanent tables ==
master: low 4497.012143 median 4624.844801 high 4849.233268
pgxact-v2: low 4561.914897 median 4625.443111 high 4776.095552
pgxact-v2+recentglobalxmin: low 4469.742226 median 4789.249847 high 4824.033794
== 32 clients, permanent tables ==
master: low 10468.362239 median 10511.425102 high 10531.069684
pgxact-v2: low 12821.732396 median 14500.067726 high 14546.538281
pgxact-v2+recentglobalxmin: low 14907.122746 median 15129.665408 high 15186.743199
== 80 clients, permanent tables ==
master: low 7601.067552 median 7612.898321 high 7631.487355
pgxact-v2: low 11712.895410 median 12004.807309 high 12512.078569
pgxact-v2+recentglobalxmin: low 15186.695057 median 15810.452158 high 16166.272699
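To make the medians above easier to compare, relative gain over master can be computed as (patched - master) / master. A minimal sketch follows; `pct_gain` is a hypothetical helper name, not something from the patches or from pgbench.

```c
#include <assert.h>

/*
 * Percentage change of a patched tps figure relative to a baseline.
 * Hypothetical helper for reading the tables; the baseline must be positive.
 */
static inline double
pct_gain(double baseline_tps, double patched_tps)
{
    assert(baseline_tps > 0.0);
    return (patched_tps - baseline_tps) / baseline_tps * 100.0;
}
```

For example, feeding in the 80-client unlogged medians for master (7897.811216) and pgxact-v2 (12397.316995) gives a gain of roughly 57%.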
I don't see much reason to wait around any further on the core patch
(pgxact-v2.patch) so I'll commit that now.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Nov 7, 2011 at 7:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
While looking at GetSnapshotData(), it occurred to me that when a PGPROC
entry does not have its xid set, ie. xid == InvalidTransactionId, it's
pointless to check the subxid array for that proc. If a transaction has no
xid, none of its subtransactions can have an xid either. That's a trivial
optimization, see attached. I tested this with "pgbench -S -M prepared -c
500" on the 8-core box, and it seemed to make a 1-2% difference (without the
other patch). So, almost in the noise, but it also doesn't cost us anything
in terms of readability or otherwise.
Oh, that's a good idea. +1 for doing that now, and then we can keep
working on the rest of it.
I spent some time looking at (and benchmarking) this idea today. I
rearranged that section of code a little more into what seemed like
the optimal order to avoid doing more work than necessary, but I
wasn't getting much of a gain out of it even on unlogged tables and on
permanent tables it was looking like a small loss, though I didn't
bother letting the benchmarks run for long enough to nail that down
because it didn't seem to amount to much one way or the other. I
then added one more change that helped quite a lot: I introduced
a macro NormalTransactionIdPrecedes() which works like
TransactionIdPrecedes() but (a) is only guaranteed to work if the
arguments are known to be normal transaction IDs (which it also
asserts, for safety) and (b) is a macro rather than a function call.
I found three places to use that inside GetSnapshotData(), and the
results with that change look fairly promising.
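For readers who want to try the comparison outside the tree, here is a minimal standalone sketch of such a macro. It assumes `TransactionId` is a 32-bit unsigned integer and that normal XIDs start at 3, and it substitutes the standard `assert()` for the backend's AssertMacro(); it illustrates the modulo-2^32 comparison and is not the patch itself.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumption: XIDs below 3 are reserved as special (invalid/bootstrap/frozen). */
#define FirstNormalTransactionId ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/*
 * Compare two XIDs already known to be normal, using modulo-2^32
 * arithmetic: id1 precedes id2 when the signed 32-bit difference is
 * negative.  Being a macro, it evaluates its arguments more than once,
 * so callers should not pass expressions with side effects.
 */
#define NormalTransactionIdPrecedes(id1, id2) \
    (assert(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \
     (int32_t) ((id1) - (id2)) < 0)
```

Because the difference is interpreted as signed, an XID just below the 32-bit wraparound point still compares as preceding a small XID allocated after the wrap.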
Nate Boley's box, m = master, s = with the attached patch, median of
three 5-minute runs at scale factor 100, config same as my other
recent tests:
Permanent Tables
m01 tps = 617.734793 (including connections establishing)
s01 tps = 620.330099 (including connections establishing)
m08 tps = 4589.566969 (including connections establishing)
s08 tps = 4545.942588 (including connections establishing)
m16 tps = 7618.842377 (including connections establishing)
s16 tps = 7596.759619 (including connections establishing)
m24 tps = 11770.534809 (including connections establishing)
s24 tps = 11789.424152 (including connections establishing)
m32 tps = 10776.660575 (including connections establishing)
s32 tps = 10643.361817 (including connections establishing)
m80 tps = 11057.353339 (including connections establishing)
s80 tps = 10598.254713 (including connections establishing)
Unlogged Tables
m01 tps = 668.145495 (including connections establishing)
s01 tps = 676.793337 (including connections establishing)
m08 tps = 4715.214745 (including connections establishing)
s08 tps = 4737.833913 (including connections establishing)
m16 tps = 8110.607784 (including connections establishing)
s16 tps = 8192.013414 (including connections establishing)
m24 tps = 14120.753841 (including connections establishing)
s24 tps = 14302.915040 (including connections establishing)
m32 tps = 17886.032656 (including connections establishing)
s32 tps = 18735.319468 (including connections establishing)
m80 tps = 12711.930739 (including connections establishing)
s80 tps = 17592.715471 (including connections establishing)
Now, you might not initially find those numbers all that encouraging.
Of course, the unlogged tables numbers are quite good, especially at
32 and 80 clients, where the gains are quite marked. But the
permanent table numbers are at best a wash, and maybe a small loss.
Interestingly enough, some recent benchmarking of the FlexLocks patch
showed a similar (though more pronounced) effect:
http://archives.postgresql.org/pgsql-hackers/2011-12/msg00095.php
Now, both the FlexLocks patch and this patch aim to mitigate
contention problems around ProcArrayLock. But they do it in
completely different ways. When I got a small regression on permanent
tables with the FlexLocks patch, I thought the problem was with the
patch itself, which is believable, since it was tinkering with the
LWLock machinery, a fairly global sort of change that you can well
imagine might break something. But it's hard to imagine how that
could possibly be the case here, especially given the speedups on
unlogged tables, because this patch is dead simple and entirely
isolated to GetSnapshotData(). So I have a new theory: on permanent
tables, *anything* that reduces ProcArrayLock contention causes an
approximately equal increase in WALInsertLock contention (or maybe
some other lock), and in some cases that increase in contention
elsewhere can cost more than the amount we're saving here.
On that theory, I'm inclined to think that's not really a problem.
We'll go nuts if we refuse to commit anything until it shows a
meaningful win on every imaginable workload, and it seems like this
can't really be worse than the status quo; any case where it is must
be some kind of artifact. We're better off getting rid of as much
ProcArrayLock contention as possible, rather than keeping it around
because there are corner cases where it decreases contention on some
other lock.
One small detail I'm noticing on further review is that I've slightly
changed the semantics here:
if (!TransactionIdIsNormal(xid)
|| NormalTransactionIdPrecedes(xmax, xid))
continue;
That really ought to be testing <=, not just <. That doesn't seem
like it should affect correctness, though: at worst, we unnecessarily
include one or more XIDs in the snapshot that will be ignored anyway.
I'll fix that and rerun the tests but I don't think it'll make any
difference.
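The difference between the two tests shows up only when xid equals xmax. A minimal standalone sketch of both conditions follows; `skip_as_posted` and `skip_intended` are hypothetical names, and the type and constant definitions are simplified stand-ins rather than the real headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumption: XIDs below 3 are special, as in the real transam.h. */
#define FirstNormalTransactionId ((TransactionId) 3)

static inline bool
xid_is_normal(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}

/* Modulo-2^32 "precedes" for two XIDs known to be normal. */
static inline bool
normal_xid_precedes(TransactionId id1, TransactionId id2)
{
    assert(xid_is_normal(id1) && xid_is_normal(id2));
    return (int32_t) (id1 - id2) < 0;
}

/* Test as first posted: skips only when xmax < xid. */
static inline bool
skip_as_posted(TransactionId xid, TransactionId xmax)
{
    return !xid_is_normal(xid) || normal_xid_precedes(xmax, xid);
}

/* Intended test: skips when xid >= xmax, i.e. xmax <= xid. */
static inline bool
skip_intended(TransactionId xid, TransactionId xmax)
{
    return !xid_is_normal(xid) || !normal_xid_precedes(xid, xmax);
}
```

With xid == xmax the first form does not skip the entry, so the XID would be stored in the snapshot even though it is treated as running anyway; the second form skips it. For every other input the two agree.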
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
getsnapshotdata-tweaks.patch (application/octet-stream)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 19ff524..fa98660 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1324,30 +1324,33 @@ GetSnapshotData(Snapshot snapshot)
/* Update globalxmin to be the smallest valid xmin */
xid = pgxact->xmin; /* fetch just once */
if (TransactionIdIsNormal(xid) &&
- TransactionIdPrecedes(xid, globalxmin))
+ NormalTransactionIdPrecedes(xid, globalxmin))
globalxmin = xid;
/* Fetch xid just once - see GetNewTransactionId */
xid = pgxact->xid;
/*
- * If the transaction has been assigned an xid < xmax we add it to
- * the snapshot, and update xmin if necessary. There's no need to
- * store XIDs >= xmax, since we'll treat them as running anyway.
- * We don't bother to examine their subxids either.
- *
- * We don't include our own XID (if any) in the snapshot, but we
- * must include it into xmin.
+ * If the transaction has no XID assigned, we can skip it; it won't
+ * have sub-XIDs either. If the XID is >= xmax, we can also skip
+ * it; such transactions will be treated as running anyway (and any
+ * sub-XIDs will also be >= xmax).
*/
- if (TransactionIdIsNormal(xid))
- {
- if (TransactionIdFollowsOrEquals(xid, xmax))
+ if (!TransactionIdIsNormal(xid)
+ || NormalTransactionIdPrecedes(xmax, xid))
continue;
- if (pgxact != MyPgXact)
- snapshot->xip[count++] = xid;
- if (TransactionIdPrecedes(xid, xmin))
- xmin = xid;
- }
+
+ /*
+ * We don't include our own XIDs (if any) in the snapshot, but we
+ * must include them in xmin.
+ */
+ if (NormalTransactionIdPrecedes(xid, xmin))
+ xmin = xid;
+ if (pgxact == MyPgXact)
+ continue;
+
+ /* Add XID to snapshot. */
+ snapshot->xip[count++] = xid;
/*
* Save subtransaction XIDs if possible (if we've already
@@ -1364,7 +1367,7 @@ GetSnapshotData(Snapshot snapshot)
*
* Again, our own XIDs are not included in the snapshot.
*/
- if (!suboverflowed && pgxact != MyPgXact)
+ if (!suboverflowed)
{
if (pgxact->overflowed)
suboverflowed = true;
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index c038fd9..29ea4e9 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -58,6 +58,10 @@
(dest)--; \
} while ((dest) < FirstNormalTransactionId)
+/* compare two XIDs already known to be normal; this is a macro for speed */
+#define NormalTransactionIdPrecedes(id1, id2) \
+ (AssertMacro(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \
+ (int32) (id1 - id2) < 0)
/* ----------
* Object ID (OID) zero is InvalidOid.
On Thu, Dec 15, 2011 at 11:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
One small detail I'm noticing on further review is that I've slightly
changed the semantics here:
if (!TransactionIdIsNormal(xid)
|| NormalTransactionIdPrecedes(xmax, xid))
continue;
That really ought to be testing <=, not just <. That doesn't seem
like it should affect correctness, though: at worst, we unnecessarily
include one or more XIDs in the snapshot that will be ignored anyway.
I'll fix that and rerun the tests but I don't think it'll make any
difference.
New results with the attached, updated patch. As predicted, these are
quite similar to the last batch...
Permanent Tables:
m01 tps = 618.460567 (including connections establishing)
s01 tps = 628.394270 (including connections establishing)
m08 tps = 4558.930585 (including connections establishing)
s08 tps = 4490.285074 (including connections establishing)
m16 tps = 7577.677079 (including connections establishing)
s16 tps = 7582.611380 (including connections establishing)
m24 tps = 11556.680518 (including connections establishing)
s24 tps = 11527.982307 (including connections establishing)
m32 tps = 10807.216084 (including connections establishing)
s32 tps = 10871.625992 (including connections establishing)
m80 tps = 10818.092314 (including connections establishing)
s80 tps = 10866.780660 (including connections establishing)
Unlogged Tables:
m01 tps = 670.328782 (including connections establishing)
s01 tps = 818.666971 (including connections establishing)
m08 tps = 4793.337235 (including connections establishing)
s08 tps = 4888.600945 (including connections establishing)
m16 tps = 8016.092785 (including connections establishing)
s16 tps = 8123.217451 (including connections establishing)
m24 tps = 14082.694567 (including connections establishing)
s24 tps = 14114.466246 (including connections establishing)
m32 tps = 17881.340576 (including connections establishing)
s32 tps = 18291.739818 (including connections establishing)
m80 tps = 12767.535657 (including connections establishing)
s80 tps = 17380.055381 (including connections establishing)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
getsnapshotdata-tweaks-v2.patch (application/octet-stream)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 19ff524..fc95d93 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1324,30 +1324,33 @@ GetSnapshotData(Snapshot snapshot)
/* Update globalxmin to be the smallest valid xmin */
xid = pgxact->xmin; /* fetch just once */
if (TransactionIdIsNormal(xid) &&
- TransactionIdPrecedes(xid, globalxmin))
+ NormalTransactionIdPrecedes(xid, globalxmin))
globalxmin = xid;
/* Fetch xid just once - see GetNewTransactionId */
xid = pgxact->xid;
/*
- * If the transaction has been assigned an xid < xmax we add it to
- * the snapshot, and update xmin if necessary. There's no need to
- * store XIDs >= xmax, since we'll treat them as running anyway.
- * We don't bother to examine their subxids either.
- *
- * We don't include our own XID (if any) in the snapshot, but we
- * must include it into xmin.
+ * If the transaction has no XID assigned, we can skip it; it won't
+ * have sub-XIDs either. If the XID is >= xmax, we can also skip
+ * it; such transactions will be treated as running anyway (and any
+ * sub-XIDs will also be >= xmax).
*/
- if (TransactionIdIsNormal(xid))
- {
- if (TransactionIdFollowsOrEquals(xid, xmax))
+ if (!TransactionIdIsNormal(xid)
+ || !NormalTransactionIdPrecedes(xid, xmax))
continue;
- if (pgxact != MyPgXact)
- snapshot->xip[count++] = xid;
- if (TransactionIdPrecedes(xid, xmin))
- xmin = xid;
- }
+
+ /*
+ * We don't include our own XIDs (if any) in the snapshot, but we
+ * must include them in xmin.
+ */
+ if (NormalTransactionIdPrecedes(xid, xmin))
+ xmin = xid;
+ if (pgxact == MyPgXact)
+ continue;
+
+ /* Add XID to snapshot. */
+ snapshot->xip[count++] = xid;
/*
* Save subtransaction XIDs if possible (if we've already
@@ -1364,7 +1367,7 @@ GetSnapshotData(Snapshot snapshot)
*
* Again, our own XIDs are not included in the snapshot.
*/
- if (!suboverflowed && pgxact != MyPgXact)
+ if (!suboverflowed)
{
if (pgxact->overflowed)
suboverflowed = true;
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index c038fd9..3ac1403 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -58,6 +58,10 @@
(dest)--; \
} while ((dest) < FirstNormalTransactionId)
+/* compare two XIDs already known to be normal; this is a macro for speed */
+#define NormalTransactionIdPrecedes(id1, id2) \
+ (AssertMacro(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \
+ (int32) ((id1) - (id2)) < 0)
/* ----------
* Object ID (OID) zero is InvalidOid.
Robert Haas wrote:
On that theory, I'm inclined to think that's not really a problem.
We'll go nuts if we refuse to commit anything until it shows a
meaningful win on every imaginable workload, and it seems like this
can't really be worse than the status quo; any case where it is must
be some kind of artifact. We're better off getting rid of as much
ProcArrayLock contention as possible, rather than keeping it around
because there are corner cases where it decreases contention on some
other lock.
Interesting conclusion, and it makes sense. Seems once this is applied
we will have more places to look for contention improvements.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Fri, Dec 16, 2011 at 8:25 PM, Bruce Momjian <bruce@momjian.us> wrote:
Robert Haas wrote:
On that theory, I'm inclined to think that's not really a problem.
We'll go nuts if we refuse to commit anything until it shows a
meaningful win on every imaginable workload, and it seems like this
can't really be worse than the status quo; any case where it is must
be some kind of artifact. We're better off getting rid of as much
ProcArrayLock contention as possible, rather than keeping it around
because there are corner cases where it decreases contention on some
other lock.
Interesting conclusion, and it makes sense. Seems once this is applied
we will have more places to look for contention improvements.
Yeah. The performance results I posted the other day seem to show
that on some of these tests we're thrashing our CLOG buffers, and the
difference between unlogged tables and permanent tables seems to
indicate pretty clearly that WALInsertLock is a huge problem. I'm
going to look more at the CLOG stuff next week, and also keep poking
at ProcArrayLock, where I think there's still room for further
improvement. I am leaving WALInsertLock to Heikki for now, since (1)
I don't want to collide with what he's working on, (2) he knows more
about it than I do, anyway, and (3) it's a really hard problem and I
don't have any particularly good ideas about how to fix it. :-(
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Dec 16, 2011, at 7:25 PM, Bruce Momjian wrote:
Robert Haas wrote:
On that theory, I'm inclined to think that's not really a problem.
We'll go nuts if we refuse to commit anything until it shows a
meaningful win on every imaginable workload, and it seems like this
can't really be worse than the status quo; any case where it is must
be some kind of artifact. We're better off getting rid of as much
ProcArrayLock contention as possible, rather than keeping it around
because there are corner cases where it decreases contention on some
other lock.
Interesting conclusion, and it makes sense. Seems once this is applied
we will have more places to look for contention improvements.
I also wonder how much this throws some previous performance tests into suspicion. If it's not uncommon for performance improvement attempts to shift a bottleneck to a different part of the system and marginally hurt performance then we might be throwing away good performance improvement ideas before we should...
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
On Sat, Dec 17, 2011 at 1:00 AM, Jim Nasby <jim@nasby.net> wrote:
I also wonder how much this throws some previous performance tests into suspicion. If it's not uncommon for performance improvement attempts to shift a bottleneck to a different part of the system and marginally hurt performance then we might be throwing away good performance improvement ideas before we should...
I think we are (mostly) OK on this point, at least as far as the work
I've been doing. We've actually had a few previous instances of this
phenomenon - e.g. when I first committed my fastlock patch,
performance actually got worse if you had >40 cores doing read-only
queries, because speeding up the lock manager made it possible for the
spinlock protecting SInvalReadLock to mess things up royally.
Nevertheless, we got it committed - and fixed the SInvalReadLock
problem, too. This one is/was somewhat more subtle, but I'm feeling
pretty good about our chances of making at least some further progress
in time for 9.2.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company