Re: POC: Cache data in GetSnapshotData()
On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.
I tried to implement a simple patch which protects the cache. Of all the
backends which compute the snapshot (when the cache is invalid), only one
of them will write to the cache.
This is done with one atomic compare-and-swap operation.
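The single-writer idea can be modeled outside PostgreSQL with C11 atomics. This is only a hypothetical standalone sketch of the claim/publish/invalidate protocol, not code from the patch; the names (`try_publish`, `invalidate`, `cached_value`) are invented for illustration:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for ProcGlobal's cache state. */
static atomic_uint snapshot_cached;   /* 0 = cache open for one writer, 1 = claimed */
static int cached_value;              /* stands in for the cached SnapshotData */

/*
 * Each backend calls this after computing its snapshot; only the first
 * caller since the last invalidation wins the right to publish.
 */
static bool
try_publish(int computed)
{
    unsigned expected = 0;

    if (atomic_compare_exchange_strong(&snapshot_cached, &expected, 1))
    {
        cached_value = computed;   /* we are the single writer */
        return true;
    }
    return false;                  /* another backend already claimed it */
}

/* Transaction end invalidates the cache, reopening it for one writer. */
static void
invalidate(void)
{
    atomic_store(&snapshot_cached, 0);
}
```

Under this scheme losers of the race simply skip caching and continue, which matches the "others can continue as before" comment in the patch.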
After the above fix the memory corruption is not visible. But I see some more
failures at higher client sessions (with 128 clients it is easily
reproducible). Errors are:
DETAIL: Key (bid)=(24) already exists.
STATEMENT: UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid
= $2;
client 17 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(24) already exists.
client 26 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(87) already exists.
ERROR: duplicate key value violates unique constraint
"pgbench_branches_pkey"
DETAIL: Key (bid)=(113) already exists.
After some analysis, I think the problem is in GetSnapshotData(), while
computing the snapshot:

            /*
             * We don't include our own XIDs (if any) in the snapshot, but we
             * must include them in xmin.
             */
            if (NormalTransactionIdPrecedes(xid, xmin))
                xmin = xid;
*********** if (pgxact == MyPgXact) ******************
                continue;
We do not add our own xid to the xip array. I am wondering, if another
backend tries to use the same snapshot, will it be able to see the changes
made by me (the current backend)?
Since we compute a snapshot which will be used by other backends, I think we
need to add our xid to the xip array to tell that the transaction is still open.
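To make the concern concrete, here is a toy model of the usual XidInMVCCSnapshot() logic (an xid in [xmin, xmax) counts as "in progress" only if it appears in xip). The types and helper are simplified stand-ins, not the real PostgreSQL definitions:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef struct
{
    TransactionId xmin;   /* every xid < xmin is already finished */
    TransactionId xmax;   /* every xid >= xmax had not started yet */
    TransactionId xip[8]; /* xids in progress as of this snapshot */
    int           xcnt;
} ToySnapshot;

/*
 * Simplified visibility test: true means "still running", so its changes
 * are invisible to any holder of this snapshot.
 */
static bool
xid_in_snapshot(TransactionId xid, const ToySnapshot *snap)
{
    if (xid < snap->xmin)
        return false;             /* finished before the snapshot */
    if (xid >= snap->xmax)
        return true;              /* started after the snapshot */
    for (int i = 0; i < snap->xcnt; i++)
        if (snap->xip[i] == xid)
            return true;
    return false;                 /* treated as committed or aborted! */
}
```

If the caching backend's own xid falls in [xmin, xmax) but is left out of xip, another backend reusing the snapshot hits the final `return false` and wrongly sees that transaction's changes as committed, which is consistent with the duplicate-key symptoms above. In a private snapshot this omission is safe only because callers check TransactionIdIsCurrentTransactionId first.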
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachment: cache_snapshot_data_avoid_cuncurrent_write_to_cache.patch (text/x-patch)
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 1559,1564 **** finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
--- 1559,1565 ----
elog(ERROR, "cache lookup failed for relation %u", OIDOldHeap);
relform = (Form_pg_class) GETSTRUCT(reltup);
+ Assert(TransactionIdIsNormal(frozenXid));
relform->relfrozenxid = frozenXid;
relform->relminmxid = cutoffMulti;
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
***************
*** 410,415 **** ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
--- 410,417 ----
if (LWLockConditionalAcquire(ProcArrayLock, LW_EXCLUSIVE))
{
ProcArrayEndTransactionInternal(proc, pgxact, latestXid);
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot_valid = false;
LWLockRelease(ProcArrayLock);
}
else
***************
*** 557,562 **** ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid)
--- 559,568 ----
/* Move to next proc in list. */
nextidx = pg_atomic_read_u32(&proc->nextClearXidElem);
}
+
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+
+ ProcGlobal->cached_snapshot_valid = false;
/* We're done with the lock now. */
LWLockRelease(ProcArrayLock);
***************
*** 1543,1548 **** GetSnapshotData(Snapshot snapshot)
--- 1549,1556 ----
errmsg("out of memory")));
}
+ snapshot->takenDuringRecovery = RecoveryInProgress();
+
/*
* It is sufficient to get shared lock on ProcArrayLock, even if we are
* going to set MyPgXact->xmin.
***************
*** 1557,1568 **** GetSnapshotData(Snapshot snapshot)
/* initialize xmin calculation with xmax */
globalxmin = xmin = xmax;
! snapshot->takenDuringRecovery = RecoveryInProgress();
! if (!snapshot->takenDuringRecovery)
{
int *pgprocnos = arrayP->pgprocnos;
int numProcs;
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
--- 1565,1599 ----
/* initialize xmin calculation with xmax */
globalxmin = xmin = xmax;
! if (!snapshot->takenDuringRecovery && ProcGlobal->cached_snapshot_valid)
! {
! Snapshot csnap = &ProcGlobal->cached_snapshot;
! TransactionId *saved_xip;
! TransactionId *saved_subxip;
! saved_xip = snapshot->xip;
! saved_subxip = snapshot->subxip;
!
! memcpy(snapshot, csnap, sizeof(SnapshotData));
!
! snapshot->xip = saved_xip;
! snapshot->subxip = saved_subxip;
!
! memcpy(snapshot->xip, csnap->xip,
! sizeof(TransactionId) * csnap->xcnt);
! memcpy(snapshot->subxip, csnap->subxip,
! sizeof(TransactionId) * csnap->subxcnt);
! globalxmin = ProcGlobal->cached_snapshot_globalxmin;
! xmin = csnap->xmin;
!
! Assert(TransactionIdIsValid(globalxmin));
! Assert(TransactionIdIsValid(xmin));
! }
! else if (!snapshot->takenDuringRecovery)
{
int *pgprocnos = arrayP->pgprocnos;
int numProcs;
+ const uint32 snapshot_cached= 0;
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
***************
*** 1577,1591 **** GetSnapshotData(Snapshot snapshot)
TransactionId xid;
/*
! * Backend is doing logical decoding which manages xmin
! * separately, check below.
*/
! if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
! continue;
!
! /* Ignore procs running LAZY VACUUM */
! if (pgxact->vacuumFlags & PROC_IN_VACUUM)
! continue;
/* Update globalxmin to be the smallest valid xmin */
xid = pgxact->xmin; /* fetch just once */
--- 1608,1619 ----
TransactionId xid;
/*
! * Ignore procs running LAZY VACUUM (which don't need to retain
! * rows) and backends doing logical decoding (which manages xmin
! * separately, check below).
*/
! if (pgxact->vacuumFlags & (PROC_IN_LOGICAL_DECODING | PROC_IN_VACUUM))
! continue;
/* Update globalxmin to be the smallest valid xmin */
xid = pgxact->xmin; /* fetch just once */
***************
*** 1653,1658 **** GetSnapshotData(Snapshot snapshot)
--- 1681,1715 ----
}
}
}
+
+ /*
+ * Let only one of the caller cache the computed snapshot, and others
+ * can continue as before.
+ */
+ if (pg_atomic_compare_exchange_u32(&ProcGlobal->snapshot_cached,
+ &snapshot_cached, 1))
+ {
+ Snapshot csnap = &ProcGlobal->cached_snapshot;
+ TransactionId *saved_xip;
+ TransactionId *saved_subxip;
+
+ ProcGlobal->cached_snapshot_globalxmin = globalxmin;
+
+ saved_xip = csnap->xip;
+ saved_subxip = csnap->subxip;
+ memcpy(csnap, snapshot, sizeof(SnapshotData));
+ csnap->xip = saved_xip;
+ csnap->subxip = saved_subxip;
+
+ /* not yet stored. Yuck */
+ csnap->xmin = xmin;
+
+ memcpy(csnap->xip, snapshot->xip,
+ sizeof(TransactionId) * csnap->xcnt);
+ memcpy(csnap->subxip, snapshot->subxip,
+ sizeof(TransactionId) * csnap->subxcnt);
+ ProcGlobal->cached_snapshot_valid = true;
+ }
}
else
{
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
***************
*** 51,57 ****
#include "storage/spin.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
!
/* GUC variables */
int DeadlockTimeout = 1000;
--- 51,57 ----
#include "storage/spin.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
! #include "port/atomics.h"
/* GUC variables */
int DeadlockTimeout = 1000;
***************
*** 114,119 **** ProcGlobalShmemSize(void)
--- 114,126 ----
size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGXACT)));
size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGXACT)));
+ /* for the cached snapshot */
+ #define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
+ size = add_size(size, mul_size(sizeof(TransactionId), PROCARRAY_MAXPROCS));
+ #define TOTAL_MAX_CACHED_SUBXIDS \
+ ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
+ size = add_size(size, mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS));
+
return size;
}
***************
*** 275,280 **** InitProcGlobal(void)
--- 282,294 ----
/* Create ProcStructLock spinlock, too */
ProcStructLock = (slock_t *) ShmemAlloc(sizeof(slock_t));
SpinLockInit(ProcStructLock);
+
+ /* cached snapshot */
+ ProcGlobal->cached_snapshot_valid = false;
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot.xip = ShmemAlloc(PROCARRAY_MAXPROCS * sizeof(TransactionId));
+ ProcGlobal->cached_snapshot.subxip = ShmemAlloc(TOTAL_MAX_CACHED_SUBXIDS * sizeof(TransactionId));
+ ProcGlobal->cached_snapshot_globalxmin = InvalidTransactionId;
}
/*
*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
***************
*** 16,21 ****
--- 16,22 ----
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+ #include "utils/snapshot.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
***************
*** 220,225 **** typedef struct PROC_HDR
--- 221,232 ----
int startupProcPid;
/* Buffer id of the buffer that Startup process waits for pin on, or -1 */
int startupBufferPinWaitBufId;
+
+ /* Cached snapshot */
+ volatile bool cached_snapshot_valid;
+ pg_atomic_uint32 snapshot_cached;
+ SnapshotData cached_snapshot;
+ TransactionId cached_snapshot_globalxmin;
} PROC_HDR;
extern PROC_HDR *ProcGlobal;
On Fri, Jan 15, 2016 at 11:23 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I tried to implement a simple patch which protects the cache. Of all the
backends which compute the snapshot (when the cache is invalid), only one of
them will write to the cache. This is done with one atomic compare-and-swap
operation.

After the above fix the memory corruption is not visible. But I see some more
failures at higher client sessions (with 128 clients it is easily reproducible).
Don't you think we need to update the snapshot fields like count and
subcount before saving it to shared memory?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jan 15, 2016 at 11:23 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I tried to implement a simple patch which protects the cache. Of all the
backends which compute the snapshot (when the cache is invalid), only one of
them will write to the cache. This is done with one atomic compare-and-swap
operation.

After the above fix the memory corruption is not visible. But I see some more
failures at higher client sessions (with 128 clients it is easily
reproducible). Errors are:

DETAIL: Key (bid)=(24) already exists.
STATEMENT: UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
client 17 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(24) already exists.
client 26 aborted in state 11: ERROR: duplicate key value violates unique
constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(87) already exists.
ERROR: duplicate key value violates unique constraint "pgbench_branches_pkey"
DETAIL: Key (bid)=(113) already exists.

After some analysis, I think the problem is in GetSnapshotData(), while
computing the snapshot:

            /*
             * We don't include our own XIDs (if any) in the snapshot, but we
             * must include them in xmin.
             */
            if (NormalTransactionIdPrecedes(xid, xmin))
                xmin = xid;
*********** if (pgxact == MyPgXact) ******************
                continue;

We do not add our own xid to the xip array. I am wondering, if another
backend tries to use the same snapshot, will it be able to see the changes
made by me (the current backend)? Since we compute a snapshot which will be
used by other backends, I think we need to add our xid to the xip array to
tell that the transaction is still open.
I also think this observation of yours is right, and currently that is
okay because we always first check TransactionIdIsCurrentTransactionId().
Refer to the comments on top of XidInMVCCSnapshot() [1]. However, I am not
sure if it is a good idea to start including the current backend's xid in
the snapshot, because that can lead to including its subxids as well, which
can make the snapshot bigger for cases where the current transaction has
many subtransactions. One way could be that we store the current backend's
transaction and subtransaction ids only in the saved snapshot; does that
sound reasonable to you?
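That suggestion could be sketched roughly as follows. This is a hypothetical helper (`append_self_xid` is my name, not from any patch), shown here only for the top xid; subxids would be appended to the saved subxip array analogously, with overflow handling:

```c
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/*
 * Append the caching backend's own top xid to the *saved* copy of the xip
 * array only, leaving the backend's private snapshot untouched.  Returns
 * the new element count for the saved array.
 */
static int
append_self_xid(TransactionId *saved_xip, int xcnt, TransactionId my_xid)
{
    /* A backend with no xid assigned (read-only so far) adds nothing. */
    if (my_xid == InvalidTransactionId)
        return xcnt;

    saved_xip[xcnt] = my_xid;
    return xcnt + 1;
}
```

This keeps the extra entry out of the computing backend's own snapshot (where it would be redundant) while making the cached copy safe for other backends to reuse.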
Other than that, I think the patch needs to clear the saved snapshot for
Commit Prepared Transaction as well (refer to FinishPreparedTransaction()).
! else if (!snapshot->takenDuringRecovery)
{
int *pgprocnos = arrayP->pgprocnos;
int numProcs;
+ const uint32 snapshot_cached= 0;
I don't think the const is required for the above variable.
[1]: Note: GetSnapshotData never stores either top xid or subxids of our own
backend into a snapshot, so these xids will not be reported as "running"
by this function. This is OK for current uses, because we always check
TransactionIdIsCurrentTransactionId first, except for known-committed
XIDs which could not be ours anyway.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 16, 2016 at 10:23 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Jan 15, 2016 at 11:23 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Mon, Jan 4, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
I think at the very least the cache should be protected by a separate
lock, and that lock should be acquired with TryLock. I.e. the cache is
updated opportunistically. I'd go for an lwlock in the first iteration.

I also think this observation of yours is right and currently that is okay
because we always first check TransactionIdIsCurrentTransactionId().

+ const uint32 snapshot_cached= 0;

I have fixed all of the issues reported by the regress test. Also, now when
a backend tries to cache the snapshot we also store the self xid and
subxids, so other backends can use them.
I also did some read-only perf tests.
Non-Default Settings.
================
scale_factor=300.
./postgres -c shared_buffers=16GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9
./pgbench -c $clients -j $clients -T 300 -M prepared -S postgres
Machine Detail:
cpu : POWER8
cores: 24 (192 with HT).
Clients    Base             With cached snapshot
1          19653.914409     19926.884664
16         190764.519336    190040.299297
32         339327.881272    354467.445076
48         462632.02493     464767.917813
64         522642.515148    533271.556703
80         515262.813189    513353.962521
But I did not see any perf improvement. Will continue testing.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachment: Cache_data_in_GetSnapshotData_POC.patch (text/x-patch)
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 5cb28cf..a441ca0 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1561,6 +1561,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
elog(ERROR, "cache lookup failed for relation %u", OIDOldHeap);
relform = (Form_pg_class) GETSTRUCT(reltup);
+ Assert(TransactionIdIsNormal(frozenXid));
relform->relfrozenxid = frozenXid;
relform->relminmxid = cutoffMulti;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 97e8962..8db028f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -372,11 +372,19 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
(arrayP->numProcs - index - 1) * sizeof(int));
arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
arrayP->numProcs--;
+
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+
+ ProcGlobal->cached_snapshot_valid = false;
LWLockRelease(ProcArrayLock);
return;
}
}
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+
+ ProcGlobal->cached_snapshot_valid = false;
+
/* Ooops */
LWLockRelease(ProcArrayLock);
@@ -420,6 +428,8 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
if (LWLockConditionalAcquire(ProcArrayLock, LW_EXCLUSIVE))
{
ProcArrayEndTransactionInternal(proc, pgxact, latestXid);
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot_valid = false;
LWLockRelease(ProcArrayLock);
}
else
@@ -568,6 +578,9 @@ ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid)
nextidx = pg_atomic_read_u32(&proc->procArrayGroupNext);
}
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot_valid = false;
+
/* We're done with the lock now. */
LWLockRelease(ProcArrayLock);
@@ -1553,6 +1566,8 @@ GetSnapshotData(Snapshot snapshot)
errmsg("out of memory")));
}
+ snapshot->takenDuringRecovery = RecoveryInProgress();
+
/*
* It is sufficient to get shared lock on ProcArrayLock, even if we are
* going to set MyPgXact->xmin.
@@ -1567,12 +1582,39 @@ GetSnapshotData(Snapshot snapshot)
/* initialize xmin calculation with xmax */
globalxmin = xmin = xmax;
- snapshot->takenDuringRecovery = RecoveryInProgress();
+ if (!snapshot->takenDuringRecovery && ProcGlobal->cached_snapshot_valid)
+ {
+ Snapshot csnap = &ProcGlobal->cached_snapshot;
+ TransactionId *saved_xip;
+ TransactionId *saved_subxip;
- if (!snapshot->takenDuringRecovery)
+ saved_xip = snapshot->xip;
+ saved_subxip = snapshot->subxip;
+
+ memcpy(snapshot, csnap, sizeof(SnapshotData));
+
+ snapshot->xip = saved_xip;
+ snapshot->subxip = saved_subxip;
+
+ memcpy(snapshot->xip, csnap->xip,
+ sizeof(TransactionId) * csnap->xcnt);
+ memcpy(snapshot->subxip, csnap->subxip,
+ sizeof(TransactionId) * csnap->subxcnt);
+ globalxmin = ProcGlobal->cached_snapshot_globalxmin;
+ xmin = csnap->xmin;
+
+ count = csnap->xcnt;
+ subcount = csnap->subxcnt;
+ suboverflowed = csnap->suboverflowed;
+
+ Assert(TransactionIdIsValid(globalxmin));
+ Assert(TransactionIdIsValid(xmin));
+ }
+ else if (!snapshot->takenDuringRecovery)
{
int *pgprocnos = arrayP->pgprocnos;
int numProcs;
+ uint32 snapshot_cached= 0;
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
@@ -1587,14 +1629,11 @@ GetSnapshotData(Snapshot snapshot)
TransactionId xid;
/*
- * Backend is doing logical decoding which manages xmin
- * separately, check below.
+ * Ignore procs running LAZY VACUUM (which don't need to retain
+ * rows) and backends doing logical decoding (which manages xmin
+ * separately, check below).
*/
- if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
- continue;
-
- /* Ignore procs running LAZY VACUUM */
- if (pgxact->vacuumFlags & PROC_IN_VACUUM)
+ if (pgxact->vacuumFlags & (PROC_IN_LOGICAL_DECODING | PROC_IN_VACUUM))
continue;
/* Update globalxmin to be the smallest valid xmin */
@@ -1663,6 +1702,62 @@ GetSnapshotData(Snapshot snapshot)
}
}
}
+
+ /*
+ * Let only one of the caller cache the computed snapshot, and others
+ * can continue as before.
+ */
+ if (pg_atomic_compare_exchange_u32(&ProcGlobal->snapshot_cached,
+ &snapshot_cached, 1))
+ {
+ Snapshot csnap = &ProcGlobal->cached_snapshot;
+ TransactionId *saved_xip;
+ TransactionId *saved_subxip;
+ int curr_subcount= subcount;
+
+ ProcGlobal->cached_snapshot_globalxmin = globalxmin;
+
+ saved_xip = csnap->xip;
+ saved_subxip = csnap->subxip;
+ memcpy(csnap, snapshot, sizeof(SnapshotData));
+ csnap->xip = saved_xip;
+ csnap->subxip = saved_subxip;
+
+ /* not yet stored. Yuck */
+ csnap->xmin = xmin;
+
+ memcpy(csnap->xip, snapshot->xip,
+ sizeof(TransactionId) * count);
+ memcpy(csnap->subxip, snapshot->subxip,
+ sizeof(TransactionId) * subcount);
+
+ /* Add my own XID to snapshot. */
+ csnap->xip[count] = MyPgXact->xid;
+ csnap->xcnt = count + 1;
+
+ if (!suboverflowed)
+ {
+ if (MyPgXact->overflowed)
+ suboverflowed = true;
+ else
+ {
+ int nxids = MyPgXact->nxids;
+
+ if (nxids > 0)
+ {
+ memcpy(csnap->subxip + subcount,
+ (void *) MyProc->subxids.xids,
+ nxids * sizeof(TransactionId));
+ curr_subcount += nxids;
+ }
+ }
+ }
+
+ csnap->subxcnt = curr_subcount;
+ csnap->suboverflowed = suboverflowed;
+
+ ProcGlobal->cached_snapshot_valid = true;
+ }
}
else
{
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 6453b88..b8d0297 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -51,7 +51,7 @@
#include "storage/spin.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
-
+#include "port/atomics.h"
/* GUC variables */
int DeadlockTimeout = 1000;
@@ -114,6 +114,13 @@ ProcGlobalShmemSize(void)
size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGXACT)));
size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGXACT)));
+ /* for the cached snapshot */
+#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
+ size = add_size(size, mul_size(sizeof(TransactionId), PROCARRAY_MAXPROCS));
+#define TOTAL_MAX_CACHED_SUBXIDS \
+ ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
+ size = add_size(size, mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS));
+
return size;
}
@@ -278,6 +285,13 @@ InitProcGlobal(void)
/* Create ProcStructLock spinlock, too */
ProcStructLock = (slock_t *) ShmemAlloc(sizeof(slock_t));
SpinLockInit(ProcStructLock);
+
+ /* cached snapshot */
+ ProcGlobal->cached_snapshot_valid = false;
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot.xip = ShmemAlloc(PROCARRAY_MAXPROCS * sizeof(TransactionId));
+ ProcGlobal->cached_snapshot.subxip = ShmemAlloc(TOTAL_MAX_CACHED_SUBXIDS * sizeof(TransactionId));
+ ProcGlobal->cached_snapshot_globalxmin = InvalidTransactionId;
}
/*
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index dbcdd3f..73a424e 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,6 +16,7 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "utils/snapshot.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
@@ -234,6 +235,12 @@ typedef struct PROC_HDR
int startupProcPid;
/* Buffer id of the buffer that Startup process waits for pin on, or -1 */
int startupBufferPinWaitBufId;
+
+ /* Cached snapshot */
+ volatile bool cached_snapshot_valid;
+ pg_atomic_uint32 snapshot_cached;
+ SnapshotData cached_snapshot;
+ TransactionId cached_snapshot_globalxmin;
} PROC_HDR;
extern PROC_HDR *ProcGlobal;
On Thu, Feb 25, 2016 at 12:57 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
I have fixed all of the issues reported by regress test. Also now when
backend try to cache the snapshot we also try to store the self-xid and sub
xid, so other backends can use them. I also did some read-only perf tests.

I'm not sure what you are doing that keeps breaking threading for
Gmail, but this thread is getting split up for me into multiple
threads with only a few messages in each. The same seems to be
happening in the community archives. Please try to figure out what is
causing that and stop doing it.
I notice you seem not to have implemented this suggestion by Andres:
http://www.postgresql.org//message-id/20160104085845.m5nrypvmmpea5nm7@alap3.anarazel.de
Also, you should test this on a machine with more than 2 cores.
Andres's original test seems to have been on a 4-core system, where
this would be more likely to work out.
Also, Andres suggested testing this on an 80-20 write mix, whereas
you tested it on 100% read-only. In that case there is no blocking
ProcArrayLock, which reduces the chances that the patch will benefit.
I suspect, too, that the chances of this patch working out have
probably been reduced by 0e141c0fbb211bdd23783afa731e3eef95c9ad7a,
which seems to be targeting mostly the same problem. I mean it's
possible that this patch could win even when no ProcArrayLock-related
blocking is happening, but the original idea seems to have been that
it would help mostly with that case. You could try benchmarking it on
the commit just before that one and then on current sources and see if
you get the same results on both, or if there was more benefit before
that commit.
Also, you could consider repeating the LWLOCK_STATS testing that Amit
did in his original reply to Andres. That would tell you whether the
performance is not increasing because the patch doesn't reduce
ProcArrayLock contention, or whether the performance is not increasing
because the patch DOES reduce ProcArrayLock contention but then the
contention shifts to CLogControlLock or elsewhere.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 26, 2016 at 2:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Feb 25, 2016 at 12:57 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
I have fixed all of the issues reported by regress test. Also now when
backend try to cache the snapshot we also try to store the self-xid and
sub
xid, so other backends can use them.
+ /* Add my own XID to snapshot. */
+ csnap->xip[count] = MyPgXact->xid;
+ csnap->xcnt = count + 1;
Don't we need to add this only when the xid of the current transaction is
valid? Also, I think it will be better if we can explain why we need to
add our own transaction id while caching the snapshot.
I also did some read-only perf tests.
I'm not sure what you are doing that keeps breaking threading for
Gmail, but this thread is getting split up for me into multiple
threads with only a few messages in each. The same seems to be
happening in the community archives. Please try to figure out what is
causing that and stop doing it.

I notice you seem not to have implemented
http://www.postgresql.org//message-id/20160104085845.m5nrypvmmpea5nm7@alap3.anarazel.de
As far as I can understand by reading the patch, it has already implemented
the first suggestion by Andres, which is to use a try lock; the patch now
uses atomic ops instead of an lwlock to achieve the same thing, but I think
it should have the same impact. Do you see any problem with that?

Now talking about the second suggestion, which is to use some form of
'snapshot slots' to reduce the contention further: it seems that could be
beneficial if we see any gain with the try lock. If you think that can be a
benefit over the current approach taken in the patch, then we can discuss
how to implement it.
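One possible reading of the 'snapshot slots' idea, again only as a hedged standalone sketch with invented names (`SnapSlot`, `publish`, `try_read`) and with invalidation omitted: keep a small ring of cached snapshots, each with its own claim word, so concurrent cachers and readers spread across slots instead of contending on a single cache entry.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_SLOTS 4

typedef struct
{
    atomic_uint state;    /* 0 = empty, 1 = being written, 2 = valid */
    int         payload;  /* stands in for a cached SnapshotData */
} SnapSlot;

static SnapSlot    slots[NUM_SLOTS];
static atomic_uint next_slot;

/* Writer: claim a slot round-robin; at most one writer per slot at a time. */
static int
publish(int snap)
{
    unsigned idx = atomic_fetch_add(&next_slot, 1) % NUM_SLOTS;
    unsigned expected = 0;

    if (!atomic_compare_exchange_strong(&slots[idx].state, &expected, 1))
        return -1;                       /* slot busy; caller skips caching */

    slots[idx].payload = snap;
    atomic_store(&slots[idx].state, 2);  /* publish to readers */
    return (int) idx;
}

/* Reader: use a slot only once it is fully published. */
static bool
try_read(int idx, int *out)
{
    if (atomic_load(&slots[idx].state) != 2)
        return false;
    *out = slots[idx].payload;
    return true;
}
```

A real version would also need transaction end to invalidate every affected slot, which is where the bookkeeping cost of multiple slots comes from.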
Also, you should test this on a machine with more than 2 cores.
Andres's original test seems to have been on a 4-core system, where
this would be more likely to work out.

Also, Andres suggested testing this on an 80-20 write mix, whereas
you tested it on 100% read-only. In that case there is no blocking
ProcArrayLock, which reduces the chances that the patch will benefit.
+1
Also, you could consider repeating the LWLOCK_STATS testing that Amit
did in his original reply to Andres. That would tell you whether the
performance is not increasing because the patch doesn't reduce
ProcArrayLock contention, or whether the performance is not increasing
because the patch DOES reduce ProcArrayLock contention but then the
contention shifts to CLogControlLock or elsewhere.
+1
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Mar 1, 2016 at 12:59 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
Don't we need to add this only when the xid of the current transaction is
valid? Also, I think it will be better if we can explain why we need to
add our own transaction id while caching the snapshot.
I have fixed the same and the patch is attached.

Some more tests done after that:
*pgbench write tests: on 8 socket, 64 core machine.*
./postgres -c shared_buffers=16GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9
./pgbench -c $clients -j $clients -T 1800 -M prepared postgres
[Inline image: pgbench TPS results chart]
A small improvement in performance at 64 threads.
*LWLock_Stats data:*
ProcArrayLock: Base.
=================
postgresql-2016-03-01_115252.log:PID 110019 lwlock main 4: shacq 1867601
exacq 35625 blk 134682 spindelay 128 dequeue self 28871
postgresql-2016-03-01_115253.log:PID 110115 lwlock main 4: shacq 2201613
exacq 43489 blk 155499 spindelay 127 dequeue self 32751
postgresql-2016-03-01_115253.log:PID 110122 lwlock main 4: shacq 2231327
exacq 44824 blk 159440 spindelay 128 dequeue self 33336
postgresql-2016-03-01_115254.log:PID 110126 lwlock main 4: shacq 2247983
exacq 44632 blk 158669 spindelay 131 dequeue self 33365
postgresql-2016-03-01_115254.log:PID 110059 lwlock main 4: shacq 2036809
exacq 38607 blk 143538 spindelay 117 dequeue self 31008
ProcArrayLock: With Patch.
=====================
postgresql-2016-03-01_124747.log:PID 1789 lwlock main 4: shacq 2273958
exacq 61605 blk 79581 spindelay 307 dequeue self 66088
postgresql-2016-03-01_124748.log:PID 1880 lwlock main 4: shacq 2456388
exacq 65996 blk 82300 spindelay 470 dequeue self 68770
postgresql-2016-03-01_124748.log:PID 1765 lwlock main 4: shacq 2244083
exacq 60835 blk 79042 spindelay 336 dequeue self 65212
postgresql-2016-03-01_124749.log:PID 1882 lwlock main 4: shacq 2489271
exacq 67043 blk 85650 spindelay 463 dequeue self 68401
postgresql-2016-03-01_124749.log:PID 1753 lwlock main 4: shacq 2232791
exacq 60647 blk 78659 spindelay 364 dequeue self 65180
postgresql-2016-03-01_124750.log:PID 1849 lwlock main 4: shacq 2374922
exacq 64075 blk 81860 spindelay 339 dequeue self 67584
*-------------Block time of ProcArrayLock has been reduced significantly.*
ClogControlLock : Base.
===================
postgresql-2016-03-01_115302.log:PID 110040 lwlock main 11: shacq 586129
exacq 268808 blk 90570 spindelay 261 dequeue self 59619
postgresql-2016-03-01_115303.log:PID 110047 lwlock main 11: shacq 593492
exacq 271019 blk 89547 spindelay 268 dequeue self 59999
postgresql-2016-03-01_115303.log:PID 110078 lwlock main 11: shacq 620830
exacq 285244 blk 92939 spindelay 262 dequeue self 61912
postgresql-2016-03-01_115304.log:PID 110083 lwlock main 11: shacq 633101
exacq 289983 blk 93485 spindelay 262 dequeue self 62394
postgresql-2016-03-01_115305.log:PID 110105 lwlock main 11: shacq 646584
exacq 297598 blk 93331 spindelay 312 dequeue self 63279
ClogControlLock : With Patch.
=======================
postgresql-2016-03-01_124730.log:PID 1865 lwlock main 11: shacq 722881
exacq 330273 blk 106163 spindelay 468 dequeue self 80316
postgresql-2016-03-01_124731.log:PID 1857 lwlock main 11: shacq 713720
exacq 327158 blk 106719 spindelay 439 dequeue self 79996
postgresql-2016-03-01_124732.log:PID 1826 lwlock main 11: shacq 696762
exacq 317239 blk 104523 spindelay 448 dequeue self 79374
postgresql-2016-03-01_124732.log:PID 1862 lwlock main 11: shacq 721272
exacq 330350 blk 105965 spindelay 492 dequeue self 81036
postgresql-2016-03-01_124733.log:PID 1879 lwlock main 11: shacq 737398
exacq 335357 blk 105424 spindelay 520 dequeue self 80977
*-------------Block time of ClogControlLock has increased slightly.*
Will continue with further tests at lower client counts.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachment: Cache_data_in_GetSnapshotData_POC_01.patch (application/octet-stream)
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 5cb28cf..a441ca0 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1561,6 +1561,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
elog(ERROR, "cache lookup failed for relation %u", OIDOldHeap);
relform = (Form_pg_class) GETSTRUCT(reltup);
+ Assert(TransactionIdIsNormal(frozenXid));
relform->relfrozenxid = frozenXid;
relform->relminmxid = cutoffMulti;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 97e8962..e1e4681 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -372,11 +372,19 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
(arrayP->numProcs - index - 1) * sizeof(int));
arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
arrayP->numProcs--;
+
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+
+ ProcGlobal->cached_snapshot_valid = false;
LWLockRelease(ProcArrayLock);
return;
}
}
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+
+ ProcGlobal->cached_snapshot_valid = false;
+
/* Ooops */
LWLockRelease(ProcArrayLock);
@@ -420,6 +428,8 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
if (LWLockConditionalAcquire(ProcArrayLock, LW_EXCLUSIVE))
{
ProcArrayEndTransactionInternal(proc, pgxact, latestXid);
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot_valid = false;
LWLockRelease(ProcArrayLock);
}
else
@@ -568,6 +578,9 @@ ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid)
nextidx = pg_atomic_read_u32(&proc->procArrayGroupNext);
}
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot_valid = false;
+
/* We're done with the lock now. */
LWLockRelease(ProcArrayLock);
@@ -1553,6 +1566,8 @@ GetSnapshotData(Snapshot snapshot)
errmsg("out of memory")));
}
+ snapshot->takenDuringRecovery = RecoveryInProgress();
+
/*
* It is sufficient to get shared lock on ProcArrayLock, even if we are
* going to set MyPgXact->xmin.
@@ -1567,12 +1582,39 @@ GetSnapshotData(Snapshot snapshot)
/* initialize xmin calculation with xmax */
globalxmin = xmin = xmax;
- snapshot->takenDuringRecovery = RecoveryInProgress();
+ if (!snapshot->takenDuringRecovery && ProcGlobal->cached_snapshot_valid)
+ {
+ Snapshot csnap = &ProcGlobal->cached_snapshot;
+ TransactionId *saved_xip;
+ TransactionId *saved_subxip;
+
+ saved_xip = snapshot->xip;
+ saved_subxip = snapshot->subxip;
- if (!snapshot->takenDuringRecovery)
+ memcpy(snapshot, csnap, sizeof(SnapshotData));
+
+ snapshot->xip = saved_xip;
+ snapshot->subxip = saved_subxip;
+
+ memcpy(snapshot->xip, csnap->xip,
+ sizeof(TransactionId) * csnap->xcnt);
+ memcpy(snapshot->subxip, csnap->subxip,
+ sizeof(TransactionId) * csnap->subxcnt);
+ globalxmin = ProcGlobal->cached_snapshot_globalxmin;
+ xmin = csnap->xmin;
+
+ count = csnap->xcnt;
+ subcount = csnap->subxcnt;
+ suboverflowed = csnap->suboverflowed;
+
+ Assert(TransactionIdIsValid(globalxmin));
+ Assert(TransactionIdIsValid(xmin));
+ }
+ else if (!snapshot->takenDuringRecovery)
{
int *pgprocnos = arrayP->pgprocnos;
int numProcs;
+ uint32 snapshot_cached = 0;
/*
* Spin over procArray checking xid, xmin, and subxids. The goal is
@@ -1587,14 +1629,11 @@ GetSnapshotData(Snapshot snapshot)
TransactionId xid;
/*
- * Backend is doing logical decoding which manages xmin
- * separately, check below.
+ * Ignore procs running LAZY VACUUM (which don't need to retain
+ * rows) and backends doing logical decoding (which manage xmin
+ * separately, check below).
*/
- if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
- continue;
-
- /* Ignore procs running LAZY VACUUM */
- if (pgxact->vacuumFlags & PROC_IN_VACUUM)
+ if (pgxact->vacuumFlags & (PROC_IN_LOGICAL_DECODING | PROC_IN_VACUUM))
continue;
/* Update globalxmin to be the smallest valid xmin */
@@ -1663,6 +1702,79 @@ GetSnapshotData(Snapshot snapshot)
}
}
}
+
+ /*
+ * Let only one of the callers cache the computed snapshot; the others
+ * can continue as before.
+ */
+ if (pg_atomic_compare_exchange_u32(&ProcGlobal->snapshot_cached,
+ &snapshot_cached, 1))
+ {
+ Snapshot csnap = &ProcGlobal->cached_snapshot;
+ TransactionId *saved_xip;
+ TransactionId *saved_subxip;
+ int curr_subcount = subcount;
+
+ ProcGlobal->cached_snapshot_globalxmin = globalxmin;
+
+ saved_xip = csnap->xip;
+ saved_subxip = csnap->subxip;
+ memcpy(csnap, snapshot, sizeof(SnapshotData));
+ csnap->xip = saved_xip;
+ csnap->subxip = saved_subxip;
+
+ /* not yet stored. Yuck */
+ csnap->xmin = xmin;
+
+ memcpy(csnap->xip, snapshot->xip,
+ sizeof(TransactionId) * count);
+ memcpy(csnap->subxip, snapshot->subxip,
+ sizeof(TransactionId) * subcount);
+ /*
+ * If the transaction has no XID assigned, we can skip it; it
+ * won't have sub-XIDs either. If the XID is >= xmax, we can also
+ * skip it; such transactions will be treated as running anyway
+ * (and any sub-XIDs will also be >= xmax).
+ */
+ if (TransactionIdIsNormal(MyPgXact->xid) && NormalTransactionIdPrecedes(MyPgXact->xid, xmax))
+ {
+
+ /* Add my own XID and sub-XIDs to snapshot. */
+ csnap->xip[count] = MyPgXact->xid;
+ csnap->xcnt = count + 1;
+
+ if (!suboverflowed)
+ {
+ if (MyPgXact->overflowed)
+ suboverflowed = true;
+ else
+ {
+ int nxids = MyPgXact->nxids;
+
+ if (nxids > 0)
+ {
+ memcpy(csnap->subxip + subcount,
+ (void *) MyProc->subxids.xids,
+ nxids * sizeof(TransactionId));
+ curr_subcount += nxids;
+ }
+ }
+ }
+
+ }
+ else
+ csnap->xcnt = count;
+
+ csnap->subxcnt = curr_subcount;
+ csnap->suboverflowed = suboverflowed;
+
+ /*
+ * The memory barrier has to be placed here to ensure that
+ * cached_snapshot_valid is set only after snapshot is cached.
+ */
+ pg_write_barrier();
+ ProcGlobal->cached_snapshot_valid = true;
+ }
}
else
{
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 6453b88..b8d0297 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -51,7 +51,7 @@
#include "storage/spin.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
-
+#include "port/atomics.h"
/* GUC variables */
int DeadlockTimeout = 1000;
@@ -114,6 +114,13 @@ ProcGlobalShmemSize(void)
size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGXACT)));
size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGXACT)));
+ /* for the cached snapshot */
+#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
+ size = add_size(size, mul_size(sizeof(TransactionId), PROCARRAY_MAXPROCS));
+#define TOTAL_MAX_CACHED_SUBXIDS \
+ ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
+ size = add_size(size, mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS));
+
return size;
}
@@ -278,6 +285,13 @@ InitProcGlobal(void)
/* Create ProcStructLock spinlock, too */
ProcStructLock = (slock_t *) ShmemAlloc(sizeof(slock_t));
SpinLockInit(ProcStructLock);
+
+ /* cached snapshot */
+ ProcGlobal->cached_snapshot_valid = false;
+ pg_atomic_write_u32(&ProcGlobal->snapshot_cached, 0);
+ ProcGlobal->cached_snapshot.xip = ShmemAlloc(PROCARRAY_MAXPROCS * sizeof(TransactionId));
+ ProcGlobal->cached_snapshot.subxip = ShmemAlloc(TOTAL_MAX_CACHED_SUBXIDS * sizeof(TransactionId));
+ ProcGlobal->cached_snapshot_globalxmin = InvalidTransactionId;
}
/*
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index dbcdd3f..73a424e 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,6 +16,7 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "utils/snapshot.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
@@ -234,6 +235,12 @@ typedef struct PROC_HDR
int startupProcPid;
/* Buffer id of the buffer that Startup process waits for pin on, or -1 */
int startupBufferPinWaitBufId;
+
+ /* Cached snapshot */
+ volatile bool cached_snapshot_valid;
+ pg_atomic_uint32 snapshot_cached;
+ SnapshotData cached_snapshot;
+ TransactionId cached_snapshot_globalxmin;
} PROC_HDR;
extern PROC_HDR *ProcGlobal;
On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Tue, Mar 1, 2016 at 12:59 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't we need to add this only when the xid of the current transaction is
valid? Also, I think it will be better if we can explain why we need to
add our own transaction id while caching the snapshot.
I have fixed the same and the patch is attached. Some more tests done after that:
*pgbench write tests: on 8 socket, 64 core machine.*
/postgres -c shared_buffers=16GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9
./pgbench -c $clients -j $clients -T 1800 -M prepared postgres
[image: Inline image 3]
A small improvement in performance at 64 threads.
What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.
Or maybe not, but seems worth checking.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.
I did the above tests along with Amit's clog patch. Machine: 8 socket, 64
core, 128 hyperthread.

clients  BASE          ONLY CLOG CHANGES  % Increase     ONLY SAVE SNAPSHOT  % Increase    CLOG CHANGES + SAVE SNAPSHOT  % Increase
64       29247.658034  30855.728835        5.4981181711  29254.532186        0.0235032562  32691.832776                  11.7758992463
88       31214.305391  33313.393095        6.7247618606  32109.248609        2.8670931702  35433.655203                  13.5173592978
128      30896.673949  34015.362152       10.0939285832  ***                 ***           34947.296355                  13.110221549
256      27183.780921  31192.895437       14.7481857938  ***                 ***           32873.782735                  20.9316056164
With clog changes, perf of caching the snapshot patch improves.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 10, 2016 at 1:04 PM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.
I did the above tests along with Amit's clog patch. Machine: 8 socket, 64
core, 128 hyperthread.

clients  BASE          ONLY CLOG CHANGES  % Increase     ONLY SAVE SNAPSHOT  % Increase    CLOG CHANGES + SAVE SNAPSHOT  % Increase
64       29247.658034  30855.728835        5.4981181711  29254.532186        0.0235032562  32691.832776                  11.7758992463
88       31214.305391  33313.393095        6.7247618606  32109.248609        2.8670931702  35433.655203                  13.5173592978
128      30896.673949  34015.362152       10.0939285832  ***                 ***           34947.296355                  13.110221549
256      27183.780921  31192.895437       14.7481857938  ***                 ***           32873.782735                  20.9316056164
With clog changes, perf of caching the snapshot patch improves.
This data looks promising to me and indicates that saving the snapshot has
benefits and we can see noticeable performance improvement especially once
the CLog contention gets reduced. I wonder if we should try these tests
with unlogged tables, because I suspect most of the contention after
CLogControlLock and ProcArrayLock is for WAL related locks, so you might be
able to see better gain with these patches. Do you think it makes sense to
check the performance by increasing CLOG buffers (patch for same is posted
in the Speed up Clog thread [1]) as that also relieves contention on CLOG as
per the tests I have done?
[1]: /messages/by-id/CAA4eK1LMMGNQ439BUm0LcS3p0sb8S9kc-cUGU_ThNqMwA8_Tug@mail.gmail.com
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Thanks Amit,
I did a quick pgbench write test for unlogged tables at 88 clients, as it
had the peak performance in the previous test. There is a big jump in TPS
due to the clog changes.
clients  BASE          ONLY CLOG CHANGES  % Increase     ONLY SAVE SNAPSHOT  % Increase    CLOG CHANGES + SAVE SNAPSHOT  % Increase
88       36055.425005  52796.618434       46.4318294034  37728.004118        4.6389111008  56025.454917                  55.3870323515

clients  + WITH INCREASE IN CLOG BUFFER  % Increase
88       58217.924138                    61.4678626862
I will continue to benchmark above tests with much wider range of clients.
On Thu, Mar 10, 2016 at 1:36 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Mar 10, 2016 at 1:04 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
What if you apply both this and Amit's clog control log patch(es)? Maybe
the combination of the two helps substantially more than either one alone.
I did the above tests along with Amit's clog patch. Machine: 8 socket, 64
core, 128 hyperthread.

clients  BASE          ONLY CLOG CHANGES  % Increase     ONLY SAVE SNAPSHOT  % Increase    CLOG CHANGES + SAVE SNAPSHOT  % Increase
64       29247.658034  30855.728835        5.4981181711  29254.532186        0.0235032562  32691.832776                  11.7758992463
88       31214.305391  33313.393095        6.7247618606  32109.248609        2.8670931702  35433.655203                  13.5173592978
128      30896.673949  34015.362152       10.0939285832  ***                 ***           34947.296355                  13.110221549
256      27183.780921  31192.895437       14.7481857938  ***                 ***           32873.782735                  20.9316056164
With clog changes, perf of caching the snapshot patch improves.
This data looks promising to me and indicates that saving the snapshot has
benefits and we can see noticeable performance improvement especially once
the CLog contention gets reduced. I wonder if we should try these tests
with unlogged tables, because I suspect most of the contention after
CLogControlLock and ProcArrayLock is for WAL related locks, so you might be
able to see better gain with these patches. Do you think it makes sense to
check the performance by increasing CLOG buffers (patch for same is posted
in the Speed up Clog thread [1]) as that also relieves contention on CLOG as
per the tests I have done?
[1] - /messages/by-id/CAA4eK1LMMGNQ439BUm0LcS3p0sb8S9kc-cUGU_ThNqMwA8_Tug@mail.gmail.com
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
I will continue to benchmark above tests with much wider range of clients.
Latest Benchmarking shows following results for unlogged tables.
clients  BASE          ONLY CLOG CHANGES  % Increase     CLOG CHANGES + SAVE SNAPSHOT  % Increase
1         1198.326337   1328.069656       10.8270439357   1234.078342                   2.9834948875
32       37455.181727  38295.250519        2.2428640131  41023.126293                   9.5259037641
64       48838.016451  50675.845885        3.7631123611  51662.814319                   5.7840143259
88       36878.187766  53173.577363       44.1870671639  56025.454917                  51.9203038731
128      35901.537773  52026.024098       44.9130798434  53864.486733                  50.0339263281
256      28130.354402  46793.134156       66.3439197647  46817.04602                   66.4289235427
The performance diff at 1 client seems to be just run-to-run variance.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
On Wed, Mar 16, 2016 at 4:40 AM, Mithun Cy <mithun.cy@enterprisedb.com>
wrote:
On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
I will continue to benchmark above tests with much wider range of clients.
Latest Benchmarking shows following results for unlogged tables.

clients  BASE          ONLY CLOG CHANGES  % Increase     CLOG CHANGES + SAVE SNAPSHOT  % Increase
1         1198.326337   1328.069656       10.8270439357   1234.078342                   2.9834948875
32       37455.181727  38295.250519        2.2428640131  41023.126293                   9.5259037641
64       48838.016451  50675.845885        3.7631123611  51662.814319                   5.7840143259
88       36878.187766  53173.577363       44.1870671639  56025.454917                  51.9203038731
128      35901.537773  52026.024098       44.9130798434  53864.486733                  50.0339263281
256      28130.354402  46793.134156       66.3439197647  46817.04602                   66.4289235427
Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2016-03-16 13:29:22 -0400, Robert Haas wrote:
Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.
What's the specifics of the machine tested? I wonder if it either
correlates with the number of hardware threads, real cores, or cache
sizes.
- Andres
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Mar 16, 2016 at 1:33 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-03-16 13:29:22 -0400, Robert Haas wrote:
Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.
What's the specifics of the machine tested? I wonder if it either
correlates with the number of hardware threads, real cores, or cache
sizes.
I think this was done on cthulhu, whose /proc/cpuinfo output ends this way:
processor : 127
vendor_id : GenuineIntel
cpu family : 6
model : 47
model name : Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
stepping : 2
microcode : 0x37
cpu MHz : 2129.000
cache size : 24576 KB
physical id : 3
siblings : 16
core id : 25
cpu cores : 8
apicid : 243
initial apicid : 243
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64
monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic popcnt aes lahf_lm ida arat epb dtherm tpr_shadow vnmi
flexpriority ept vpid
bogomips : 4266.62
clflush size : 64
cache_alignment : 64
address sizes : 44 bits physical, 48 bits virtual
power management:
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Mar 16, 2016 at 10:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 16, 2016 at 4:40 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
On Thu, Mar 3, 2016 at 6:20 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
I will continue to benchmark above tests with much wider range of clients.
Latest Benchmarking shows following results for unlogged tables.

clients  BASE          ONLY CLOG CHANGES  % Increase     CLOG CHANGES + SAVE SNAPSHOT  % Increase
1         1198.326337   1328.069656       10.8270439357   1234.078342                   2.9834948875
32       37455.181727  38295.250519        2.2428640131  41023.126293                   9.5259037641
64       48838.016451  50675.845885        3.7631123611  51662.814319                   5.7840143259
88       36878.187766  53173.577363       44.1870671639  56025.454917                  51.9203038731
128      35901.537773  52026.024098       44.9130798434  53864.486733                  50.0339263281
256      28130.354402  46793.134156       66.3439197647  46817.04602                   66.4289235427
Whoa. At 64 clients, we're hardly getting any benefit, but then by 88
clients, we're getting a huge benefit. I wonder why there's that sharp
change there.
If you see, for the Base readings, there is a performance increase up to
64 clients and then a fall at 88 clients, which to me indicates that it
hits very high contention around CLogControlLock at 88 clients, which the
CLog patch is able to control to a great degree (if required, I think the
same can be verified by LWLock stats data). One reason for hitting
contention at 88 clients is that this machine seems to have 64 cores (it
has 8 sockets and 8 cores per socket), as per the lscpu output below.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 8
NUMA node(s): 8
Vendor ID: GenuineIntel
CPU family: 6
Model: 47
Model name: Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
Stepping: 2
CPU MHz: 1064.000
BogoMIPS: 4266.62
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 24576K
NUMA node0 CPU(s): 0,65-71,96-103
NUMA node1 CPU(s): 72-79,104-111
NUMA node2 CPU(s): 80-87,112-119
NUMA node3 CPU(s): 88-95,120-127
NUMA node4 CPU(s): 1-8,33-40
NUMA node5 CPU(s): 9-16,41-48
NUMA node6 CPU(s): 17-24,49-56
NUMA node7 CPU(s): 25-32,57-64
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 17, 2016 at 9:00 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
If you see, for the Base readings, there is a performance increase up till
64 clients and then there is a fall at 88 clients, which to me indicates
that it hits very high-contention around CLogControlLock at 88 clients
which CLog patch is able to control to a great degree (if required, I
think the same can be verified by LWLock stats data). One reason for
hitting contention at 88 clients is that this machine seems to have
64-cores (it has 8 sockets and 8 Core(s) per socket) as per below
information of lscpu command.
I am attaching LWLock stats data for the above test setups (unlogged tables).

                              At 64 clients        At 88 clients
                              (block-time units)   (block-time units)
*BASE*
ProcArrayLock                 182946               117827
ClogControlLock               107420               120266
*clog patch*
ProcArrayLock                 183663               121215
ClogControlLock                72806                65220
*clog patch + save snapshot*
ProcArrayLock                 128260                83356
ClogControlLock                78921                74011

This is for unlogged tables. I mainly see that ProcArrayLock has higher
contention at 64 clients, and at 88 clients the contention is slightly
moved to other entities.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachments:
lwlock details.odt (application/vnd.oasis.opendocument.text)