LWLock optimization for multicore Power machines
Hi everybody!
During the FOSDEM/PGDay 2017 developer meeting I said that I have a special
assembly optimization for multicore Power machines. From the answers of
other hackers I realized the following.
1. There are some big Power machines running PostgreSQL in production. Not
as many as on Intel, but a fair number.
2. The community could be interested in a special assembly optimization for
Power machines despite the cost of maintaining it.
Power processors use a specific implementation of atomic operations, which
is a kind of optimistic locking. The 'lwarx' instruction reserves a memory
location, but that reservation can be broken before the matching 'stwcx.',
and then we have to retry. So, for instance, a CAS operation on a Power
processor is a loop, and a loop of CAS operations is a two-level nested
loop. Benchmarks showed that this becomes a real problem for
LWLockAttemptLock(). However, one can actually put arbitrary logic between
'lwarx' and 'stwcx.' and make it a single loop. The downside is that this
logic has to be implemented in assembly. See [1] for experiment details.
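For illustration, here is a minimal sketch (using GCC's __atomic builtins,
not the actual PostgreSQL code) of such a CAS loop; on Power the builtin
CAS itself compiles to an inner lwarx/stwcx. retry loop, so the while loop
below becomes the outer level of a two-level nested loop:

#include <stdbool.h>
#include <stdint.h>

/* Add 'add' to *ptr only if the bits in 'mask' are clear; return old value. */
static uint32_t
cas_loop_add_if_clear(volatile uint32_t *ptr, uint32_t mask, uint32_t add)
{
	uint32_t	old = *ptr;

	while (true)				/* outer loop: the algorithm's retry */
	{
		uint32_t	desired = ((old & mask) == 0) ? old + add : old;

		/* on Power, the inner lwarx/stwcx. retry loop hides in here */
		if (__atomic_compare_exchange_n(ptr, &old, desired, false,
										__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
			return old;
	}
}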
Results in [1] contain a lot of junk which isn't relevant anymore. This is
why I drew a separate graph.
power8-lwlock-asm-ro.png – results of a read-only pgbench test on an IBM
E880 which has 32 physical cores and 256 virtual threads via SMT. The
curves have the following meaning.
* 9.5: unpatched PostgreSQL 9.5
* pinunpin-cas: PostgreSQL 9.5 + earlier version of 48354581
* pinunpin-lwlock-asm: PostgreSQL 9.5 + earlier version of 48354581 +
LWLock implementation in assembly.
lwlock-power-1.patch – the patch with the assembly implementation of LWLock
which I used at that time, rebased onto current master.
Using assembly in lwlock.c looks rough. This is why I refactored it by
introducing a new atomic operation, pg_atomic_fetch_mask_add_u32 (see
lwlock-power-2.patch). It checks that all masked bits are clear and then
adds to the variable. This atomic has a special assembly implementation for
Power, and a generic implementation for other platforms using a loop of
CAS. We will probably have implementations for other architectures in the
future. This level of abstraction is the best I managed to invent.
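To make the contract concrete, here is a sketch of how both lock modes
reduce to a single call of the new primitive (it mirrors the
LWLockAttemptLock() change in lwlock-power-2.patch below; the prototype and
the LW_* constants are the ones from the patch, while try_acquire is a
hypothetical condensation that, unlike LWLockAttemptLock, returns true when
the lock was acquired):

extern uint32 pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
										   uint32 mask_, uint32 add_);

static bool
try_acquire(LWLock *lock, LWLockMode mode)
{
	/* exclusive: all lock bits must be clear; shared: only the exclusive bit */
	uint32		mask = (mode == LW_EXCLUSIVE) ? LW_LOCK_MASK : LW_VAL_EXCLUSIVE;
	uint32		add = (mode == LW_EXCLUSIVE) ? LW_VAL_EXCLUSIVE : LW_VAL_SHARED;

	/* one atomic checks the mask and adds; the old value tells the outcome */
	return (pg_atomic_fetch_mask_add_u32(&lock->state, mask, add) & mask) == 0;
}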
Unfortunately, I have no big enough Power machine at hand to reproduce those
results. Actually, I have no Power machine at hand at all. So,
lwlock-power-2.patch was written "blindly". I would very much appreciate it
if someone could help me with testing and benchmarking.
1. /messages/by-id/CAPpHfdsogj38HTDhNMLE56uJy9N8-=gYa2nNuWbPujGp2n1ffQ@mail.gmail.com
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
lwlock-power-1.patch (application/octet-stream)
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
new file mode 100644
index c196bb8..548518c
*** a/src/backend/storage/lmgr/lwlock.c
--- b/src/backend/storage/lmgr/lwlock.c
*************** GetLWLockIdentifier(uint32 classId, uint
*** 715,720 ****
--- 715,770 ----
return LWLockTrancheArray[eventId];
}
+ #if (defined(__GNUC__) || defined(__INTEL_COMPILER)) && (defined(__ppc__) || defined(__powerpc__) || defined(__ppc64__) || defined(__powerpc64__))
+
+ /*
+ * Special optimization for PowerPC processors: put the logic dealing with
+ * the LWLock state between the lwarx/stwcx. operations.
+ */
+ static bool
+ LWLockAttemptLock(LWLock *lock, LWLockMode mode)
+ {
+ uint32 mask, increment;
+ bool result;
+
+ AssertArg(mode == LW_EXCLUSIVE || mode == LW_SHARED);
+
+ if (mode == LW_EXCLUSIVE)
+ {
+ mask = LW_LOCK_MASK;
+ increment = LW_VAL_EXCLUSIVE;
+ }
+ else
+ {
+ mask = LW_VAL_EXCLUSIVE;
+ increment = LW_VAL_SHARED;
+ }
+
+ __asm__ __volatile__(
+ "0: lwarx 3,0,%4 \n"
+ " and 4,3,%2 \n"
+ " cmpwi 4,0 \n"
+ " bne- 1f \n"
+ " add 3,3,%3 \n"
+ " stwcx. 3,0,%4 \n"
+ " bne- 0b \n"
+ " li %0,0 \n"
+ " b 2f \n"
+ "1: li %0,1 \n"
+ #ifdef USE_PPC_LWSYNC
+ "2: lwsync \n"
+ #else
+ "2: isync \n"
+ #endif
+ : "=&r"(result), "+m"(lock->state)
+ : "r"(mask), "r"(increment), "r"(&lock->state)
+ : "memory", "cc", "r3", "r4");
+
+ return result;
+ }
+
+ #else
+
/*
* Internal function that tries to atomically acquire the lwlock in the passed
* in mode.
*************** LWLockAttemptLock(LWLock *lock, LWLockMo
*** 787,792 ****
--- 837,844 ----
pg_unreachable();
}
+ #endif
+
/*
* Lock the LWLock's wait list against concurrent activity.
*
lwlock-power-2.patchapplication/octet-stream; name=lwlock-power-2.patchDownload
diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
new file mode 100644
index 86b5308..533f252
*** a/src/backend/port/atomics.c
--- b/src/backend/port/atomics.c
*************** pg_atomic_fetch_add_u32_impl(volatile pg
*** 158,160 ****
--- 158,243 ----
}
#endif /* PG_HAVE_ATOMIC_U32_SIMULATION */
+
+ #if (defined(__GNUC__) || defined(__INTEL_COMPILER)) && (defined(__ppc__) || defined(__powerpc__) || defined(__ppc64__) || defined(__powerpc64__))
+
+ /*
+ * Optimized implementation for Power processors. Atomic operations on Power
+ * processors are implemented using optimistic locking. The 'lwarx'
+ * instruction reserves a memory location, but that reservation can be broken
+ * by a conflicting store before the matching 'stwcx.', and then we have to
+ * retry. Thus, each CAS operation is a loop, and a loop of CAS operations is
+ * a two-level nested loop. Experiments on multicore Power machines show that
+ * we can get a huge benefit from doing this operation in a single loop in
+ * assembly.
+ */
+ uint32
+ pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
+ uint32 mask, uint32 increment)
+ {
+ uint32 result;
+
+ __asm__ __volatile__(
+ "0: lwarx %0,0,%4 \n" /* read *ptr and reserve index */
+ " and 3,%0,%2 \n" /* calculate '*ptr & mask" */
+ " cmpwi 3,0 \n" /* compare '*ptr & mark' vs 0 */
+ " bne- 1f \n" /* exit on '*ptr & mark != 0' */
+ " add 3,%0,%3 \n" /* calculate '*ptr + increment' */
+ " stwcx. 3,0,%4 \n" /* try to store '*ptr + increment' into *ptr */
+ " bne- 0b \n" /* retry if index reservation is broken */
+ #ifdef USE_PPC_LWSYNC
+ "1: lwsync \n"
+ #else
+ "1: isync \n"
+ #endif
+ : "=&r"(result), "+m"(*ptr)
+ : "r"(mask), "r"(increment), "r"(ptr)
+ : "memory", "cc", "r3");
+ return result;
+ }
+
+ #else
+
+ /*
+ * Generic implementation via loop of compare & exchange.
+ */
+ uint32
+ pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
+ uint32 mask_, uint32 add_)
+ {
+ uint32 old_value;
+
+ /*
+ * Read once outside the loop, later iterations will get the newer value
+ * via compare & exchange.
+ */
+ old_value = pg_atomic_read_u32(ptr);
+
+ /* loop until we've determined whether we could make an increment or not */
+ while (true)
+ {
+ uint32 desired_value;
+ bool free;
+
+ desired_value = old_value;
+ free = (old_value & mask_) == 0;
+ if (free)
+ desired_value += add_;
+
+ /*
+ * Attempt to swap in the value we are expecting. If we didn't see
+ * masked bits to be clear, that's just the old value. If we saw them
+ * as clear, we'll attempt to make an increment. The reason that we
+ * always swap in the value is that this doubles as a memory barrier.
+ * We could try to be smarter and only swap in values if we saw the
+ * masked bits as clear, but benchmarks haven't shown it as beneficial
+ * so far.
+ *
+ * Retry if the value changed since we last looked at it.
+ */
+ if (pg_atomic_compare_exchange_u32(ptr, &old_value, desired_value))
+ return old_value;
+ }
+ pg_unreachable();
+ }
+
+ #endif
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
new file mode 100644
index c196bb8..ec3bbc3
*** a/src/backend/storage/lmgr/lwlock.c
--- b/src/backend/storage/lmgr/lwlock.c
*************** GetLWLockIdentifier(uint32 classId, uint
*** 727,790 ****
static bool
LWLockAttemptLock(LWLock *lock, LWLockMode mode)
{
! uint32 old_state;
AssertArg(mode == LW_EXCLUSIVE || mode == LW_SHARED);
/*
! * Read once outside the loop, later iterations will get the newer value
! * via compare & exchange.
*/
! old_state = pg_atomic_read_u32(&lock->state);
! /* loop until we've determined whether we could acquire the lock or not */
! while (true)
{
! uint32 desired_state;
! bool lock_free;
!
! desired_state = old_state;
!
! if (mode == LW_EXCLUSIVE)
! {
! lock_free = (old_state & LW_LOCK_MASK) == 0;
! if (lock_free)
! desired_state += LW_VAL_EXCLUSIVE;
! }
! else
! {
! lock_free = (old_state & LW_VAL_EXCLUSIVE) == 0;
! if (lock_free)
! desired_state += LW_VAL_SHARED;
! }
!
! /*
! * Attempt to swap in the state we are expecting. If we didn't see
! * lock to be free, that's just the old value. If we saw it as free,
! * we'll attempt to mark it acquired. The reason that we always swap
! * in the value is that this doubles as a memory barrier. We could try
! * to be smarter and only swap in values if we saw the lock as free,
! * but benchmark haven't shown it as beneficial so far.
! *
! * Retry if the value changed since we last looked at it.
! */
! if (pg_atomic_compare_exchange_u32(&lock->state,
! &old_state, desired_state))
! {
! if (lock_free)
! {
! /* Great! Got the lock. */
#ifdef LOCK_DEBUG
! if (mode == LW_EXCLUSIVE)
! lock->owner = MyProc;
#endif
! return false;
! }
! else
! return true; /* somebody else has the lock */
! }
}
- pg_unreachable();
}
/*
--- 727,772 ----
static bool
LWLockAttemptLock(LWLock *lock, LWLockMode mode)
{
! uint32 old_state,
! mask,
! increment;
AssertArg(mode == LW_EXCLUSIVE || mode == LW_SHARED);
+ if (mode == LW_EXCLUSIVE)
+ {
+ mask = LW_LOCK_MASK;
+ increment = LW_VAL_EXCLUSIVE;
+ }
+ else
+ {
+ mask = LW_VAL_EXCLUSIVE;
+ increment = LW_VAL_SHARED;
+ }
+
/*
! * Use the 'check mask then add' atomic, which actually does all the
! * useful work for us.
*/
! old_state = pg_atomic_fetch_mask_add_u32(&lock->state, mask, increment);
! /*
! * If the state was free according to the mask, we assume that the
! * operation was successful.
! */
! if ((old_state & mask) == 0)
{
! /* Great! Got the lock. */
#ifdef LOCK_DEBUG
! if (mode == LW_EXCLUSIVE)
! lock->owner = MyProc;
#endif
! return false;
! }
! else
! {
! return true; /* somebody else has the lock */
}
}
/*
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
new file mode 100644
index 2e2ec27..4ec0219
*** a/src/include/port/atomics.h
--- b/src/include/port/atomics.h
*************** pg_atomic_sub_fetch_u32(volatile pg_atom
*** 415,420 ****
--- 415,433 ----
return pg_atomic_sub_fetch_u32_impl(ptr, sub_);
}
+ /*
+ * pg_atomic_fetch_mask_add_u32 - atomically check that the masked bits in
+ * the variable are clear, and if so add to the variable.
+ *
+ * Returns the value of ptr before the atomic operation.
+ *
+ * Full barrier semantics.
+ */
+ extern uint32
+ pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
+ uint32 mask_, uint32 add_);
+
+
/* ----
* The 64 bit operations have the same semantics as their 32bit counterparts
* if they are available. Check the corresponding 32bit function for
On Fri, Feb 3, 2017 at 8:01 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
Unfortunately, I have no big enough Power machine at hand to reproduce
those results. Actually, I have no Power machine at hand at all. So,
lwlock-power-2.patch was written "blindly". I would very much appreciate it
if someone could help me with testing and benchmarking.
UPD: It appears that Postgres Pro has access to a big Power machine now.
So, I can do the testing/benchmarking myself.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, 2017-02-03 at 20:11 +0300, Alexander Korotkov wrote:
On Fri, Feb 3, 2017 at 8:01 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
Unfortunately, I have no big enough Power machine at hand to reproduce
those results. Actually, I have no Power machine at hand at all. So,
lwlock-power-2.patch was written "blindly". I would very much appreciate it
if someone could help me with testing and benchmarking.
UPD: It appears that Postgres Pro has access to a big Power machine now.
So, I can do the testing/benchmarking myself.
Hi Alexander,
We currently also have access to an LPAR on an E850 machine with 4
POWER8 sockets, running Ubuntu 16.04 LTS Server ppc64el. I can do
some tests next week, if you need to verify your findings.
Thanks,
Bernd
On Fri, Feb 3, 2017 at 12:01 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
Hi everybody!
During the FOSDEM/PGDay 2017 developer meeting I said that I have a special
assembly optimization for multicore Power machines. From the answers of
other hackers I realized the following.
1. There are some big Power machines running PostgreSQL in production. Not
as many as on Intel, but a fair number.
2. The community could be interested in a special assembly optimization for
Power machines despite the cost of maintaining it.
Power processors use a specific implementation of atomic operations, which
is a kind of optimistic locking. The 'lwarx' instruction reserves a memory
location, but that reservation can be broken before the matching 'stwcx.',
and then we have to retry. So, for instance, a CAS operation on a Power
processor is a loop, and a loop of CAS operations is a two-level nested
loop. Benchmarks showed that this becomes a real problem for
LWLockAttemptLock(). However, one can actually put arbitrary logic between
'lwarx' and 'stwcx.' and make it a single loop. The downside is that this
logic has to be implemented in assembly. See [1] for experiment details.
Results in [1] contain a lot of junk which isn't relevant anymore. This is
why I drew a separate graph.
power8-lwlock-asm-ro.png – results of a read-only pgbench test on an IBM
E880 which has 32 physical cores and 256 virtual threads via SMT. The
curves have the following meaning.
* 9.5: unpatched PostgreSQL 9.5
* pinunpin-cas: PostgreSQL 9.5 + earlier version of 48354581
* pinunpin-lwlock-asm: PostgreSQL 9.5 + earlier version of 48354581 +
LWLock implementation in assembly.
Cool work. Obviously there's some work to do before we can merge this
-- vetting the abstraction, performance testing -- but it seems pretty
neat.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Feb 3, 2017 at 11:31 PM, Bernd Helmle <mailings@oopsware.de> wrote:
UPD: It appears that Postgres Pro has access to a big Power machine now.
So, I can do the testing/benchmarking myself.
We currently also have access to an LPAR on an E850 machine with 4
POWER8 sockets, running Ubuntu 16.04 LTS Server ppc64el. I can do
some tests next week, if you need to verify your findings.
Very good, thank you!
I tried lwlock-power-2.patch on the multicore Power machine we have in
Postgres Pro.
I realized that using labels in assembly isn't safe. Thus, I removed the
labels and used relative jumps instead (lwlock-power-3.patch).
Unfortunately, I didn't manage to run any reasonable benchmarks. This
machine runs AIX, and there are a lot of problems which prevent PostgreSQL
from showing high TPS. Installing Linux there is not an option either,
because that machine is used for attempts to make Postgres work properly on
AIX. So, benchmarking help is very relevant. I would very much appreciate it.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
lwlock-power-3.patch (application/octet-stream)
diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
new file mode 100644
index 86b5308..55a9910
*** a/src/backend/port/atomics.c
--- b/src/backend/port/atomics.c
*************** pg_atomic_fetch_add_u32_impl(volatile pg
*** 158,160 ****
--- 158,243 ----
}
#endif /* PG_HAVE_ATOMIC_U32_SIMULATION */
+
+ #if (defined(__GNUC__) || defined(__INTEL_COMPILER)) && (defined(__ppc__) || defined(__powerpc__) || defined(__ppc64__) || defined(__powerpc64__))
+
+ /*
+ * Optimized implementation for Power processors. Atomic operations on Power
+ * processors are implemented using optimistic locking. The 'lwarx'
+ * instruction reserves a memory location, but that reservation can be broken
+ * by a conflicting store before the matching 'stwcx.', and then we have to
+ * retry. Thus, each CAS operation is a loop, and a loop of CAS operations is
+ * a two-level nested loop. Experiments on multicore Power machines show that
+ * we can get a huge benefit from doing this operation in a single loop in
+ * assembly.
+ */
+ uint32
+ pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
+ uint32 mask, uint32 increment)
+ {
+ uint32 result;
+
+ __asm__ __volatile__(
+ " lwarx %0,0,%4 \n" /* read *ptr and reserve index */
+ " and 3,%0,%2 \n" /* calculate '*ptr & mask" */
+ " cmpwi 3,0 \n" /* compare '*ptr & mark' vs 0 */
+ " bne- $+16 \n" /* exit on '*ptr & mark != 0' */
+ " add 3,%0,%3 \n" /* calculate '*ptr + increment' */
+ " stwcx. 3,0,%4 \n" /* try to store '*ptr + increment' into *ptr */
+ " bne- $-24 \n" /* retry if index reservation is broken */
+ #ifdef USE_PPC_LWSYNC
+ " lwsync \n"
+ #else
+ " isync \n"
+ #endif
+ : "=&r"(result), "+m"(*ptr)
+ : "r"(mask), "r"(increment), "r"(ptr)
+ : "memory", "cc", "r3");
+ return result;
+ }
+
+ #else
+
+ /*
+ * Generic implementation via loop of compare & exchange.
+ */
+ uint32
+ pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
+ uint32 mask_, uint32 add_)
+ {
+ uint32 old_value;
+
+ /*
+ * Read once outside the loop, later iterations will get the newer value
+ * via compare & exchange.
+ */
+ old_value = pg_atomic_read_u32(ptr);
+
+ /* loop until we've determined whether we could make an increment or not */
+ while (true)
+ {
+ uint32 desired_value;
+ bool free;
+
+ desired_value = old_value;
+ free = (old_value & mask_) == 0;
+ if (free)
+ desired_value += add_;
+
+ /*
+ * Attempt to swap in the value we are expecting. If we didn't see
+ * masked bits to be clear, that's just the old value. If we saw them
+ * as clear, we'll attempt to make an increment. The reason that we
+ * always swap in the value is that this doubles as a memory barrier.
+ * We could try to be smarter and only swap in values if we saw the
+ * masked bits as clear, but benchmarks haven't shown it as beneficial
+ * so far.
+ *
+ * Retry if the value changed since we last looked at it.
+ */
+ if (pg_atomic_compare_exchange_u32(ptr, &old_value, desired_value))
+ return old_value;
+ }
+ pg_unreachable();
+ }
+
+ #endif
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
new file mode 100644
index ab81d94..766e3de
*** a/src/backend/storage/lmgr/lwlock.c
--- b/src/backend/storage/lmgr/lwlock.c
*************** GetLWLockIdentifier(uint32 classId, uint
*** 727,790 ****
static bool
LWLockAttemptLock(LWLock *lock, LWLockMode mode)
{
! uint32 old_state;
AssertArg(mode == LW_EXCLUSIVE || mode == LW_SHARED);
/*
! * Read once outside the loop, later iterations will get the newer value
! * via compare & exchange.
*/
! old_state = pg_atomic_read_u32(&lock->state);
! /* loop until we've determined whether we could acquire the lock or not */
! while (true)
{
! uint32 desired_state;
! bool lock_free;
!
! desired_state = old_state;
!
! if (mode == LW_EXCLUSIVE)
! {
! lock_free = (old_state & LW_LOCK_MASK) == 0;
! if (lock_free)
! desired_state += LW_VAL_EXCLUSIVE;
! }
! else
! {
! lock_free = (old_state & LW_VAL_EXCLUSIVE) == 0;
! if (lock_free)
! desired_state += LW_VAL_SHARED;
! }
!
! /*
! * Attempt to swap in the state we are expecting. If we didn't see
! * lock to be free, that's just the old value. If we saw it as free,
! * we'll attempt to mark it acquired. The reason that we always swap
! * in the value is that this doubles as a memory barrier. We could try
! * to be smarter and only swap in values if we saw the lock as free,
! * but benchmark haven't shown it as beneficial so far.
! *
! * Retry if the value changed since we last looked at it.
! */
! if (pg_atomic_compare_exchange_u32(&lock->state,
! &old_state, desired_state))
! {
! if (lock_free)
! {
! /* Great! Got the lock. */
#ifdef LOCK_DEBUG
! if (mode == LW_EXCLUSIVE)
! lock->owner = MyProc;
#endif
! return false;
! }
! else
! return true; /* somebody else has the lock */
! }
}
- pg_unreachable();
}
/*
--- 727,772 ----
static bool
LWLockAttemptLock(LWLock *lock, LWLockMode mode)
{
! uint32 old_state,
! mask,
! increment;
AssertArg(mode == LW_EXCLUSIVE || mode == LW_SHARED);
+ if (mode == LW_EXCLUSIVE)
+ {
+ mask = LW_LOCK_MASK;
+ increment = LW_VAL_EXCLUSIVE;
+ }
+ else
+ {
+ mask = LW_VAL_EXCLUSIVE;
+ increment = LW_VAL_SHARED;
+ }
+
/*
! * Use the 'check mask then add' atomic, which actually does all the
! * useful work for us.
*/
! old_state = pg_atomic_fetch_mask_add_u32(&lock->state, mask, increment);
! /*
! * If the state was free according to the mask, we assume that the
! * operation was successful.
! */
! if ((old_state & mask) == 0)
{
! /* Great! Got the lock. */
#ifdef LOCK_DEBUG
! if (mode == LW_EXCLUSIVE)
! lock->owner = MyProc;
#endif
! return false;
! }
! else
! {
! return true; /* somebody else has the lock */
}
}
/*
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
new file mode 100644
index 2e2ec27..4ec0219
*** a/src/include/port/atomics.h
--- b/src/include/port/atomics.h
*************** pg_atomic_sub_fetch_u32(volatile pg_atom
*** 415,420 ****
--- 415,433 ----
return pg_atomic_sub_fetch_u32_impl(ptr, sub_);
}
+ /*
+ * pg_atomic_fetch_mask_add_u32 - atomically check that the masked bits in
+ * the variable are clear, and if so add to the variable.
+ *
+ * Returns the value of ptr before the atomic operation.
+ *
+ * Full barrier semantics.
+ */
+ extern uint32
+ pg_atomic_fetch_mask_add_u32(volatile pg_atomic_uint32 *ptr,
+ uint32 mask_, uint32 add_);
+
+
/* ----
* The 64 bit operations have the same semantics as their 32bit counterparts
* if they are available. Check the corresponding 32bit function for
On Mon, 06.02.2017 at 16:45 +0300, Alexander Korotkov wrote:
I tried lwlock-power-2.patch on the multicore Power machine we have in
Postgres Pro.
I realized that using labels in assembly isn't safe. Thus, I removed the
labels and used relative jumps instead (lwlock-power-3.patch).
Unfortunately, I didn't manage to run any reasonable benchmarks. This
machine runs AIX, and there are a lot of problems which prevent PostgreSQL
from showing high TPS. Installing Linux there is not an option either,
because that machine is used for attempts to make Postgres work properly on
AIX. So, benchmarking help is very relevant. I would very much appreciate it.
Okay, so here are some results. The bench runs against
current PostgreSQL master, with 24 GByte shared_buffers configured (128
GByte physical RAM), max_wal_size=8GB and effective_cache_size=100GB.
I've just discovered that max_connections was accidentally set to 601;
normally I'd have set it to something near 110 or so...
<master afcb0c97efc58459bcbbe795f42d8b7be414e076>
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 16910687
latency average = 0.177 ms
tps = 563654.968585 (including connections establishing)
tps = 563991.459659 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 16523247
latency average = 0.182 ms
tps = 550744.748084 (including connections establishing)
tps = 552069.267389 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 16796056
latency average = 0.179 ms
tps = 559830.986738 (including connections establishing)
tps = 560333.682010 (excluding connections establishing)
<lw-lock-power-1.patch applied>
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 14563500
latency average = 0.206 ms
tps = 485420.764515 (including connections establishing)
tps = 485720.606371 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 14618457
latency average = 0.205 ms
tps = 487246.817758 (including connections establishing)
tps = 488117.718816 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 14522462
latency average = 0.207 ms
tps = 484052.194063 (including connections establishing)
tps = 485434.771590 (excluding connections establishing)
<lw-lock-power-3.patch applied>
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 17946058
latency average = 0.167 ms
tps = 598164.841490 (including connections establishing)
tps = 598582.503248 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 17719648
latency average = 0.169 ms
tps = 590621.671588 (including connections establishing)
tps = 591093.333153 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 30 s
number of transactions actually processed: 17722941
latency average = 0.169 ms
tps = 590728.715465 (including connections establishing)
tps = 591619.817043 (excluding connections establishing)
On Mon, Feb 6, 2017 at 8:28 PM, Bernd Helmle <mailings@oopsware.de> wrote:
On Mon, 06.02.2017 at 16:45 +0300, Alexander Korotkov wrote:
I tried lwlock-power-2.patch on the multicore Power machine we have in
Postgres Pro.
I realized that using labels in assembly isn't safe. Thus, I removed the
labels and used relative jumps instead (lwlock-power-3.patch).
Unfortunately, I didn't manage to run any reasonable benchmarks. This
machine runs AIX, and there are a lot of problems which prevent PostgreSQL
from showing high TPS. Installing Linux there is not an option either,
because that machine is used for attempts to make Postgres work properly on
AIX. So, benchmarking help is very relevant. I would very much appreciate it.
Okay, so here are some results. The bench runs against
current PostgreSQL master, with 24 GByte shared_buffers configured (128
GByte physical RAM), max_wal_size=8GB and effective_cache_size=100GB.
Thank you very much for testing!
The results look strange to me. I wonder why there is a difference between
lwlock-power-1.patch and lwlock-power-3.patch? Intuitively, there shouldn't
be one, because there is not much difference between them. Thus, I have the
following questions.
1. Have you warmed up the database? I.e., could you do "SELECT sum(x.x) FROM
(SELECT pg_prewarm(oid) AS x FROM pg_class WHERE relkind IN ('i', 'r')
ORDER BY oid) x;" before each run?
2. Also, could you run each test longer, 3-5 minutes, and run with a
variety of client counts?
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,
On 2017-02-03 20:01:03 +0300, Alexander Korotkov wrote:
Using assembly in lwlock.c looks rough. This is why I refactored it by
introducing a new atomic operation, pg_atomic_fetch_mask_add_u32 (see
lwlock-power-2.patch). It checks that all masked bits are clear and then
adds to the variable. This atomic has a special assembly implementation for
Power, and a generic implementation for other platforms using a loop of
CAS. We will probably have implementations for other architectures in the
future. This level of abstraction is the best I managed to invent.
I think that's a reasonable approach. And I think it might be worth
experimenting with a more efficient implementation on x86 too, using
hardware lock elision (HLE) and/or TSX.
Andres
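For reference, a rough sketch of what an HLE-elided lock acquisition could
look like on x86 with GCC (compile with -mhle); this is purely illustrative,
not a posted patch:

/* HLE sketch: the CPU first tries to run the critical section
 * transactionally, falling back to the real lock on abort. */
static volatile int hle_lock;

static inline void
hle_lock_acquire(void)
{
	/* xacquire-hinted exchange starts hardware elision */
	while (__atomic_exchange_n(&hle_lock, 1,
							   __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE))
	{
		/* spin without the elision hint until the lock looks free */
		while (hle_lock)
			__builtin_ia32_pause();
	}
}

static inline void
hle_lock_release(void)
{
	/* matching xrelease-hinted store ends the elided region */
	__atomic_store_n(&hle_lock, 0,
					 __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
}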
On Mon, 2017-02-06 at 22:44 +0300, Alexander Korotkov wrote:
The results look strange to me. I wonder why there is a difference between
lwlock-power-1.patch and lwlock-power-3.patch? Intuitively, there shouldn't
be one, because there is not much difference between them. Thus, I have the
following questions.
Yeah, I've realized that as well.
1. Have you warmed up the database? I.e., could you do "SELECT sum(x.x)
FROM (SELECT pg_prewarm(oid) AS x FROM pg_class WHERE relkind IN ('i', 'r')
ORDER BY oid) x;" before each run?
2. Also, could you run each test longer, 3-5 minutes, and run with a
variety of client counts?
The results I've posted were the last 3 runs of 9 in total. I hoped that
would be enough to prewarm the system. I'm going to repeat the tests with
the changes you've requested, though.
Am Montag, den 06.02.2017, 22:44 +0300 schrieb Alexander Korotkov:
2. Also could you run each test longer: 3-5 mins, and run them
withvariety of clients count?
So here are some other results. I've changed max_connections to 300.
The bench was prewarmed and run 300s each.
I could run more benches, if necessary.
Attachments:
On Tue, Feb 7, 2017 at 3:16 PM, Bernd Helmle <mailings@oopsware.de> wrote:
On Mon, 06.02.2017 at 22:44 +0300, Alexander Korotkov wrote:
2. Also, could you run each test longer, 3-5 minutes, and run with a
variety of client counts?
So here are some other results. I've changed max_connections to 300.
The benches were prewarmed and run for 300s each.
I could run more benches, if necessary.
Thank you very much for the benchmarks!
There is a clear win for both lwlock-power-1.patch and lwlock-power-3.patch
in comparison to master. The difference between lwlock-power-1.patch and
lwlock-power-3.patch seems to be within the margin of error. But the win
isn't as high as I observed earlier. And I wonder why the absolute numbers
are lower than in our earlier experiments. We used an IBM E880, which is
actually two nodes with an interconnect. However, the interconnect is not
fast enough to make one PostgreSQL instance work across both nodes. Thus,
we used half of the IBM E880, which has 4 sockets and 32 physical cores.
You use an IBM E850, which is really a single node with 4 sockets and 48
physical cores. Thus, it seems that you get lower absolute numbers on more
powerful hardware. That makes me uneasy, and I think we probably aren't
getting the best from this hardware. Just in case, do you use SMT=8?
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Tue, 07.02.2017 at 16:48 +0300, Alexander Korotkov wrote:
But the win isn't as high as I observed earlier. And I wonder why the
absolute numbers are lower than in our earlier experiments. We used an
IBM E880, which is actually two
Did you run your tests on bare metal or were they also virtualized?
nodes with an interconnect. However, the interconnect is not fast enough
to make one PostgreSQL instance work across both nodes. Thus, we used half
of the IBM E880, which has 4 sockets and 32 physical cores. You use an IBM
E850, which is really a single node with 4 sockets and 48 physical cores.
Thus, it seems that you get lower absolute numbers on more powerful
hardware. That makes me uneasy, and I think we probably aren't getting the
best from this hardware. Just in case, do you use SMT=8?
Yes, SMT=8 was used.
The machine has 4 sockets with 8 cores each, at 3.7 GHz clock frequency.
The Ubuntu LPAR running on PowerVM isn't using all physical cores;
currently 28 cores are assigned (= 224 SMT threads). The other cores are
dedicated to the PowerVM hypervisor and a (very) small AIX LPAR.
I've done more pgbench runs this morning with SMT-4, too; the fastest
result with master was
SMT-4
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 300 s
number of transactions actually processed: 167306423
latency average = 0.179 ms
latency stddev = 0.072 ms
tps = 557685.144912 (including connections establishing)
tps = 557835.683204 (excluding connections establishing)
compared with SMT-8:
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 300 s
number of transactions actually processed: 173476449
latency average = 0.173 ms
latency stddev = 0.059 ms
tps = 578250.676019 (including connections establishing)
tps = 578412.159601 (excluding connections establishing)
and retried with lwlocks-power-3, SMT-4:
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 300 s
number of transactions actually processed: 185991995
latency average = 0.161 ms
latency stddev = 0.059 ms
tps = 619970.030069 (including connections establishing)
tps = 620112.263770 (excluding connections establishing)
...and SMT-8
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 100
number of threads: 100
duration: 300 s
number of transactions actually processed: 185878717
latency average = 0.161 ms
latency stddev = 0.047 ms
tps = 619591.476154 (including connections establishing)
tps = 619655.867280 (excluding connections establishing)
Interestingly, the LWLock patch seems to decrease the influence of SMT.
Side note: the system makes around 2 million context switches during the
benchmarks, e.g.
awk '{print $12;}' /tmp/vmstat.out
cs
10
2153533
2134864
2141623
2126845
2128330
2127454
2145325
2126769
2134492
2130246
2130071
2142660
2136077
2126783
2126107
2125823
2136511
2137752
2146307
2141127
I've also tried a more recent kernel this morning (4.4 vs. 4.8), but
this didn't change the picture. Is there anything more I can do?
On Wed, Feb 8, 2017 at 5:00 PM, Bernd Helmle <mailings@oopsware.de> wrote:
On Tue, 07.02.2017 at 16:48 +0300, Alexander Korotkov wrote:
But the win isn't as high as I observed earlier. And I wonder why the
absolute numbers are lower than in our earlier experiments. We used an
IBM E880, which is actually two
Did you run your tests on bare metal or were they also virtualized?
I ran the tests on bare metal.
nodes with an interconnect. However, the interconnect is not fast enough
to make one PostgreSQL instance work across both nodes. Thus, we used half
of the IBM E880, which has 4 sockets and 32 physical cores. You use an IBM
E850, which is really a single node with 4 sockets and 48 physical cores.
Thus, it seems that you get lower absolute numbers on more powerful
hardware. That makes me uneasy, and I think we probably aren't getting the
best from this hardware. Just in case, do you use SMT=8?
Yes, SMT=8 was used.
The machine has 4 sockets with 8 cores each, at 3.7 GHz clock frequency.
The Ubuntu LPAR running on PowerVM isn't using all physical cores;
currently 28 cores are assigned (= 224 SMT threads). The other cores are
dedicated to the PowerVM hypervisor and a (very) small AIX LPAR.
Thank you very much for the explanation.
Thus, I see reasons why the absolute results in your tests are lower than
in my previous tests.
1. You use 28 physical cores while I was using 32 physical cores.
2. You run tests in PowerVM while I was running tests on bare metal.
PowerVM could have some overhead.
3. I guess you run pgbench on the same machine, while in my tests pgbench
was running on another node of the IBM E880.
Therefore, with lower absolute numbers in your tests, the win of the LWLock
optimization is also lower. That is understandable. But the win of the
LWLock optimization is clearly visible and definitely exceeds the variation.
I think it would make sense to run more kinds of tests. Could you try the
set of tests provided by Tomas Vondra?
Even if we wouldn't see a win in some of the tests, it would still be
valuable to see that there is no regression there.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 02/11/2017 01:42 PM, Alexander Korotkov wrote:
I think it would make sense to run more kinds of tests. Could you try
the set of tests provided by Tomas Vondra?
Even if we wouldn't see a win in some of the tests, it would still be
valuable to see that there is no regression there.
FWIW it shouldn't be difficult to tweak my scripts and run them on
another machine. You'd have to customize the parameters (scales, client
counts, ...) and there are a few hard-coded paths, but that's about it.
regards
Tomas
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, 11.02.2017 at 15:42 +0300, Alexander Korotkov wrote:
Thus, I see reasons why the absolute results in your tests are lower than
in my previous tests.
1. You use 28 physical cores while I was using 32 physical cores.
2. You run tests in PowerVM while I was running tests on bare metal.
PowerVM could have some overhead.
3. I guess you run pgbench on the same machine, while in my tests pgbench
was running on another node of the IBM E880.
Yeah, pgbench was running locally. Maybe we can get some resources to
run them remotely.
Short side note: if you run two Postgres instances concurrently with the
same pgbench parameters, each instance gets nearly the same transaction
throughput as when running against a single instance, e.g.
- single
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 112
number of threads: 112
duration: 300 s
number of transactions actually processed: 121523797
latency average = 0.276 ms
latency stddev = 0.096 ms
tps = 405075.282309 (including connections establishing)
tps = 405114.299174 (excluding connections establishing)
instance-1/instance-2 concurrently run:
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 112
number of threads: 56
duration: 300 s
number of transactions actually processed: 120645351
latency average = 0.278 ms
latency stddev = 0.158 ms
tps = 402148.536087 (including connections establishing)
tps = 402199.952824 (excluding connections establishing)
transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 112
number of threads: 56
duration: 300 s
number of transactions actually processed: 121959772
latency average = 0.275 ms
latency stddev = 0.110 ms
tps = 406530.139080 (including connections establishing)
tps = 406556.658638 (excluding connections establishing)
So it looks like the machine has plenty of power, but PostgreSQL is
hitting a limit somewhere.
Therefore, with lower absolute numbers in your tests, the win of the LWLock
optimization is also lower. That is understandable. But the win of the
LWLock optimization is clearly visible and definitely exceeds the variation.
I think it would make sense to run more kinds of tests. Could you try the
set of tests provided by Tomas Vondra?
Even if we wouldn't see a win in some of the tests, it would still be
valuable to see that there is no regression there.
Unfortunately, there are some tests for AIX scheduled, which will assign
resources to that LPAR... I've just talked to the people responsible for
the machine, and we can get more time for the Linux tests ;)
On 02/13/2017 03:16 PM, Bernd Helmle wrote:
On Sat, 11.02.2017 at 15:42 +0300, Alexander Korotkov wrote:
Thus, I see reasons why the absolute results in your tests are lower than
in my previous tests.
1. You use 28 physical cores while I was using 32 physical cores.
2. You run tests in PowerVM while I was running tests on bare metal.
PowerVM could have some overhead.
3. I guess you run pgbench on the same machine, while in my tests pgbench
was running on another node of the IBM E880.
Yeah, pgbench was running locally. Maybe we can get some resources to
run them remotely.
Short side note: if you run two Postgres instances concurrently with the
same pgbench parameters, each instance gets nearly the same transaction
throughput as when running against a single instance, e.g.
That strongly suggests you're hitting some kind of lock. It'd be good to
know which one. I see you're doing "pgbench -S", which also updates
branches and other tiny tables - it's possible the sessions are trying
to update the same row in those tiny tables. You're running with scale
1000, but with 100 clients it's still possible thanks to the birthday
paradox.
Otherwise it might be interesting to look at sampling wait events, which
might tell us more about the locks.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Feb 13, 2017 at 10:17 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
On 02/13/2017 03:16 PM, Bernd Helmle wrote:
On Sat, 11.02.2017 at 15:42 +0300, Alexander Korotkov wrote:
Thus, I see reasons why the absolute results in your tests are lower than
in my previous tests.
1. You use 28 physical cores while I was using 32 physical cores.
2. You run tests in PowerVM while I was running tests on bare metal.
PowerVM could have some overhead.
3. I guess you run pgbench on the same machine, while in my tests pgbench
was running on another node of the IBM E880.
Yeah, pgbench was running locally. Maybe we can get some resources to
run them remotely.
Short side note: if you run two Postgres instances concurrently with the
same pgbench parameters, each instance gets nearly the same transaction
throughput as when running against a single instance, e.g.
That strongly suggests you're hitting some kind of lock. It'd be good to
know which one. I see you're doing "pgbench -S", which also updates
branches and other tiny tables - it's possible the sessions are trying to
update the same row in those tiny tables. You're running with scale 1000,
but with 100 clients it's still possible thanks to the birthday paradox.
Otherwise it might be interesting to look at sampling wait events, which
might tell us more about the locks.
+1
And you could try to use pg_wait_sampling
<https://github.com/postgrespro/pg_wait_sampling> to sampling of wait
events.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Tue, 14.02.2017 at 15:53 +0300, Alexander Korotkov wrote:
+1
And you could try to use pg_wait_sampling
<https://github.com/postgrespro/pg_wait_sampling> to sample wait
events.
Okay, I'm going to try this. Currently Tomas' scripts are still
running; I'll provide updates as soon as they are finished.
On Sat, 11.02.2017 at 15:42 +0300, Alexander Korotkov wrote:
I think it would make sense to run more kinds of tests. Could you try the
set of tests provided by Tomas Vondra?
Even if we wouldn't see a win in some of the tests, it would still be
valuable to see that there is no regression there.
Sorry for the delay.
But here are the results after running Tomas' benchmark scripts.
The pgbench-ro benchmark shows a clear win for the lwlock-power-3 patch,
and you can also see a slight advantage for this patch in the pgbench-rw
graph. At least, I don't see any significant regression anywhere.
The graphs show the average values over all five runs for each client
count.
Attachments:
pgbench-ro-simple.png (image/png)