pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
This patch has not been benchmarked yet on a big NUMA machine, but
it seems like a good idea on general principle, and it seemed to
prevent an apparent 2.2% regression on a single-socket i7 box
running 200 connections at saturation load.
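A rough illustration of the idea in the commit message, not the actual procarray.c change: when the threshold is negative (feature off), the snapshot fields are filled with dummy values and the shared lock is never touched. All names here (DemoShared, DemoSnapshot, demo_fill_snapshot, DEMO_INVALID_LSN) are hypothetical, and a C11 atomic_flag stands in for the real spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INVALID_LSN ((uint64_t) 0)   /* hypothetical "unset" marker */

/* Hypothetical shared state guarded by a tiny test-and-set spinlock. */
typedef struct DemoShared
{
    atomic_flag   lock;
    unsigned long acquisitions;       /* instrumentation only, updated under lock */
    uint64_t      current_lsn;
    int64_t       current_timestamp;
} DemoShared;

typedef struct DemoSnapshot
{
    uint64_t lsn;
    int64_t  when_taken;
} DemoSnapshot;

static void
demo_shared_init(DemoShared *shared, uint64_t lsn, int64_t ts)
{
    atomic_flag_clear(&shared->lock);
    shared->acquisitions = 0;
    shared->current_lsn = lsn;
    shared->current_timestamp = ts;
}

static void
demo_fill_snapshot(DemoSnapshot *snap, DemoShared *shared,
                   int old_snapshot_threshold)
{
    assert(snap != NULL && shared != NULL);

    if (old_snapshot_threshold < 0)
    {
        /* Feature disabled: the values are never read, so skip the lock. */
        snap->lsn = DEMO_INVALID_LSN;
        snap->when_taken = 0;
        return;
    }

    /* Feature enabled: read both values consistently under the lock. */
    while (atomic_flag_test_and_set_explicit(&shared->lock,
                                             memory_order_acquire))
        ;                             /* spin */
    shared->acquisitions++;
    snap->lsn = shared->current_lsn;
    snap->when_taken = shared->current_timestamp;
    atomic_flag_clear_explicit(&shared->lock, memory_order_release);
}
```

With the threshold at -1 the acquisition counter stays at zero, which is the whole point of the change: backends that do not use the feature no longer contend on the lock.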
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/2201d801b03c2d1b0bce4d6580b718dc34d38b3e
Modified Files
--------------
src/backend/storage/ipc/procarray.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Kevin Grittner wrote:
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
old_snapshot_threshold is PGC_POSTMASTER, so this is okay AFAICS, but
perhaps it'd be a good idea to add a oneline comment to guc.c indicating
to verify this code if there's an intention to lift that limitation --
snapshots taken before the reload would have invalid lsn/timestamp, so
the current check would simply skip the check, which would be the wrong
thing to do.
I think it's reasonable to want to enable this feature with a simple
reload, so if we ever do that it'd be good to have a pointer about that
gotcha. (I'm not proposing you do that, only add the comment for a
future hacker.)
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 12, 2016 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Kevin Grittner wrote:
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
old_snapshot_threshold is PGC_POSTMASTER, so this is okay AFAICS, but
perhaps it'd be a good idea to add a oneline comment to guc.c indicating
to verify this code if there's an intention to lift that limitation --
snapshots taken before the reload would have invalid lsn/timestamp, so
the current check would simply skip the check, which would be the wrong
thing to do.
I think it's reasonable to want to enable this feature with a simple
reload, so if we ever do that it'd be good to have a pointer about that
gotcha. (I'm not proposing you do that, only add the comment for a
future hacker.)
Perhaps, but this would be one of at least a dozen land mines that
exist for trying to modify this setting to be read on reload.
FWIW, I spent a fair amount of time trying to make it PGC_SIGHUP,
since it would be very nice to allow that; but I kept running into
one problem after another with it, some of which were very hard to
see how to fix. My inclination is that trying to comment all the
places that would need something done if we do this in some later
release would be distracting until such time as we get there, and
might give a false sense of security to anyone who fixes all the
places the comments were scattered.
If there is a consensus that the comments would be worthwhile, I
can do a pass over the code I had before I gave up on PGC_SIGHUP
and try to add comments to all the appropriate spots based on
differences due to when the GUC was changed. If we don't want
that, I'm not sure why this one spot is a better place for such a
comment than all the others.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Kevin Grittner <kgrittn@gmail.com> writes:
On Tue, Apr 12, 2016 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
old_snapshot_threshold is PGC_POSTMASTER, so this is okay AFAICS, but
perhaps it'd be a good idea to add a oneline comment to guc.c indicating
to verify this code if there's an intention to lift that limitation --
Perhaps, but this would be one of at least a dozen land mines that
exist for trying to modify this setting to be read on reload.
FWIW, I spent a fair amount of time trying to make it PGC_SIGHUP,
since it would be very nice to allow that; but I kept running into
one problem after another with it, some of which were very hard to
see how to fix.
It'd be good if you document the problems you found somewhere, before
you forget them, just in case somebody does want to try to lift the
restriction. I agree that scattered code comments wouldn't be the way.
Just a quick email to -hackers to get the info into the archives
might be enough.
regards, tom lane
--
Kevin Grittner wrote:
FWIW, I spent a fair amount of time trying to make it PGC_SIGHUP,
since it would be very nice to allow that; but I kept running into
one problem after another with it, some of which were very hard to
see how to fix. My inclination is that trying to comment all the
places that would need something done if we do this in some later
release would be distracting until such time as we get there, and
might give a false sense of security to anyone who fixes all the
places the comments were scattered.
Okay, that's fair.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
FWIW, I could see massive regressions with just 64 connections.
I'm a bit scared of having an innocuous sounding option regress things
by a factor of 10. I think, in addition to this fix, we need to actually
solve the scalability issue here to a good degree. One way to do so is
to apply the parts of 0001 in
http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
to apply the whole patch and simply put the lsn in an 8 byte atomic.
- Andres
--
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
FWIW, I could see massive regressions with just 64 connections.
With what settings? With or without the patch to avoid the locks when off?
I'm a bit scared of having an innocuous sounding option regress things
by a factor of 10. I think, in addition to this fix, we need to actually
solve the scalability issue here to a good degree. One way to do so is
to apply the parts of 0001 in
http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
to apply the whole patch and simply put the lsn in an 8 byte atomic.
I think that we are well due for atomic access to aligned 8-byte
values. That would eliminate one potential hot spot in the
"snapshot too old" code, for sure.
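To make the "put the lsn in an 8 byte atomic" idea concrete, here is a minimal sketch using C11 <stdatomic.h> as a stand-in; PostgreSQL has its own atomics layer, which this does not use. The names DemoLsnSlot, demo_lsn_advance and demo_lsn_read are hypothetical, and the "only move forward" rule is an assumption made for the example rather than something stated in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for a 64-bit LSN that is read far more than written. */
typedef struct DemoLsnSlot
{
    _Atomic uint64_t latest_lsn;
} DemoLsnSlot;

static void
demo_lsn_init(DemoLsnSlot *slot, uint64_t initial)
{
    assert(slot != NULL);
    atomic_init(&slot->latest_lsn, initial);
}

/* Advance the stored LSN, never moving it backwards (assumed rule). */
static void
demo_lsn_advance(DemoLsnSlot *slot, uint64_t new_lsn)
{
    uint64_t cur = atomic_load_explicit(&slot->latest_lsn,
                                        memory_order_relaxed);

    /* On CAS failure 'cur' is refreshed, so the bound is rechecked. */
    while (cur < new_lsn &&
           !atomic_compare_exchange_weak_explicit(&slot->latest_lsn,
                                                  &cur, new_lsn,
                                                  memory_order_release,
                                                  memory_order_relaxed))
        ;
}

/* Readers take no lock at all; the load itself cannot be torn. */
static uint64_t
demo_lsn_read(DemoLsnSlot *slot)
{
    return atomic_load_explicit(&slot->latest_lsn, memory_order_acquire);
}
```

This removes the lock from the read path, but as noted later in the thread, a frequently changing value read from many connections still causes cache-line traffic; it is cheaper than a spinlock, not free.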
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-12 13:44:00 -0500, Kevin Grittner wrote:
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
FWIW, I could see massive regressions with just 64 connections.
With what settings?
You mean pgbench or postgres? The former -M prepared -c 64 -j 64 -S. The
latter just a large enough shared buffers to contain the scale 300
database, and adapted maintenance_work_mem. Nothing special.
With or without the patch to avoid the locks when off?
Without. Your commit message made it sound like you need unrealistic or
at least unusual numbers of connections, and that's afaics not the case.
I'm a bit scared of having an innocuous sounding option regress things
by a factor of 10. I think, in addition to this fix, we need to actually
solve the scalability issue here to a good degree. One way to do so is
to apply the parts of 0001 in
http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
to apply the whole patch and simply put the lsn in an 8 byte atomic.
I think that we are well due for atomic access to aligned 8-byte
values. That would eliminate one potential hot spot in the
"snapshot too old" code, for sure.
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.
FWIW, accessing a frequently changing value from a significant number of
connections, at a high frequency, isn't exactly free without a spinlock
either. But it should be much less bad.
Greetings,
Andres Freund
--
On Tue, Apr 12, 2016 at 1:56 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 13:44:00 -0500, Kevin Grittner wrote:
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
FWIW, I could see massive regressions with just 64 connections.
With what settings?
You mean pgbench or postgres? The former -M prepared -c 64 -j 64 -S. The
latter just a large enough shared buffers to contain the scale 300
database, and adapted maintenance_work_mem. Nothing special.
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200. So,
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fix it. For example, I
used a "real" pg config, like I would for a production machine
(because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling)
and has tuning for THP, the deadline scheduler, the vm.*dirty*
settings, etc. Without knowing even the kernel and what tuning the
OS and pg have had on your box, I could take a lot of shots in the
dark without hitting anything. Oh, and the output of `numactl
--hardware` would be good to have. Thanks for all information you
can provide.
With or without the patch to avoid the locks when off?
Without. Your commit message made it sound like you need unrealistic or
at least unusual numbers of connections, and that's afaics not the case.
It was the only reported case to that point, so the additional data
point is valuable, if I can tell where that point is. And you
don't have any evidence that even with your configuration that any
performance regression remains for those who have the default value
for old_snapshot_threshold?
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Hi,
On 2016-04-12 14:17:12 -0500, Kevin Grittner wrote:
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200.
In a readonly test or r/w? A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.
So,
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fix it.
For example, I used a "real" pg config, like I would for a production
machine (because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling) and
has tuning for THP, the deadline scheduler, the vm.*dirty* settings,
etc. Without knowing even the kernel and what tuning the OS and pg
have had on your box, I could take a lot of shots in the dark without
hitting anything.
That shouldn't really matter much for a read-only, shared_buffer
resident, test? There's no IO and THP pretty much plays no role because
there's very few memory allocations (removing the pressure causing the
well known degradations).
Oh, and the output of `numactl --hardware` would be good to have.
Thanks for all information you can provide.
That was on Alexander's/PgPro's machine. Numactl wasn't installed, and I
didn't have root. But it has four numa domains (gathered via /sys/).
It was the only reported case to that point, so the additional data
point is valuable, if I can tell where that point is. And you
don't have any evidence that even with your configuration that any
performance regression remains for those who have the default value
for old_snapshot_threshold?
I haven't tested yet.
Greetings,
Andres Freund
--
On Tue, Apr 12, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 14:17:12 -0500, Kevin Grittner wrote:
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200.
In a readonly test or r/w?
Readonly with client and job counts matching scale.
A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.
Yeah, I know, and 4 socket has been at least an order of magnitude
more problematic in my experience than 2 socket. And the problems
are far, far, far worse on kernels prior to 3.8, especially on 3.x
before 3.8, so it's hard to know how to take any report of problems
on a 4 node NUMA machine without knowing the kernel version.
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fix it.
For example, I used a "real" pg config, like I would for a production
machine (because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling) and
has tuning for THP, the deadline scheduler, the vm.*dirty* settings,
etc. Without knowing even the kernel and what tuning the OS and pg
have had on your box, I could take a lot of shots in the dark without
hitting anything.
That shouldn't really matter much for a read-only, shared_buffer
resident, test? There's no IO and THP pretty much plays no role because
there's very few memory allocations (removing the pressure causing the
well known degradations).
I hate to assume which differences matter without trying, but some
of them seem less probable than others.
Oh, and the output of `numactl --hardware` would be good to have.
Thanks for all information you can provide.
That was on Alexander's/PgPro's machine. Numactl wasn't installed, and I
didn't have root. But it has four numa domains (gathered via /sys/).
On the machines I've used, it will give you the hardware report
without being root. But of course, it can't do that if it's not
installed. I hadn't yet seen a machine with multiple NUMA memory
segments that didn't have the numactl executable installed; I'll
keep in mind that can happen.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On Tue, Apr 12, 2016 at 2:53 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
Readonly with client and job counts matching scale.
Single-socket i7, BTW.
A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.
Yeah, I know, and 4 socket has been at least an order of magnitude
more problematic in my experience than 2 socket. And the problems
are far, far, far worse on kernels prior to 3.8, especially on 3.x
before 3.8, so it's hard to know how to take any report of problems
on a 4 node NUMA machine without knowing the kernel version.
Also, with 4 node NUMA I have seen far better scaling with
hyper-threading turned off. I know there are environments where it
helps, but high-concurrency on multi-node NUMA is not one of them.
So, anyway, mentioning the HT setting is important, too.
Kevin Grittner
--
Andres Freund wrote:
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.
!!!
Be sure to consult with the RMT before doing anything of the sort.
It might as well decide to revert the whole patch.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
On 2016-04-12 23:52:14 -0300, Alvaro Herrera wrote:
Andres Freund wrote:
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.
!!!
Be sure to consult with the RMT before doing anything of the sort.
I didn't plan to do anything without a few +1's. I don't think we can
release with the state of things as is though. I don't see a less
intrusive way than to get rid of that spinlock on all platforms capable
of significant concurrency.
So, RMT, what are your thoughts on this?
Andres
--
On Tue, Apr 12, 2016 at 11:05 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 23:52:14 -0300, Alvaro Herrera wrote:
Andres Freund wrote:
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.
!!!
Be sure to consult with the RMT before doing anything of the sort.
I didn't plan to do anything without a few +1's. I don't think we can
release with the state of things as is though. I don't see a less
intrusive way than to get rid of that spinlock on all platforms capable
of significant concurrency.
So, RMT, what are your thoughts on this?
I think that a significant performance regression which affects people
not using snapshot_too_old would be a stop-ship issue, but I disagree
that an issue which only affects people using the feature is a
must-fix. It may be desirable to fix it, but I don't think we should
regard it as a hard requirement. It's reasonable to fix some kinds of
issues after feature freeze, but not at the price of accepting
arbitrary amounts of new code that may have problems of its own.
Every release will have some warts.
My testing yesterday of latest master, specifically
deb71fa9713dfe374a74fc58a5d298b5f25da3f5, last night did not show
evidence of a regression under heavy concurrency, as per
/messages/by-id/CA+TgmobpHAqsOeHc-ooRsjzTKw1H4s4P1VBtwh1KkKO+6Mp8_Q@mail.gmail.com
- that test was of course run without enabling "snapshot too old".
My guess is that 2201d801b03c2d1b0bce4d6580b718dc34d38b3e was
sufficient to put things right, and that we now have a problem only
when "snapshot too old" is enabled.
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now. Also noteworthy is the
fact that, by itself, such a patch cannot break anything except
perhaps the build, for, lo!, unused macros and functions do not do
anything. On the whole, I think that putting such a patch into
PostgreSQL 9.6 is likely to save us more pain than it causes us. I
would be disinclined to endorse applying part of it, because that
seems likely to complicate back-patching for no real gain.
Of course, the real fly in the ointment here is what we're going to do
with the atomics once we have them. But AFAICS, there's no patch for
that, yet. I don't think that I wish to take a position on whether a
patch that hasn't been written yet should be applied. So I think the
next step is that you should post the patches that you think should be
applied in final form and those should be reviewed by knowledgeable
people. Then, based on those reviews, the RMT can decide what to do.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.
What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
regards, tom lane
--
Robert Haas wrote:
On Tue, Apr 12, 2016 at 11:05 PM, Andres Freund <andres@anarazel.de> wrote:
I didn't plan to do anything without a few +1's. I don't think we can
release with the state of things as is though. I don't see a less
intrusive way than to get rid of that spinlock on all platforms capable
of significant concurrency.
So, RMT, what are your thoughts on this?
I think that a significant performance regression which affects people
not using snapshot_too_old would be a stop-ship issue,
Agreed.
but I disagree that an issue which only affects people using the
feature is a must-fix.
Agreed.
It's reasonable to fix some kinds of
issues after feature freeze, but not at the price of accepting
arbitrary amounts of new code that may have problems of its own.
Every release will have some warts.
Agreed.
The patch being proposed for commit is fiddly architecture-specific
stuff which is likely to destabilize the tree for quite some time, and
cause lots of additional work to Andres and anyone else likely to work
on such low-level details, such as Robert, both of which already have
plenty to do.
The snapshot-too-old feature is said to be great and shows lots of
improvement in certain cases, and no regression can be measured for
those who have it turned off. The regression only seems to show up if
you turn it on and have a crazily high rate of read-only transactions.
I think this can wait for 9.7.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
On Wed, Apr 13, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.
What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
We fall back to emulating it using spinlocks. This isn't really an
issue in practice because 32-bit x86 has native 64-bit atomics, and
it's hard to point to another 32-bit platform that is likely to
have enough concurrency for the lack of 64-bit atomics to matter.
Actually, it looks like we have 64-bit atomics in the tree already;
it's only the fallback implementation that is missing (so anything you
do using 64-bit atomics would need an alternate implementation that
did not rely on them).
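For readers unfamiliar with what "emulating it using spinlocks" means, a minimal sketch follows. It is not PostgreSQL's fallback code; DemoEmulatedU64 and the demo_emu_* functions are hypothetical names, and a C11 atomic_flag plays the role of the spinlock. Each emulated 64-bit value carries its own lock, and every operation on it acquires and releases that lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulated 64-bit atomic: a plain value plus a private lock. */
typedef struct DemoEmulatedU64
{
    atomic_flag lock;
    uint64_t    value;
} DemoEmulatedU64;

static void
demo_emu_init(DemoEmulatedU64 *a, uint64_t initial)
{
    assert(a != NULL);
    atomic_flag_clear(&a->lock);
    a->value = initial;
}

static void
demo_emu_lock(DemoEmulatedU64 *a)
{
    while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
        ;                             /* spin until the flag was clear */
}

static void
demo_emu_unlock(DemoEmulatedU64 *a)
{
    atomic_flag_clear_explicit(&a->lock, memory_order_release);
}

/* Emulated atomic read: the lock prevents seeing a half-written value. */
static uint64_t
demo_emu_read(DemoEmulatedU64 *a)
{
    uint64_t v;

    demo_emu_lock(a);
    v = a->value;
    demo_emu_unlock(a);
    return v;
}

/* Emulated fetch-and-add: returns the value before the addition. */
static uint64_t
demo_emu_fetch_add(DemoEmulatedU64 *a, uint64_t delta)
{
    uint64_t old;

    demo_emu_lock(a);
    old = a->value;
    a->value = old + delta;
    demo_emu_unlock(a);
    return old;
}
```

The emulation is correct but pays a lock round trip per operation, which is the cost the rest of this thread argues about.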
But the really interesting thing that the patch to which Andres linked does
is introduce machinery to try to determine whether a platform has
8-byte single-copy atomicity; that is, whether a load or store of an
aligned 8-byte value is guaranteed not to be torn. We currently avoid
assuming that, but this requires additional spinlocks in a significant
number of places; the regression seen using "snapshot too old" at high
concurrency is merely the tip of the iceberg. And the annoying thing
about avoiding that assumption is that it actually is true on pretty
much every modern platform. Look at this gem Andres wrote in that
patch:
+/*
+ * 8 byte reads / writes have single-copy atomicity on 32 bit x86 platforms
+ * since at least the 586. As well as on all x86-64 cpus.
+ */
+#if defined(__i568__) || defined(__i668__) || /* gcc i586+ */ \
+ (defined(_M_IX86) && _M_IX86 >= 500) || /* msvc i586+ */ \
+ defined(__x86_64__) || defined(__x86_64) || defined(_M_X64) /* gcc, sunpro, msvc */
+#define PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
+#endif /* 8 byte single-copy atomicity */
I don't know if that test is actually correct, and I wonder about
compile-time environment vs. run-time environment, but I have my
doubts about how well PostgreSQL 9.6 would run on an i486. I doubt
that is the platform for which we should be optimizing.
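As a sketch of how code might branch on such a macro: the fast path below reads an aligned 8-byte value directly, while other platforms fall back to a lock. DEMO_HAVE_8BYTE_SINGLE_COPY_ATOMICITY, DemoLsnHolder and demo_read_lsn are hypothetical names; the macro here is defined only for x86-64 to keep the example conservative. Note that this is decided at compile time, which is the concern raised above, and that ISO C formally treats an unsynchronized plain read as a data race, so the fast path relies on the platform guarantee rather than the language.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro; deliberately limited to x86-64 in this sketch. */
#if defined(__x86_64__) || defined(__x86_64) || defined(_M_X64)
#define DEMO_HAVE_8BYTE_SINGLE_COPY_ATOMICITY 1
#endif

typedef struct DemoLsnHolder
{
    atomic_flag       lock;   /* only used on the fallback path */
    volatile uint64_t lsn;
} DemoLsnHolder;

static void
demo_holder_init(DemoLsnHolder *h, uint64_t lsn)
{
    atomic_flag_clear(&h->lock);
    h->lsn = lsn;
}

/* Returns the LSN; *took_lock reports which path was compiled in. */
static uint64_t
demo_read_lsn(DemoLsnHolder *h, int *took_lock)
{
    assert(h != NULL && took_lock != NULL);

#ifdef DEMO_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
    /* The hardware guarantee applies to naturally aligned values only. */
    assert(((uintptr_t) &h->lsn) % 8 == 0);
    *took_lock = 0;
    return h->lsn;            /* cannot be torn; ordering is a separate issue */
#else
    uint64_t v;

    while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
        ;                     /* spin */
    v = h->lsn;
    atomic_flag_clear_explicit(&h->lock, memory_order_release);
    *took_lock = 1;
    return v;
#endif
}
```

Single-copy atomicity only rules out torn values. Any ordering relative to other memory accesses would still need explicit barriers, which this sketch leaves out.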
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.
What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
We fall back to emulating it using spinlocks.
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
This isn't really an
issue in practice because 32-bit x86 has native 64-bit atomics, and
it's hard to point to another 32-bit platform that is likely to
have enough concurrency for the lack of 64-bit atomics to matter.
It's not concurrency I'm worried about, it's the sheer overhead of
going through the spinlock code.
I'd be okay with atomics that were defined as "pointer width", if
we have a need for that, but I'm suspicious of 64-bits-exactly.
regards, tom lane
--
On Wed, Apr 13, 2016 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.
What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
We fall back to emulating it using spinlocks.
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
I think 64-bit atomics *are* pretty widely portable. Can you name a
system with more than 4 CPU cores that doesn't support them?
This isn't really an
issue in practice because 32-bit x86 has native 64-bit atomics, and
it's hard to point to another 32-bit platform that is likely to
have enough concurrency for the lack of 64-bit atomics to matter.
It's not concurrency I'm worried about, it's the sheer overhead of
going through the spinlock code.
I'm not sure I understand exactly what the concern is here. I agree
that there is a possibility that any patch which uses 64-bit atomics
could regress performance on platforms that do not support 64-bit
atomics. That's why I argued initially against having fallbacks for
*any* atomic operations; I was of the opinion that we should be
prepared to carry two implementations of anything that was going to
depend on atomics. I lost that argument, perhaps for the best. I
think one of the problems here is that very few of us have any
hardware available which we could even use to test performance on
systems that lack support for both 32 and 64 bit atomics. We can
compile without atomics on the hardware we do have and see how that
goes, but that's not necessarily indicative of what will happen on
some altogether different CPU architecture. In some cases there might
be an emulator, like the VAX emulator Greg Stark was playing with, but
that's not necessarily indicative either, and also, really, who cares?
I think it would be cool if somebody started a project to try to
optimize the performance of PostgreSQL on, say, a Raspberry Pi. Then
we might learn whether any of this stuff actually matters there or
whether the problems are completely elsewhere (like too much
per-backend memory consumption). However, for reasons that are
probably sort of obvious, I doubt I'll have much luck getting
EnterpriseDB to fund work on that project - if it ever happens, it
will probably have to be the work of a dedicated hobbyist, or somebody
who has a tangible need to build an embedded system using PostgreSQL.
I'd be okay with atomics that were defined as "pointer width", if
we have a need for that, but I'm suspicious of 64-bits-exactly.
I think LSNs are an important case, and they are not pointer width.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
I think 64-bit atomics *are* pretty widely portable. Can you name a
system with more than 4 CPU cores that doesn't support them?
No, you're ignoring my point, which is what happens on single-CPU
32-bit machines, and whether we aren't going to destroy performance
on low-end machines in pursuit of better performance on high-end.
Now, to the extent that a patch uses a 64-bit atomic op to replace
a spinlock acquisition, it might be pretty much a wash if low-end
machines have to use a spinlock to emulate the atomic op. But it
would be really easy for the translation to replace one spinlock
acquisition with multiple spinlock acquisitions, and that would hurt.
regards, tom lane
--
On 2016-04-13 08:36:47 -0400, Robert Haas wrote:
I think that a significant performance regression which affects people
not using snapshot_too_old would be a stop-ship issue, but I disagree
that an issue which only affects people using the feature is a
must-fix. It may be desirable to fix it, but I don't think we should
regard it as a hard requirement. It's reasonable to fix some kinds of
issues after feature freeze, but not at the price of accepting
arbitrary amounts of new code that may have problems of its own.
Every release will have some warts.
My problem with that is that snapshot-too-old is essentially an
efficiency feature for busy and large databases. Regressing noticeably
when it's enabled in its natural habitat seems sad.
Of course, the real fly in the ointment here is what we're going to do
with the atomics once we have them. But AFAICS, there's no patch for
that, yet. I don't think that I wish to take a position on whether a
patch that hasn't been written yet should be applied. So I think the
next step is that you should post the patches that you think should be
applied in final form and those should be reviewed by knowledgeable
people. Then, based on those reviews, the RMT can decide what to do.
Well, I'm less likely to write a patch when there's no chance that it's
going to be applied. Which the rest of the thread sounds like...
Greetings,
Andres Freund
--
On Wed, Apr 13, 2016 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
I think 64-bit atomics *are* pretty widely portable. Can you name a
system with more than 4 CPU cores that doesn't support them?
No, you're ignoring my point, which is what happens on single-CPU
32-bit machines, and whether we aren't going to destroy performance
on low-end machines in pursuit of better performance on high-end.
Now, to the extent that a patch uses a 64-bit atomic op to replace
a spinlock acquisition, it might be pretty much a wash if low-end
machines have to use a spinlock to emulate the atomic op. But it
would be really easy for the translation to replace one spinlock
acquisition with multiple spinlock acquisitions, and that would hurt.
One of us is confused, or we're just talking past each other, because
I don't think I'm ignoring your point at all. In fact, I think I just
responded to it rather directly. I agree that the exact risk you are
describing exists. However, the multiple spinlock cycles that you are
concerned about will only occur on a platform that doesn't support
64-bit atomics. In order to test whether there is a performance
problem on such hardware, or how serious that problem is, we'd need to
have access to such hardware, and I don't know where to find any such
hardware. Do you?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On Wed, Apr 13, 2016 at 11:01 AM, Andres Freund <andres@anarazel.de> wrote:
Well, I'm less likely to write a patch when there's no chance that it's
going to be applied. Which the rest of the thread sounds like...
I hope somebody writes it at some point, because we surely want to fix
this for 9.7. However, I agree that there seems to be a tangible lack
of enthusiasm for doing anything about it right now. I'm slightly
surprised by that, but that's OK: I just work here.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-13 11:08:21 -0300, Alvaro Herrera wrote:
The patch being proposed for commit is fiddly architecture-specific
stuff which is likely to destabilize the tree for quite some time, and
cause lots of additional work to Andres and anyone else likely to work
on such low-level details, such as Robert, both of whom already have
plenty to do.
Personally I think this is a 9.6 open item, and primarily Kevin has to
work on it.
Note that there really shouldn't be too many fiddly bits, we've already
had 64bit atomics, just no fallback. This is just copying the fallback
code from 32bit atomics to 64bit atomics.
But what I'm actually proposing isn't even using the 64bit atomics from
that patch, just to add
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -24,3 +24,6 @@
#define pg_read_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
#define pg_write_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory")
#endif
+
+/* per architecture manual doubleword accesses have single copy atomicity */
+#define PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
to the appropriate files (ia64, ppc, x86) and then add an #ifndef
PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY to GetXLogInsertRecPtr's acquisition
of the spinlock. I.e.
#ifndef PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
SpinLockAcquire(&Insert->insertpos_lck);
#endif
current_bytepos = Insert->CurrBytePos;
#ifndef PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
SpinLockRelease(&Insert->insertpos_lck);
#endif
not because I think it's perfectly pretty that way, but because it's
very easy to demonstrate that there are no regressions for anybody.
The regression only seems to show up if you turn it on and have a
crazily high rate of read-only transactions. I think this can wait
for 9.7.
I don't think 120k read tps is all that high anymore these days. And you
can easily create scenarios that are *much* worse than pgbench. E.g. a
loop in a volatile plpgsql function will acquire
Greetings,
Andres Freund
--
On 2016-04-13 10:42:03 -0400, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
I think 64-bit atomics *are* pretty widely portable. Can you name a
system with more than 4 CPU cores that doesn't support them?
No, you're ignoring my point, which is what happens on single-CPU
32-bit machines, and whether we aren't going to destroy performance
on low-end machines in pursuit of better performance on high-end.
I think generally the only platform of concern wrt this is arm (< armv8),
which doesn't have 64bit atomicity and doesn't have
single-copy-atomicity for 8 byte values either (cf.
https://wiki.postgresql.org/wiki/Atomics). But:
Now, to the extent that a patch uses a 64-bit atomic op to replace
a spinlock acquisition, it might be pretty much a wash if low-end
machines have to use a spinlock to emulate the atomic op. But it
would be really easy for the translation to replace one spinlock
acquisition with multiple spinlock acquisitions, and that would hurt.
Which is why I'm actually proposing to *not* use a pg_atomic_uint64, just a
single define to remove the spinlock acquisition:
http://archives.postgresql.org/message-id/20160413150839.mevdlgekizxyjhc5%40alap3.anarazel.de
I think there are a number of LSNs for which replacing the LSN
manipulations with an actual atomic operation (including fallback to the
spinlock) might be beneficial. But that'd be a larger patch, and would
require more testing; which is why I'm proposing the above.
Greetings,
Andres Freund
--
On Wed, Apr 13, 2016 at 11:18 AM, Andres Freund <andres@anarazel.de> wrote:
I think generally the only platform of concern wrt this is arm (< armv8),
which doesn't have 64bit atomicity and doesn't have
single-copy-atomicity for 8 byte values either (cf.
https://wiki.postgresql.org/wiki/Atomics).
That page is sort of confusing, because it says that platform has
those things but then says ***, which is footnoted to mean "linux
kernel emulation available", but it's not too clear whether that
applies to all atomics or just 8-byte atomics. The operator
precedence of / (used as a separator) vs. footnotes is not stated.
It's also not clear what "linux kernel emulation available" actually
means. Should we think of those things being fast, or slow?
At any rate, I do actually have a Raspberry Pi 2 here so if we ever
commit a patch that might suck without real 64-bit atomics we might be
able to actually test whether it does or not. But as you say, no such
patch is being proposed at the moment.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
No, you're ignoring my point, which is what happens on single-CPU
32-bit machines, and whether we aren't going to destroy performance
on low-end machines in pursuit of better performance on high-end.
One of us is confused, or we're just talking past each other, because
I don't think I'm ignoring your point at all. In fact, I think I just
responded to it rather directly. I agree that the exact risk you are
describing exists. However, the multiple spinlock cycles that you are
concerned about will only occur on a platform that doesn't support
64-bit atomics. In order to test whether there is a performance
problem on such hardware, or how serious that problem is, we'd need to
have access to such hardware, and I don't know where to find any such
hardware. Do you?
As Andres says, low-end ARM machines are probably the most common such
hardware right now. I have two non-ARM machines in the buildfarm that
certainly haven't got such instructions (prairiedog and gaur/pademelon).
Now I wouldn't propose that we need to concern ourselves very much with
performance on those two decade-plus-old platforms, but I do think that
performance on small ARM machines is still of interest.
regards, tom lane
--
On Wed, Apr 13, 2016 at 10:01 AM, Andres Freund <andres@anarazel.de> wrote:
My problem with that is that snapshot-too-old is essentially an
efficiency feature for busy and large databases. Regressing noticeably
when it's enabled in its natural habitat seems sad.
With a real-world application with realistic simulated user load
there was no such regression and a big gain in performance over
time, so we're talking about adjusting how broad a range of
workloads it benefits. I don't have a strong opinion yet, since I
haven't run the benchmarks on big machines (scheduled for the day
after tomorrow); but as an example, if I only see such regression
on a Linux kernel with a version < 3.8 I am going to be
less concerned about getting something into 9.6, since IMO it is
completely irresponsible to run a NUMA machine with 4 or more nodes
on an OS with a substandard NUMA scheduler. I'm not sure when 3.8
became available, but according to Wikipedia Version 3.10 of the
Linux kernel was released in June 2013, so it's not like you need
to be on the bleeding edge to have a decent scheduler.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-13 11:27:09 -0400, Robert Haas wrote:
That page is sort of confusing, because it says that platform has
those things but then says ***, which is footnoted to mean "linux
kernel emulation available", but it's not too clear whether that
applies to all atomics or just 8-byte atomics. The operator
precedence of / (used as a separator) vs. footnotes is not stated.
/ has a higher precedence than footnotes. Not sure how to make that
easily clear. I'm not exactly a mediawiki expert.
It's also not clear what "linux kernel emulation available" actually
means. Should we think of those things being fast, or slow?
Slow. It means that the compiler generates a syscall to perform the
atomic. The syscall disables preemption, then performs the actual math,
re-enables preemption, and returns. That's a lot more expensive than a
spinlock. There's
/*
* 64 bit atomics on arm are implemented using kernel fallbacks and might be
* slow, so disable entirely for now.
* XXX: We might want to change that at some point for AARCH64
*/
#define PG_DISABLE_64_BIT_ATOMICS
for that reason (in the current tree, not patch).
The whole fallback facility exists to make it easier to port software to
arm; but I wouldn't want to rely on it if not necessary.
Greetings,
Andres Freund
--
On 2016-04-13 10:31:19 -0500, Kevin Grittner wrote:
With a real-world application with realistic simulated user load
there was no such regression and a big gain in performance over
time, so we're talking about adjusting how broad a range of
workloads it benefits.
I think it depends very heavily on the type of application. To be
affected you need a high rate of snapshot acquisitions. So lots of small
statements, or possibly longer running stuff involving volatile
functions (which IIRC get new snapshots continually).
but as an example, if I only see such regression on a Linux kernel
with a version < 3.8 I am going to be less concerned about
getting something into 9.6, since IMO it is completely irresponsible
to run a NUMA machine with 4 or more nodes on an OS with a substandard
NUMA scheduler. I'm not sure when 3.8 became available, but according
to Wikipedia Version 3.10 of the Linux kernel was released in June
2013, so it's not like you need to be on the bleeding edge to have a
decent scheduler.
I don't think the effect of adding a single spinlock (an exclusive lock!) in
a hot path is likely to be hugely dependent on the kernel version.
We've had such cases before, and felt the pain. E.g. the spinlock in the
ProcArrayLock used to be a *HUGE* contention point, and it has pretty
much the same acquisition pattern as this spinlock now.
Greetings,
Andres Freund
--
On Wed, Apr 13, 2016 at 10:59 AM, Andres Freund <andres@anarazel.de> wrote:
but as an example, if I only see such regression on a Linux kernel
with a version < 3.8 I am going to be less concerned about
getting something into 9.6, since IMO it is completely irresponsible
to run a NUMA machine with 4 or more nodes on an OS with a substandard
NUMA scheduler. I'm not sure when 3.8 became available, but according
to Wikipedia Version 3.10 of the Linux kernel was released in June
2013, so it's not like you need to be on the bleeding edge to have a
decent scheduler.
I don't think the effect of adding a single spinlock (an exclusive lock!) in
a hot path is likely to be hugely dependent on the kernel version.
My experience is that it easily can be. We had a customer who
could not scale beyond a certain point due to spinlock contention
on a single spinlock already present in stock pg. We tried lots of
config tweaks and some custom patches to no avail. Then we had
them upgrade from RHEL 6.latest to RHEL 7.latest, and they could
scale much, much farther. No OS or pg config changes were made at
the same time. The difference is that they went from kernel
version 2.6.32 to kernel version 3.10.0. The early version
3 kernels had a NUMA scheduler rewrite that was a disaster compared
to 2.6.32. They rewrote it again in 3.8, with dramatic effect.
We've had such cases before, and felt the pain. E.g. the spinlock in the
ProcArrayLock used to be a *HUGE* contention point, and it has pretty
much the same acquisition pattern as this spinlock now.
It would be great to have improvements in such access patterns, no
doubt. I'll be happy if we get there. I don't have a problem
trying to contribute to the effort, either, if people think that
might actually be a net gain. But if we have a point where those
not using the new feature are unaffected, and the question is about
the range of workloads where the new feature will be helpful in
9.6, it doesn't seem to me to rise to the level of a bug or a
release blocker.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-12 14:53:57 -0500, Kevin Grittner wrote:
On Tue, Apr 12, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 14:17:12 -0500, Kevin Grittner wrote:
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200.
In a readonly test or r/w?
Readonly with client and job counts matching scale.
A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.
Yeah, I know, and 4 socket has been at least an order of magnitude
more problematic in my experience than 2 socket. And the problems
are far, far, far worse on kernels prior to 3.8, especially on 3.x
before 3.8, so it's hard to know how to take any report of problems
on a 4 node NUMA machine without knowing the kernel version.
On an EC2 m4.10xlarge (dedicated, but still a VM) - sorry I don't have
anything better at hand right now, and it was already running.
postgres config:
postgres -D /srv/data/dev/ \
-c shared_buffers=64GB \
-c max_wal_size=64GB \
-c maintenance_work_mem=32GB \
-c huge_pages=on \
-c max_connections=400 \
-c logging_collector=on -c log_filename='postgresql.log' \
-c log_checkpoints=on -c autovacuum=off \
-c autovacuum_freeze_max_age=80000000 \
-c synchronous_commit=off
Initialized with pgbench -q -i -s 300
Before each run I prewarmed with
psql -c "create extension if not exists pg_prewarm;select sum(x.x) from (select pg_prewarm(oid) as x from pg_class where relkind in ('i', 'r') order by oid) x;" > /dev/null 2>&1;
running pgbench -M prepared -c 128 -j 128 -n -P 1 -T 100 -S
With -c old_snapshot_threshold=0:
latency average = 0.218 ms
latency stddev = 0.154 ms
tps = 584666.289753 (including connections establishing)
tps = 584867.785569 (excluding connections establishing)
With -c old_snapshot_threshold=10:
latency average = 1.112 ms
latency stddev = 1.246 ms
tps = 114883.528964 (including connections establishing)
tps = 114905.555943 (excluding connections establishing)
With 848ef42bb8c7909c9d7baa38178d4a209906e7c1 (and followups) reverted:
latency average = 0.210 ms
latency stddev = 0.050 ms
tps = 607734.407158 (including connections establishing)
tps = 607918.118566 (excluding connections establishing)
A quicker (each -T 10) test, without restarts between scale runs, of
other scales:
scale   thresh=0         thresh=10
1       15377.761645     15017.789751
1       16285.111754     14829.493870
2       29563.478651     28790.462964
4       62649.628931     50935.364141
8       84557.464387     85631.348766
16      101475.002295    93908.910894
32      347435.607586    167702.527893
64      575640.880911    150139.375351
128     594782.154256    112183.933956
196     584290.957806    92080.129402
256     583921.995839    79345.378887
398     582138.372414    58100.798609
- Andres
--
On Wed, Apr 13, 2016 at 1:19 PM, Andres Freund <andres@anarazel.de> wrote:
On an EC2 m4.10xlarge (dedicated, but still a VM) - sorry I don't have
anything better at hand right now, and it was already running.
postgres config:
postgres -D /srv/data/dev/ \
-c shared_buffers=64GB \
-c max_wal_size=64GB \
-c maintenance_work_mem=32GB \
-c huge_pages=on \
-c max_connections=400 \
-c logging_collector=on -c log_filename='postgresql.log' \
-c log_checkpoints=on -c autovacuum=off \
-c autovacuum_freeze_max_age=80000000 \
-c synchronous_commit=off
Initialized with pgbench -q -i -s 300
Before each run I prewarmed with
psql -c "create extension if not exists pg_prewarm;select sum(x.x) from (select pg_prewarm(oid) as x from pg_class where relkind in ('i', 'r') order by oid) x;" > /dev/null 2>&1;
running pgbench -M prepared -c 128 -j 128 -n -P 1 -T 100 -S
With -c old_snapshot_threshold=0:
latency average = 0.218 ms
latency stddev = 0.154 ms
tps = 584666.289753 (including connections establishing)
tps = 584867.785569 (excluding connections establishing)
With -c old_snapshot_threshold=10:
latency average = 1.112 ms
latency stddev = 1.246 ms
tps = 114883.528964 (including connections establishing)
tps = 114905.555943 (excluding connections establishing)
With 848ef42bb8c7909c9d7baa38178d4a209906e7c1 (and followups) reverted:
latency average = 0.210 ms
latency stddev = 0.050 ms
tps = 607734.407158 (including connections establishing)
tps = 607918.118566 (excluding connections establishing)
Yuck. Aside from the fact that performance tanks when the feature is
turned on, it seems that there is a significant effect even with it
turned off.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-13 13:25:14 -0400, Robert Haas wrote:
With -c old_snapshot_threshold=0:
latency average = 0.218 ms
latency stddev = 0.154 ms
tps = 584666.289753 (including connections establishing)
tps = 584867.785569 (excluding connections establishing)
With -c old_snapshot_threshold=10:
latency average = 1.112 ms
latency stddev = 1.246 ms
tps = 114883.528964 (including connections establishing)
tps = 114905.555943 (excluding connections establishing)
With 848ef42bb8c7909c9d7baa38178d4a209906e7c1 (and followups) reverted:
latency average = 0.210 ms
latency stddev = 0.050 ms
tps = 607734.407158 (including connections establishing)
tps = 607918.118566 (excluding connections establishing)
Yuck. Aside from the fact that performance tanks when the feature is
turned on
A quick look at the former shows that it's primarily contention around
the new OldSnapshotTimeMapLock, not, on that hardware in that workload,
the spinlock. Which isn't that surprising because it adds an exclusive
lock to a path which doesn't contain any other exclusive locks these
days...
I have to say, I'm *highly* doubtful that it's ok to add an exclusive
lock in a readonly workload to such a hot path, without any clear path
forward how to fix that scalability issue. This doesn't appear to be
requiring just a bit of elbow grease, but a fair bit more.
it seems that there is a significant effect even with it turned off.
It looks that way, but I'd rather run a bit more careful and repeated
tests to make sure about that part. At a factor of 5, as with the on/off
tests, per-run variations don't play a large role, but at smaller
percentages it's worthwhile to put more care into it. If possible it'd
be helpful to avoid a VM too...
Andres
--
On Wed, Apr 13, 2016 at 12:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
[test results with old_snapshot_threshold = 0 and 10]
From the docs:
| A value of -1 disables this feature, and is the default.
Yuck. Aside from the fact that performance tanks when the feature is
turned on, it seems that there is a significant effect even with it
turned off.
No evidence of that has been provided. -1 is off; 0 is for testing
very fast expiration.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-13 13:52:15 -0500, Kevin Grittner wrote:
On Wed, Apr 13, 2016 at 12:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
[test results with old_snapshot_threshold = 0 and 10]
From the docs:
| A value of -1 disables this feature, and is the default.
Hm, ok, let me run that as well then. The reason for the massive
performance difference presumably is that
MaintainOldSnapshotTimeMapping() is cut short due to
/* No further tracking needed for 0 (used for testing). */
if (old_snapshot_threshold == 0)
return;
which means that OldSnapshotTimeMap isn't acquired exclusively.
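A minimal sketch of why 0 and a positive threshold behave so differently under that explanation. These are hypothetical stand-ins (snapshot_map, maintain_time_mapping, take_snapshot), not the PostgreSQL definitions, and the exclusive lock is modelled as a simple counter. The skip for negative values is an assumption based on the commit message at the top of this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared time-mapping state. */
typedef struct
{
    int      threshold;                 /* -1 = off, 0 = testing, > 0 = enabled */
    unsigned exclusive_acquisitions;    /* counts would-be exclusive lock takes */
} snapshot_map;

static void
maintain_time_mapping(snapshot_map *m)
{
    assert(m != NULL);

    /* No further tracking needed for 0 (used for testing). */
    if (m->threshold == 0)
        return;

    /* Otherwise the mapping is updated under an exclusive lock. */
    m->exclusive_acquisitions++;
}

static void
take_snapshot(snapshot_map *m)
{
    /* Assumption: with the feature off the caller skips the mapping entirely. */
    if (m->threshold < 0)
        return;

    maintain_time_mapping(m);
}
```

In this model only a positive threshold pays for an exclusive acquisition on every snapshot, which matches the on/off gap in the numbers quoted above.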
Yuck. Aside from the fact that performance tanks when the feature is
turned on, it seems that there is a significant effect even with it
turned off.
No evidence of that has been provided. -1 is off; 0 is for testing
very fast expiration.
I'll run with -1 once the current (longer) run has finished.
Greetings,
Andres Freund
--
On Wed, Apr 13, 2016 at 1:56 PM, Andres Freund <andres@anarazel.de> wrote:
I'll run with -1 once the current (longer) run has finished.
Just for the record, were any of the other results purporting to be
with the feature "off" also actually running with the feature set
for its fastest possible timeout?
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 4/12/16 12:30 PM, Tom Lane wrote:
It'd be good if you document the problems you found somewhere, before
you forget them, just in case somebody does want to try to lift the
restriction. I agree that scattered code comments wouldn't be the way.
Just a quick email to -hackers to get the info into the archives
might be enough.
I think a code comment pointing at the archived message would be good
though...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
On Wed, Apr 13, 2016 at 3:08 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
On Wed, Apr 13, 2016 at 1:56 PM, Andres Freund <andres@anarazel.de> wrote:
I'll run with -1 once the current (longer) run has finished.
Just for the record, were any of the other results purporting to be
with the feature "off" also actually running with the feature set
for its fastest possible timeout?
Mine were testing something else entirely, so I didn't touch
old_snapshot_threshold at all.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
On 2016-04-13 13:52:15 -0500, Kevin Grittner wrote:
On Wed, Apr 13, 2016 at 12:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
[test results with old_snapshot_threshold = 0 and 10]
From the docs:
| A value of -1 disables this feature, and is the default.
Yuck. Aside from the fact that performance tanks when the feature is
turned on, it seems that there is a significant effect even with it
turned off.
No evidence of that has been provided. -1 is off; 0 is for testing
very fast expiration.
Longer tests are running, but, again on the previous hardware with only
two sockets, the results for 128 clients are:
0:
progress: 100.0 s, 593351.0 tps, lat 0.215 ms stddev 0.118
progress: 200.0 s, 594035.9 tps, lat 0.215 ms stddev 0.118
progress: 300.0 s, 594013.3 tps, lat 0.215 ms stddev 0.117
-1:
progress: 100.0 s, 600835.3 tps, lat 0.212 ms stddev 0.049
progress: 200.0 s, 601466.1 tps, lat 0.212 ms stddev 0.048
progress: 300.0 s, 601529.5 tps, lat 0.212 ms stddev 0.047
reverted:
progress: 100.0 s, 612676.6 tps, lat 0.208 ms stddev 0.048
progress: 200.0 s, 613214.3 tps, lat 0.208 ms stddev 0.047
progress: 300.0 s, 613384.3 tps, lat 0.208 ms stddev 0.047
This is all on virtualized (though using a dedicated instance)
hardware. So the numbers are to be taken with a grain of salt. But I
did run shorter tests in various orders, and the runtime difference
appears to be very small.
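For a sense of scale, the 300 s figures above can be turned into relative slowdowns against the reverted build. This is only arithmetic on the numbers already posted; the helper name slowdown_pct is hypothetical.

```c
#include <assert.h>

/*
 * Relative slowdown of 'tps' against 'baseline_tps', in percent.
 * Positive values mean 'tps' is slower than the baseline.
 */
static double
slowdown_pct(double baseline_tps, double tps)
{
    assert(baseline_tps > 0.0);
    return (baseline_tps - tps) / baseline_tps * 100.0;
}
```

With the reverted build at 613384.3 tps as the baseline, slowdown_pct(613384.3, 601529.5) for the -1 run comes out at about 2%, and slowdown_pct(613384.3, 594013.3) for the 0 run at about 3%. Both are small enough that run-to-run variation on a VM matters.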
- Andres
--
On 2016-04-13 14:08:49 -0500, Kevin Grittner wrote:
On Wed, Apr 13, 2016 at 1:56 PM, Andres Freund <andres@anarazel.de> wrote:
I'll run with -1 once the current (longer) run has finished.
Just for the record, were any of the other results purporting to be
with the feature "off" also actually running with the feature set
for its fastest possible timeout?
Yes, I'd only used 0 / 10. I think that shows that the contention, for
me, is primarily the lwlock, not the spinlock.
Greetings,
Andres Freund
Hi Kevin,
On 2016-04-13 12:21:10 -0700, Andres Freund wrote:
0:
progress: 100.0 s, 593351.0 tps, lat 0.215 ms stddev 0.118
progress: 200.0 s, 594035.9 tps, lat 0.215 ms stddev 0.118
progress: 300.0 s, 594013.3 tps, lat 0.215 ms stddev 0.117
-1:
progress: 100.0 s, 600835.3 tps, lat 0.212 ms stddev 0.049
progress: 200.0 s, 601466.1 tps, lat 0.212 ms stddev 0.048
progress: 300.0 s, 601529.5 tps, lat 0.212 ms stddev 0.047
reverted:
progress: 100.0 s, 612676.6 tps, lat 0.208 ms stddev 0.048
progress: 200.0 s, 613214.3 tps, lat 0.208 ms stddev 0.047
progress: 300.0 s, 613384.3 tps, lat 0.208 ms stddev 0.047
Setting it to 1 gives:
progress: 100.0 s, 115413.7 tps, lat 1.107 ms stddev 1.240
progress: 200.0 s, 114907.4 tps, lat 1.113 ms stddev 1.244
progress: 300.0 s, 115621.4 tps, lat 1.106 ms stddev 1.238
If you want me to run some other tests I can, but ISTM we have the data
we need?
- Andres
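To put the figures quoted above side by side, here is a minimal sketch that expresses each setting's throughput as a percentage shortfall against the reverted build. It uses only the final (300 s) progress lines from this message; the helper name slowdown_percent is made up for illustration, and a single run on virtualized hardware is of course not a rigorous comparison.

```python
# Final (300 s) pgbench progress lines quoted in the message, in tps.
# Keys are the old_snapshot_threshold settings named in the thread;
# "reverted" is the build with the feature's commits reverted.
tps = {
    "0": 594013.3,
    "-1": 601529.5,
    "1": 115621.4,
    "reverted": 613384.3,
}


def slowdown_percent(baseline, measured):
    """Percentage by which `measured` throughput falls short of `baseline`."""
    return (baseline - measured) / baseline * 100.0


for setting in ("-1", "0", "1"):
    pct = slowdown_percent(tps["reverted"], tps[setting])
    print(f"old_snapshot_threshold = {setting}: {pct:.2f}% below reverted")
```

With these inputs the -1 case comes out a little under 2% below reverted, which is the gap discussed in the follow-up messages.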
On Wed, Apr 13, 2016 at 3:01 PM, Andres Freund <andres@anarazel.de> wrote:
If you want me to run some other tests I can, but ISTM we have the
data we need?
Thanks for the additional detail on how this was run. I think I
still need a little more context, though:
What is the kernel on which these tests were run?
Which pg commit were these tests run against?
If 2201d801 was not included in your -1 tests, have you identified
where the 2% extra run time is going on -1 versus reverted? Since
several other threads lately have reported bigger variation than
that based on random memory alignment issues, can we confirm that
this is a real difference in what is at master's HEAD? Of course,
I'm still scheduled to test on bare metal machines in a couple
days, on two different architectures, so we'll have a few more data
points after that.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016-04-13 15:21:31 -0500, Kevin Grittner wrote:
On Wed, Apr 13, 2016 at 3:01 PM, Andres Freund <andres@anarazel.de> wrote:
If you want me to run some other tests I can, but ISTM we have the
data we need?
Thanks for the additional detail on how this was run. I think I
still need a little more context, though:
What is the kernel on which these tests were run?
3.16. I can upgrade to 4.4 if necessary. But I still believe very
strongly that this is side-tracking the issue. An exclusive lock (or
spinlock) in a very hot path, which previously didn't have a specific
exclusively locked lock, will present scalability issues, regardless of
kernel.
Which pg commit were these tests run against?
85e00470. + some reverts (the whitespace commits make this harder...) in
the reverted case.
If 2201d801 was not included in your -1 tests, have you identified
where the 2% extra run time is going on -1 versus reverted?
No. It's hard to do good profiles on most virtualized hardware, since
hardware performance counters are disabled. So you only can do OS
sampling; which has a pretty big performance influence.
I'm not entirely sure what you mean with "2201d801 was not included in
your -1 tests". The optimization was present.
Since several other threads lately have reported bigger variation than
that based on random memory alignment issues, can we confirm that this
is a real difference in what is at master's HEAD?
It's unfortunately hard to measure this conclusively here (and in
general). I guess we'll have to look, on native hardware, where the
difference comes from. The difference is smaller on my laptop, and my
workstation is somewhere on a container ship, other physical hardware I
do not have.
Greetings,
Andres Freund
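The optimization referred to above (commit 2201d801, "just fill with dummy values if we're not going to use them") can be illustrated with a small sketch. This is not PostgreSQL code: the class, field, and function names below are hypothetical stand-ins, and a Python threading.Lock merely plays the role of the spinlock. It only shows the shape of the idea, namely that a disabled feature should not touch the shared lock on the hot path at all.

```python
import threading


class SharedState:
    """Hypothetical stand-in for shared state guarded by a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.current_lsn = 0
        self.current_timestamp = 0
        self.lock_acquisitions = 0  # instrumentation for this sketch only

    def read_lsn_and_timestamp(self):
        # Every caller serializes here; in a hot path this is the contention point.
        with self.lock:
            self.lock_acquisitions += 1
            return self.current_lsn, self.current_timestamp


def fill_snapshot_fields(state, old_snapshot_threshold):
    """Return (lsn, timestamp) to store in a newly taken snapshot."""
    if old_snapshot_threshold < 0:
        # Feature disabled: the values would never be consulted,
        # so return dummies and skip the lock entirely.
        return 0, 0
    return state.read_lsn_and_timestamp()


state = SharedState()
state.current_lsn = 42
print(fill_snapshot_fields(state, -1), state.lock_acquisitions)
print(fill_snapshot_fields(state, 0), state.lock_acquisitions)
```

The fast path helps only while the setting cannot change under running snapshots, which is the caveat raised earlier in the thread about old_snapshot_threshold being PGC_POSTMASTER.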
On Wed, Apr 13, 2016 at 3:47 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-13 15:21:31 -0500, Kevin Grittner wrote:
What is the kernel on which these tests were run?
3.16. I can upgrade to 4.4 if necessary.
No, I'm not aware of any problems from 3.8 on.
But I still believe very strongly that this is side-tracking the issue.
As long as I know it isn't a broken NUMA scheduler, or that there
were fewer than four NUMA memory nodes, I consider it a non-issue.
I just need to know whether it fits that problem profile to feel
comfortable that I can interpret the results correctly.
Which pg commit were these tests run against?
85e00470. + some reverts (the whitespace commits make this harder...) in
the reverted case.
If 2201d801 was not included in your -1 tests, have you identified
where the 2% extra run time is going on -1 versus reverted?
No. It's hard to do good profiles on most virtualized hardware, since
hardware performance counters are disabled. So you only can do OS
sampling; which has a pretty big performance influence.
I'm not entirely sure what you mean with "2201d801 was not included in
your -1 tests". The optimization was present.
Sorry, the "not" was accidental -- I hate reverse logic errors like that.
Based on the commit you used, I have my answer. Thanks.
Since several other threads lately have reported bigger variation than
that based on random memory alignment issues, can we confirm that this
is a real difference in what is at master's HEAD?
It's unfortunately hard to measure this conclusively here (and in
general). I guess we'll have to look, on native hardware, where the
difference comes from. The difference is smaller on my laptop, and my
workstation is somewhere on a container ship, other physical hardware I
do not have.
OK, thanks. I can't think of anything else to ask for at this
point. If you feel that you have enough to press for some
particular course of action, go for it. Personally, I want to do
some more investigation on those big machines.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2016-04-13 16:05:25 -0500, Kevin Grittner wrote:
OK, thanks. I can't think of anything else to ask for at this
point. If you feel that you have enough to press for some
particular course of action, go for it.
I think we, at the very least, need a clear proposal how to resolve the
scalability issue around OldSnapshotTimeMapLock in 9.6. Personally I
think we shouldn't release with such a large regression due to a
performance oriented feature; but if we do, we need to be confident that
we can easily resolve it for 9.7. In contrast to the spinlock issue I
don't see an easy way unfortunately. Without such a plan it seems too
likely to go unfixed for a long time otherwise.
Personally, I want to do some more investigation on those big
machines.
Sounds good, especially around the regression with the feature disabled.
Andres
On Thu, Apr 14, 2016 at 12:23 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-13 16:05:25 -0500, Kevin Grittner wrote:
OK, thanks. I can't think of anything else to ask for at this
point. If you feel that you have enough to press for some
particular course of action, go for it.
I think we, at the very least, need a clear proposal how to resolve the
scalability issue around OldSnapshotTimeMapLock in 9.6. Personally I
think we shouldn't release with such a large regression due to a
performance oriented feature; but if we do, we need to be confident that
we can easily resolve it for 9.7. In contrast to the spinlock issue I
don't see an easy way unfortunately. Without such a plan it seems too
likely to go unfixed for a long time otherwise.
Personally, I want to do some more investigation on those big
machines.
Sounds good, especially around the regression with the feature disabled.
I've also run a read-only test on a 4x18 Intel machine, comparing master
with snapshot_too_old reverted. In particular, I've reverted the following commits:
8b65cf4c5edabdcae45ceaef7b9ac236879aae50
848ef42bb8c7909c9d7baa38178d4a209906e7c1
80647bf65a03e232c995c0826ef394dad8d685fe
a6f6b78196a701702ec4ff6df56c346bdcf9abd2
2201d801b03c2d1b0bce4d6580b718dc34d38b3e
I've obtained the following results.
clients    master   sto-reverted
      1     13918          12997
      2     26143          26728
      4     50521          52539
      8    104330         103785
     10    129067         132606
     20    255561         255844
     30    368472         371359
     40    444486         450429
     50    489950         497705
     60    563606         564385
     70    710579         718860
     80    916480         934170
     90   1089917        1152961
    100   1201337        1240055
    110   1147208        1207727
    120   1116256        1167681
    130   1066475        1120891
    140   1040379        1085904
    150    974064        1022160
    160    938396         976487
    170    953636         978120
    180    920772         953843
We can see a small but definite regression with the snapshot too old feature.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
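As a rough way to read the table above, the following sketch computes, for each client count, how far master falls below the sto-reverted build. The rows are copied from the message; the units are not stated there, so the numbers are treated simply as throughput figures where higher is better. The function name shortfall_percent is made up for this illustration.

```python
# (clients, master, sto_reverted) rows copied from the results table.
rows = [
    (1, 13918, 12997), (2, 26143, 26728), (4, 50521, 52539),
    (8, 104330, 103785), (10, 129067, 132606), (20, 255561, 255844),
    (30, 368472, 371359), (40, 444486, 450429), (50, 489950, 497705),
    (60, 563606, 564385), (70, 710579, 718860), (80, 916480, 934170),
    (90, 1089917, 1152961), (100, 1201337, 1240055), (110, 1147208, 1207727),
    (120, 1116256, 1167681), (130, 1066475, 1120891), (140, 1040379, 1085904),
    (150, 974064, 1022160), (160, 938396, 976487), (170, 953636, 978120),
    (180, 920772, 953843),
]


def shortfall_percent(master, reverted):
    """Percentage by which master is below reverted (negative if master is ahead)."""
    return (reverted - master) / reverted * 100.0


# Client counts at which master is slower than the reverted build.
slower = [clients for clients, master, reverted in rows if master < reverted]

# Row with the highest master throughput.
peak = max(rows, key=lambda row: row[1])

for clients, master, reverted in rows:
    print(f"{clients:>4} clients: {shortfall_percent(master, reverted):6.2f}%")
print(f"master slower at {len(slower)} of {len(rows)} client counts")
print(f"master peaks at {peak[0]} clients")
```

Master is behind at all but two client counts (1 and 8), and the gap is widest around and beyond the throughput peak, which matches the conclusion stated in the message.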
Attachments:
sto-comparison.png (image/png)