pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
This patch has not been benchmarked yet on a big NUMA machine, but
it seems like a good idea on general principle, and it seemed to
prevent an apparent 2.2% regression on a single-socket i7 box
running 200 connections at saturation load.
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/2201d801b03c2d1b0bce4d6580b718dc34d38b3e
Modified Files
--------------
src/backend/storage/ipc/procarray.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Kevin Grittner wrote:
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
old_snapshot_threshold is PGC_POSTMASTER, so this is okay AFAICS, but
perhaps it'd be a good idea to add a oneline comment to guc.c indicating
to verify this code if there's an intention to lift that limitation --
snapshots taken before the reload would have invalid lsn/timestamp, so
the current check would simply skip the check, which would be the wrong
thing to do.
I think it's reasonable to want to enable this feature with a simple
reload, so if we ever do that it'd be good to have a pointer about that
gotcha. (I'm not proposing you do that, only add the comment for a
future hacker.)
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 12, 2016 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Kevin Grittner wrote:
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.old_snapshot_threshold is PGC_POSTMASTER, so this is okay AFAICS, but
perhaps it'd be a good idea to add a oneline comment to guc.c indicating
to verify this code if there's an intention to lift that limitation --
snapshots taken before the reload would have invalid lsn/timestamp, so
the current check would simply skip the check, which would be the wrong
thing to do.I think it's reasonable to want to enable this feature with a simple
reload, so if we ever do that it'd be good to have a pointer about that
gotcha. (I'm not proposing you do that, only add the comment for a
future hacker.)
Perhaps, but this would be one of at least a dozen land mines that
exist for trying to modify this setting to be read on reload.
FWIW, I spent a fair amount of time trying to make it PGC_SIGHUP,
since it would be very nice to allow that; but I kept running into
one problem after another with it, some of which were very hard to
see how to fix. My inclination is that trying to comment all the
places that would need something done if we do this in some later
release would be distracting until such time as we get there, and
might give a false sense of security to anyone who fixes all the
places the comments were scattered.
If there is a consensus that the comments would be worthwhile, I
can do a pass over the code I had before I gave up on PGC_SIGHUP
and try to add comments to all the appropriate spots based on
differences due to when the GUC was changed. If we don't want
that, I'm not sure why this one spot is a better place for such a
comment than all the others.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Kevin Grittner <kgrittn@gmail.com> writes:
On Tue, Apr 12, 2016 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:old_snapshot_threshold is PGC_POSTMASTER, so this is okay AFAICS, but
perhaps it'd be a good idea to add a oneline comment to guc.c indicating
to verify this code if there's an intention to lift that limitation --
Perhaps, but this would be one of at least a dozen land mines that
exist for trying to modify this setting to be read on reload.
FWIW, I spent a fair amount of time trying to make it PGC_SIGHUP,
since it would be very nice to allow that; but I kept running into
one problem after another with it, some of which were very hard to
see how to fix.
It'd be good if you document the problems you found somewhere, before
you forget them, just in case somebody does want to try to lift the
restriction. I agree that scattered code comments wouldn't be the way.
Just a quick email to -hackers to get the info into the archives
might be enough.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Kevin Grittner wrote:
FWIW, I spent a fair amount of time trying to make it PGC_SIGHUP,
since it would be very nice to allow that; but I kept running into
one problem after another with it, some of which were very hard to
see how to fix. My inclination is that trying to comment all the
places that would need something done if we do this in some later
release would be distracting until such time as we get there, and
might give a false sense of security to anyone who fixes all the
places the comments were scattered.
Okay, that's fair.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.
FWIW, I could see massive regressions with just 64 connections.
I'm a bit scared of having an innoccuous sounding option regress things
by a factor of 10. I think, in addition to this fix, we need to actually
solve the scalability issue here to a good degree. One way to do so is
to apply the parts of 0001 in
http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
to apply the whole patch and simply put the lsn in an 8 byte atomic.
- Andres
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.FWIW, I could see massive regressions with just 64 connections.
With what settings? With or without the patch to avoid the locks when off?
I'm a bit scared of having an innoccuous sounding option regress things
by a factor of 10. I think, in addition to this fix, we need to actually
solve the scalability issue here to a good degree. One way to do so is
to apply the parts of 0001 in
http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
to apply the whole patch and simply put the lsn in an 8 byte atomic.
I think that we are well due for atomic access to aligned 8-byte
values. That would eliminate one potential hot spot in the
"snapshot too old" code, for sure.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On 2016-04-12 13:44:00 -0500, Kevin Grittner wrote:
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.FWIW, I could see massive regressions with just 64 connections.
With what settings?
You mean pgbench or postgres? The former -M prepared -c 64 -j 64 -S. The
latter just a large enough shared buffers to contains the scale 300
database, and adapted maintenance_work_mem. Nothing special.
With or without the patch to avoid the locks when off?
Without. Your commit message made it sound like you need unrealistic or
at least unusual numbers of connections, and that's afaics not the case.
I'm a bit scared of having an innoccuous sounding option regress things
by a factor of 10. I think, in addition to this fix, we need to actually
solve the scalability issue here to a good degree. One way to do so is
to apply the parts of 0001 in
http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
to apply the whole patch and simply put the lsn in an 8 byte atomic.I think that we are well due for atomic access to aligned 8-byte
values. That would eliminate one potential hot spot in the
"snapshot too old" code, for sure.
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.
FWIW, accessing a frequently changing value from a significant number of
connections, at a high frequency, isn't exactly free without a spinlock
either. But it should be much less bad.
Greetings,
Andres Freund
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On Tue, Apr 12, 2016 at 1:56 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 13:44:00 -0500, Kevin Grittner wrote:
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used. Just fill with dummy
values if we're not going to use them.FWIW, I could see massive regressions with just 64 connections.
With what settings?
You mean pgbench or postgres? The former -M prepared -c 64 -j 64 -S. The
latter just a large enough shared buffers to contains the scale 300
database, and adapted maintenance_work_mem. Nothing special.
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200. So,
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fix it. For example, I
used a "real" pg config, like I would for a production machine
(because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling)
and has tuning for THP, the deadline scheduler, the vm.*dirty*
settings, etc. Without knowing even the kernel and what tuning the
OS and pg have had on your box, I could take a lot of shots in the
dark without hitting anything. Oh, and the output of `numactl
--hardware` would be good to have. Thanks for all information you
can provide.
With or without the patch to avoid the locks when off?
Without. Your commit message made it sound like you need unrealistic or
at least unusual numbers of connections, and that's afaics not the case.
It was the only reported case to that point, so the additional data
point is valuable, if I can tell where that point is. And you
don't have any evidence that even with your configuration that any
performance regression remains for those who have the default value
for old_snapshot_threshold?
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Hi,
On 2016-04-12 14:17:12 -0500, Kevin Grittner wrote:
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200.
In a readonly test or r/w? A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.
So,
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fix
For example, I used a "real" pg config, like I would for a production
machine (because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling) and
has tuning for THP, the deadline scheduler, the vm.*dirty* settings,
etc. Without knowing even the kernel and what tuning the OS and pg
have had on your box, I could take a lot of shots in the dark without
hitting anything.
That shouldn't really matter much for a read-only, shared_buffer
resident, test? There's no IO and THP pretty much plays no role because
there's very few memory allocations (removing the pressure causing the
well known degradations).
Oh, and the output of `numactl --hardware` would be good to have.
Thanks for all information you can provide.
That was on Alexander's/PgPro's machine. Numactl wasn't installed, and I
didn't have root. But it has four numa domains (gathered via /sys/).
It was the only reported case to that point, so the additional data
point is valuable, if I can tell where that point is. And you
don't have any evidence that even with your configuration that any
performance regression remains for those who have the default value
for old_snapshot_threshold?
I haven't tested yet.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 12, 2016 at 2:28 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 14:17:12 -0500, Kevin Grittner wrote:
Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200.In a readonly test or r/w?
Readonly with client and job counts matching scale.
A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.
Yeah, I know, and 4 socket has been at least an order of magnitude
more problematic in my experience than 2 socket. And the problems
are far, far, far worse on kernels prior to 3.8, especially on 3.x
before 3.8, so it's hard to know how to take any report of problems
on a 4 node NUMA machine without knowing the kernel version.
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fixFor example, I used a "real" pg config, like I would for a production
machine (because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling) and
has tuning for THP, the deadline scheduler, the vm.*dirty* settings,
etc. Without knowing even the kernel and what tuning the OS and pg
have had on your box, I could take a lot of shots in the dark without
hitting anything.That shouldn't really matter much for a read-only, shared_buffer
resident, test? There's no IO and THP pretty much plays no role because
there's very few memory allocations (removing the pressure causing the
well known degradations).
I hate to assume which differences matter without trying, but some
of them seem less probable than others.
Oh, and the output of `numactl --hardware` would be good to have.
Thanks for all information you can provide.That was on Alexander's/PgPro's machine. Numactl wasn't installed, and I
didn't have root. But it has four numa domains (gathered via /sys/).
On the machines I've used, it will give you the hardware report
without being root. But of course, it can't do that if it's not
installed. I hadn't yet seen a machine with multiple NUMA memory
segments that didn't have the numactl executable installed; I'll
keep in mind that can happen.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On Tue, Apr 12, 2016 at 2:53 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
Readonly with client and job counts matching scale.
Single-socket i7, BTW.
A lot of this will be different between
single-socket and multi-socket servers; as soon as you have the latter
the likelihood of contention being bad goes up dramatically.Yeah, I know, and 4 socket has been at least an order of magnitude
more problematic in my experience than 2 socket. And the problems
are far, far, far worse on kernels prior to 3.8, especially on 3.x
before 3.8, so it's hard to know how to take any report of problems
on a 4 node NUMA machine without knowing the kernel version.
Also, with 4 node NUMA I have seen far better scaling with
hyper-threading turned off. I know there are environments where it
helps, but high-concurrency on multi-node NUMA is not one of them.
So, anyway, mentioning the HT setting is important, too.
Kevin Grittner
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
Andres Freund wrote:
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.
!!!
Be sure to consult with the RMT before doing anything of the sort.
It might as well decide to revert the whole patch.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016-04-12 23:52:14 -0300, Alvaro Herrera wrote:
Andres Freund wrote:
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.!!!
Be sure to consult with the RMT before doing anything of the sort.
I didn't plan to do anything without a few +1's. I don't think we can
release with the state of things as is though. I don't see a less
intrusive way than to get rid of that spinlock on all platforms capable
of significant concurrency.
So, RMT, what are your thoughts on this?
Andres
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 12, 2016 at 11:05 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-12 23:52:14 -0300, Alvaro Herrera wrote:
Andres Freund wrote:
I'm kinda inclined to apply that portion (or just the whole patch with
the spurious #ifdef 0 et al fixed) into 9.6; and add the necessary
checks in a few places. Because I really think this is likely to hit
unsuspecting users.!!!
Be sure to consult with the RMT before doing anything of the sort.
I didn't plan to do anything without a few +1's. I don't think we can
release with the state of things as is though. I don't see a less
intrusive way than to get rid of that spinlock on all platforms capable
of significant concurrency.So, RMT, what are your thoughts on this?
I think that a significant performance regression which affects people
not using snapshot_too_old would be a stop-ship issue, but I disagree
that an issue which only affects people using the feature is a
must-fix. It may be desirable to fix it, but I don't think we should
regard it as a hard requirement. It's reasonable to fix some kinds of
issues after feature freeze, but not at the price of accepting
arbitrary amounts of new code that may have problems of its own.
Every release will have some warts.
My testing yesterday of latest master, specifically
deb71fa9713dfe374a74fc58a5d298b5f25da3f5, last night did not show
evidence of a regression under heavy concurrency, as per
/messages/by-id/CA+TgmobpHAqsOeHc-ooRsjzTKw1H4s4P1VBtwh1KkKO+6Mp8_Q@mail.gmail.com
- that test was of course run without enabling "snapshot too old".
My guess is that 2201d801b03c2d1b0bce4d6580b718dc34d38b3e was
sufficient to put things right, and that we now have a problem only
when "snapshot too old" is enabled.
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now. Also noteworthy is the
fact that, by itself, such a patch cannot break anything except
perhaps the build, for, lo!, unused macros and functions do not do
anything. On the whole, I think that putting such a patch into
PostgreSQL 9.6 is likely to save us more pain than it causes us. I
would be disinclined to endorse applying part of it, because that
seems likely to complicate back-patching for no real gain.
Of course, the real fly in the ointment here is what we're going to do
with the atomics once we have them. But AFAICS, there's no patch for
that, yet. I don't think that I wish to take a position on whether a
patch that hasn't been written yet should be applied. So I think the
next step is that you should post the patches that you think should be
applied in final form and those should be reviewed by knowledgeable
people. Then, based on those reviews, the RMT can decide what to do.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.
What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas wrote:
On Tue, Apr 12, 2016 at 11:05 PM, Andres Freund <andres@anarazel.de> wrote:
I didn't plan to do anything without a few +1's. I don't think we can
release with the state of things as is though. I don't see a less
intrusive way than to get rid of that spinlock on all platforms capable
of significant concurrency.So, RMT, what are your thoughts on this?
I think that a significant performance regression which affects people
not using snapshot_too_old would be a stop-ship issue,
Agreed.
but I disagree that an issue which only affects people using the
feature is a must-fix.
Agreed.
It's reasonable to fix some kinds of
issues after feature freeze, but not at the price of accepting
arbitrary amounts of new code that may have problems of its own.
Every release will have some warts.
Agreed.
The patch being proposed for commit is fiddly architecture-specific
stuff which is likely to destabilize the tree for quite some time, and
cause lots of additional work to Andres and anyone else likely to work
on such low-level details, such as Robert, both of which already have
plenty to do.
The snapshot-too-old feature is said to be great and shows lots of
improvement in certain cases, and no regression can be measured for
those who have it turned off. The regression only seems to show up if
you turn it on and have a crazily high rate of read-only transactions.
I think this can wait for 9.7.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Apr 13, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
We fall back to emulating it using spinlocks. This isn't really an
issue in practice because 32-bit x86 has native 64-bit atomics, and
it's hard to point to another 32-bit platform that is likely to be
have enough concurrency for the lack of 64-bit atomics to matter.
Actually, it looks like we have 64-bit atomics in the tree already;
it's only the fallback implementation that is missing (so anything you
do using 64-bit atomics would need an alternate implementation that
did not rely on them).
But the really interesting that the patch to which Andres linked does
is introduce machinery to try to determine whether a platform has
8-byte single-copy atomicity; that is, whether a load or store of an
aligned 8-byte value is guaranteed not to be torn. We currently avoid
assuming that, but this requires additional spinlocks in a significant
number of places; the regression seen using "snapshot too old" at high
concurrency is merely the tip of the iceberg. And the annoying thing
about avoiding that assumption is that it actually is true on pretty
much every modern platform. Look at this gem Andres wrote in that
patch:
+/*
+ * 8 byte reads / writes have single-copy atomicity on 32 bit x86 platforms
+ * since at least the 586. As well as on all x86-64 cpus.
+ */
+#if defined(__i568__) || defined(__i668__) || /* gcc i586+ */ \
+ (defined(_M_IX86) && _M_IX86 >= 500) || /* msvc i586+ */ \
+ defined(__x86_64__) || defined(__x86_64) || defined(_M_X64) /*
gcc, sunpro, msvc */
+#define PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
+#endif /* 8 byte single-copy atomicity */
I don't know if that test is actually correct, and I wonder about
compile-time environment vs. run-time environment, but I have my
doubts about how well PostgreSQL 9.6 would run on an i486. I doubt
that is the platform for which we should be optimizing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.
What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?
We fall back to emulating it using spinlocks.
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
This isn't really an
issue in practice because 32-bit x86 has native 64-bit atomics, and
it's hard to point to another 32-bit platform that is likely to be
have enough concurrency for the lack of 64-bit atomics to matter.
It's not concurrency I'm worried about, it's the sheer overhead of
going through the spinlock code.
I'd be okay with atomics that were defined as "pointer width", if
we have a need for that, but I'm suspicious of 64-bits-exactly.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Apr 13, 2016 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 13, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I have never understood why you didn't include 64-bit atomics in the
original atomics implementation, and I really think we should have
committed a patch to add them long before now.What will you do on 32-bit platforms (or, more generally, anything
lacking 64-bit-wide atomics)?We fall back to emulating it using spinlocks.
That's what I thought you were going to say, and it means that any
"performance improvement" patch that relies on 64-bit atomics in hotspot
code paths is going to be a complete disaster on anything but modern Intel
hardware. I'm not sure that's a direction we want to go in. We need to
stick to a set of atomics that's pretty widely portable.
I think 64-bit atomics *are* pretty widely portable. Can you name a
system with more than 4 CPU cores that doesn't support them?
This isn't really an
issue in practice because 32-bit x86 has native 64-bit atomics, and
it's hard to point to another 32-bit platform that is likely to be
have enough concurrency for the lack of 64-bit atomics to matter.It's not concurrency I'm worried about, it's the sheer overhead of
going through the spinlock code.
I'm not sure I understand exactly what the concern is here. I agree
that there is a possibility that any patch which uses 64-bit atomics
could regress performance on platforms that do not support 64-bit
atomics. That's why I argued initially against having fallbacks for
*any* atomic operations; I was of the opinion that we should be
prepared to carry two implementations of anything that was going to
depend on atomics. I lost that argument, perhaps for the best. I
think one of the problems here is that very few of us have any
hardware available which we could even use to test performance on
systems that lack support for both 32 and 64 bit atomics. We can
compile without atomics on the hardware we do have and see how that
goes, but that's not necessarily indicative of what will happen on
some altogether different CPU architecture. In some cases there might
be an emulator, like the VAX emulator Greg Stark was playing with, but
that's not necessarily indicative either, and also, really, who cares?
I think it would be cool if somebody started a project to try to
optimize the performance of PostgreSQL on, say, a Raspberry Pi. Then
we might learn whether any of this stuff actually matters there or
whether the problems are completely elsewhere (like too much
per-backend memory consumption). However, for reasons that are
probably sort of obvious, I doubt I'll have much luck getting
EnterpriseDB to fund work on that project - if it ever happens, it
will probably have to be the work of a dedicated hobbiest, or somebody
who has a tangible need to build an embedded system using PostgreSQL.
I'd be okay with atomics that were defined as "pointer width", if
we have a need for that, but I'm suspicious of 64-bits-exactly.
I think LSNs are an important case, and they are not pointer width.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers