Re: better atomics

Started by Robert Haas about 12 years ago, 21 messages
#1 Robert Haas
robertmhaas@gmail.com

On Wed, Oct 16, 2013 at 12:52 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I have a related problem, which is that some code I'm currently
working on vis-a-vis parallelism can run lock-free on platforms with
atomic 8-byte assignment but needs a spinlock or two elsewhere. So I'd
want to use pg_atomic_store_u64(), but I'd also need a clean way to
test, at compile time, whether it's available.

Yes, definitely. There should be a couple of #defines that declare
whether non-prerequisite options are supported. I'd guess we want at least:
* 8byte math
* 16byte compare_and_swap

I'm not terribly excited about relying on 16-byte CAS, but I agree
that 8-byte math, at least, is important. I've not been successful in
finding any evidence that gcc has preprocessor symbols to tell us
about the properties of 8-byte loads and stores. The closest thing
that I found is:

http://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

That page provides intrinsics for 8-byte atomic loads and stores,
among other things. But it seems that the only method for determining
whether they work on a particular target is to compile a test program
and see whether it complains about __atomic_load_n and/or
__atomic_store_n being unresolved symbols. I suppose we could add a
configure test for that. Yuck.
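
A minimal sketch of such a test program (checking the 64-bit variants;
untested, illustration only) might be:

#include <stdint.h>

int
main(void)
{
    static uint64_t v;
    uint64_t x;

    /* fails to link where the builtins turn into unresolved library calls */
    x = __atomic_load_n(&v, __ATOMIC_SEQ_CST);
    __atomic_store_n(&v, x + 1, __ATOMIC_SEQ_CST);
    return 0;
}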

Anyone have a better idea?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#2 Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#1)

On 2013-10-28 14:10:48 -0400, Robert Haas wrote:

On Wed, Oct 16, 2013 at 12:52 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I have a related problem, which is that some code I'm currently
working on vis-a-vis parallelism can run lock-free on platforms with
atomic 8-byte assignment but needs a spinlock or two elsewhere. So I'd
want to use pg_atomic_store_u64(), but I'd also need a clean way to
test, at compile time, whether it's available.

Yes, definitely. There should be a couple of #defines that declare
whether non-prerequisite options are supported. I'd guess we want at least:
* 8byte math
* 16byte compare_and_swap

I'm not terribly excited about relying on 16-byte CAS, but I agree
that 8-byte math, at least, is important. I've not been successful in
finding any evidence that gcc has preprocessor symbols to tell us
about the properties of 8-byte loads and stores. The closest thing
that I found is:

I am talking about making 16byte CAS explicitly optional though? I think
if code wants to optionally make use of it (e.g. on x86 where it's been
available basically forever) that's fine. It just has to be optional.
Or are you saying you're simply not interested in 16byte CAS generally?

Same thing for 8byte math - there's no chance that that is going to be
supported over all platforms anytime soon, so it has to be optional.

http://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

That page provides intrinsics for 8-byte atomic loads and stores,
among other things. But it seems that the only method for determining
whether they work on a particular target is to compile a test program
and see whether it complains about __atomic_load_n and/or
__atomic_store_n being unresolved symbols. I suppose we could add a
configure test for that. Yuck.

Well, that's pretty easy to integrate into configure - and works on
crossbuilds. So I think that'd be ok.

But I actually think this is going to be a manual thing, atomic 8byte
math will fall back to kernel emulation on several targets, so we
probably want some defines to explicitly declare it supported on targets
where that's not the case.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#3 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#2)

On Mon, Oct 28, 2013 at 2:19 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I'm not terribly excited about relying on 16-byte CAS, but I agree
that 8-byte math, at least, is important. I've not been successful in
finding any evidence that gcc has preprocessor symbols to tell us
about the properties of 8-byte loads and stores. The closest thing
that I found is:

I am talking about making 16byte CAS explicitly optional though? I think
if code wants to optionally make use of it (e.g. on x86 where it's been
available basically forever) that's fine. It just has to be optional.
Or are you saying you're simply not interested in 16byte CAS generally?

I am just not interested in it generally. Relying on too many OS or
architecture-specific primitives has several disadvantages. It's
going to be a nuisance on more obscure platforms where they may or may
not be supported and may or may not work right even if supported.
Even once we get things working right everywhere, it'll mean that
performance may suffer on non-mainstream platforms. And I think in
many cases it may suggest that we're using an architecture-specific
quirk to work around a fundamental problem with our algorithms. I'm
more or less convinced of the need for 8-byte atomic reads and writes,
test-and-set, 8-byte compare-and-swap, and 8-byte fetch-and-add. But
beyond that I'm less sold. Most of the academic papers I've read on
implementing lock-free or highly-parallel constructs attempt to
confine themselves to 8-byte operations with 8-byte compare-and-swap,
and I'm a bit disposed to think we ought to try to hew to that as
well. I'm not dead set against going further, but I lean against it,
for all of the reasons mentioned above.

Same thing for 8byte math - there's no chance that that is going to be
supported over all platforms anytime soon, so it has to be optional.

Agreed.

http://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

That page provides intrinsics for 8-byte atomic loads and stores,
among other things. But it seems that the only method for determining
whether they work on a particular target is to compile a test program
and see whether it complains about __atomic_load_n and/or
__atomic_store_n being unresolved symbols. I suppose we could add a
configure test for that. Yuck.

Well, that's pretty easy to integrate into configure - and works on
crossbuilds. So I think that'd be ok.

But I actually think this is going to be a manual thing, atomic 8byte
math will fall back to kernel emulation on several targets, so we
probably want some defines to explicitly declare it supported on targets
where that's not the case.

Hmm, OK. I had not come across such cases. There are architectures
where it Just Works (like x86_64), architectures where you can make it
work by using special instructions (like x86), and (presumably)
architectures where it doesn't work at all. Places where it works
using really slow kernel emulation are yet another category to worry
about. Unfortunately, I have not found any good source that describes
which architectures fall into which category. Without that, pulling
this together seems intimidating, unless we just declare it working
for x86_64, disable it everywhere else, and wait for people running on
other architectures to complain.

I wonder whether it'd be safe to assume that any machine where
pointers are 8 bytes has 8-byte atomic loads and stores. I bet there
is a counterexample somewhere. :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#4 Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#3)

On 2013-10-28 15:02:41 -0400, Robert Haas wrote:

On Mon, Oct 28, 2013 at 2:19 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I'm not terribly excited about relying on 16-byte CAS, but I agree
that 8-byte math, at least, is important. I've not been successful in
finding any evidence that gcc has preprocessor symbols to tell us
about the properties of 8-byte loads and stores. The closest thing
that I found is:

I am talking about making 16byte CAS explicitly optional though? I think
if code wants to optionally make use of it (e.g. on x86 where it's been
available basically forever) that's fine. It just has to be optional.
Or are you saying you're simply not interested in 16byte CAS generally?

I am just not interested in it generally. Relying on too many OS or
architecture-specific primitives has several disadvantages. It's
going to be a nuisance on more obscure platforms where they may or may
not be supported and may or may not work right even if supported.
Even once we get things working right everywhere, it'll mean that
performance may suffer on non-mainstream platforms.

But it'll suffer only relative to the *increased* performance on modern
platforms; it shouldn't suffer noticeably compared to the previous
performance on the older platform if we're doing our homework...

Most of the academic papers I've read on
implementing lock-free or highly-parallel constructs attempt to
confine themselves to 8-byte operations with 8-byte compare-and-swap,
and I'm a bit disposed to think we ought to try to hew to that as
well. I'm not dead set against going further, but I lean against it,
for all of the reasons mentioned above.

I think there are quite a few algorithms relying on 16byte CAS, that's
why I was thinking about it at all. I think it's easier to add support
for it while we're trawling through the compilers anyway, but I won't
argue much for it otherwise for now.

But I actually think this is going to be a manual thing, atomic 8byte
math will fall back to kernel emulation on several targets, so we
probably want some defines to explicitly declare it supported on targets
where that's not the case.

Hmm, OK. I had not come across such cases.

E.g. ARM/linux which we probably cannot ignore.

Places where it works
using really slow kernel emulation are yet another category to worry
about. Unfortunately, I have not found any good source that describes
which architectures fall into which category. Without that, pulling
this together seems intimidating, unless we just declare it working
for x86_64, disable it everywhere else, and wait for people running on
other architectures to complain.

Yes, I think whitelisting targets is a sensible approach here. I am pretty
sure we can do better than just x86-64, but that's easily doable in an
incremental fashion.
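
For example, something along these lines in the relevant header (the
macro name here is purely illustrative):

/* whitelist 8 byte atomics per target; everything else stays disabled */
#if defined(__x86_64__) || defined(_M_X64)
#define PG_HAVE_8BYTE_ATOMICS
#endif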

I wonder whether it'd be safe to assume that any machine where
pointers are 8 bytes has 8-byte atomic loads and stores. I bet there
is a counterexample somewhere. :-(

Sparc64 :(.

Btw, could you quickly give some keywords what you're thinking about
making lockless?
I mainly am thinking about lwlocks and buffer pins so far. Nothing
really ambitious.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#5 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#4)

On Mon, Oct 28, 2013 at 3:32 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I wonder whether it'd be safe to assume that any machine where
pointers are 8 bytes has 8-byte atomic loads and stores. I bet there
is a counterexample somewhere. :-(

Sparc64 :(.

Btw, could you quickly give some keywords what you're thinking about
making lockless?
I mainly am thinking about lwlocks and buffer pins so far. Nothing
really ambitious.

Well, I was going to use it for some code I'm working on for
parallelism, but I just tested the overhead of a spinlock, and it was
zero, possibly negative. So I may not have an immediate application.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#6 Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#4)

On 28.10.2013 21:32, Andres Freund wrote:

On 2013-10-28 15:02:41 -0400, Robert Haas wrote:

Most of the academic papers I've read on
implementing lock-free or highly-parallel constructs attempt to
confine themselves to 8-byte operations with 8-byte compare-and-swap,
and I'm a bit disposed to think we ought to try to hew to that as
well. I'm not dead set against going further, but I lean against it,
for all of the reasons mentioned above.

I think there are quite a few algorithms relying on 16byte CAS, that's
why I was thinking about it at all. I think it's easier to add support
for it while we're trawling through the compilers anyway, but I won't
argue much for it otherwise for now.

Many algorithms require a 2*(pointer width) CAS instruction. On 64-bit
platforms that's 16 bytes, but on 32-bit platforms an 8 byte version
will suffice.

- Heikki


#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#6)

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 28.10.2013 21:32, Andres Freund wrote:

I think there are quite a few algorithms relying on 16byte CAS, that's
why I was thinking about it at all. I think it's easier to add support
for it while we're trawling through the compilers anyway, but I won't
argue much for it otherwise for now.

Many algorithms require a 2*(pointer width) CAS instruction. On 64-bit
platforms that's 16 bytes, but on 32-bit platforms an 8 byte version
will suffice.

You're both just handwaving. How many is "many", and which ones might
we actually have enough use for to justify dealing with such a dependency?
I don't think we should buy into this without some pretty concrete
justification.

regards, tom lane


#8 Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#7)

On 2013-10-28 16:06:47 -0400, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 28.10.2013 21:32, Andres Freund wrote:

I think there are quite a few algorithms relying on 16byte CAS, that's
why I was thinking about it at all. I think it's easier to add support
for it while we're trawling through the compilers anyway, but I won't
argue much for it otherwise for now.

Many algorithms require a 2*(pointer width) CAS instruction. On 64-bit
platforms that's 16 bytes, but on 32-bit platforms an 8 byte version
will suffice.

You're both just handwaving. How many is "many", and which ones might
we actually have enough use for to justify dealing with such a dependency?
I don't think we should buy into this without some pretty concrete
justification.

Well, I don't think either of us is suggesting to make it required. But
it's going to be painful to go through the list of compilers repeatedly
instead of just adding all operations at once. I am willing to do that
for several compilers once, but I don't want to do it in each and every
feature submission using another atomic op.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-10-28 16:06:47 -0400, Tom Lane wrote:

You're both just handwaving. How many is "many", and which ones might
we actually have enough use for to justify dealing with such a dependency?
I don't think we should buy into this without some pretty concrete
justification.

Well, I don't think either of us is suggesting to make it required. But
it's going to be painful to go through the list of compilers repeatedly
instead of just adding all operations at once. I am willing to do that
for several compilers once, but I don't want to do it in each and every
feature submission using another atomic op.

Basically I'm arguing that that choice is backwards. We should decide
first what the list of supported atomics is going to be, and each and
every element of that list has to have a convincing concrete argument
why we need to support it. Not "we might want this someday". Because
when someday comes, that's when we'd be paying the price of finding out
which platforms actually support the primitive correctly and with what
performance. Or if someday never comes, we're not ahead either.

Or if you like: no matter what you do, the first feature submission
that makes use of a given atomic op is going to suffer pain. I don't
want to still be suffering that pain two or three years out. The shorter
the list of allowed atomic ops, the sooner we'll be done with debugging
them.

regards, tom lane


#10 Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#9)

On 2013-10-28 16:29:35 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-10-28 16:06:47 -0400, Tom Lane wrote:

You're both just handwaving. How many is "many", and which ones might
we actually have enough use for to justify dealing with such a dependency?
I don't think we should buy into this without some pretty concrete
justification.

Well, I don't think either of us is suggesting to make it required. But
it's going to be painful to go through the list of compilers repeatedly
instead of just adding all operations at once. I am willing to do that
for several compilers once, but I don't want to do it in each and every
feature submission using another atomic op.

Basically I'm arguing that that choice is backwards. We should decide
first what the list of supported atomics is going to be, and each and
every element of that list has to have a convincing concrete argument
why we need to support it.

The list I have previously suggested was:

* uint32 pg_atomic_load_u32(uint32 *ptr)
* void pg_atomic_store_u32(uint32 *ptr, uint32 val)

To be able to write code without spreading volatiles all over while
making it very clear that that part of the code is worthy of attention.

* uint32 pg_atomic_exchange_u32(uint32 *ptr, uint32 val)
* bool pg_atomic_compare_exchange_u32(uint32 *ptr, uint32 *expected, uint32 newval)

Used in most lockfree algorithms; can also be used to provide fallbacks
when any of the other 32bit operations is not implemented for a platform
(so it's easier to bring up a platform) - a sketch of such a fallback
follows this list.
E.g. used in the wait-free LW_SHARED patch.

* uint32 pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add)

Plain math.
E.g. used in the wait-free LW_SHARED patch.

* uint32 pg_atomic_fetch_and_u32(uint32 *ptr, uint32 add)
* uint32 pg_atomic_fetch_or_u32(uint32 *ptr, uint32 add)

Useful for (un-)setting flags atomically. Needed for (my approach to)
spinlock-less Pin/UnpinBuffer.

* u64 variants of the above

Lockfree list manipulation needs pointer-width variants of at least
compare_exchange().

I *personally* don't have a case for 8byte math yet, but I am pretty
sure parallel sort might be interested in it.

* bool pg_atomic_test_set(void *ptr)
* void pg_atomic_clear(void *ptr)

Can be used as the implementation for spinlocks based on compiler
intrinsics if no native implementation is existing. Makes it easier to
bring up new platforms.

Wrappers around the above:
* uint32 pg_atomic_fetch_sub_u32(uint32 *ptr, uint32 add)
Generic wrapper that imo makes the code easier to read. No
per-platform/compiler overhead.

* pg_atomic_add_until_u32(uint32 *ptr, uint32 val, uint32 limit)
Useful for managing refcounts like pin counts.

* pg_atomic_(add|sub|and|or)_fetch_u32()
Generic wrappers for more legible code. No
per-platform/compiler overhead.

* pg_atomic_compare_and_swap_128()

Many algorithms are faster (e.g. some lockless hashtables, which'd be
great for the buffer mapping) when a double-pointer-width CAS is
available, but also work with a pointer-width CAS in a less efficient
manner.
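
As a sketch (not proposed code, just illustrating the fallback idea with
the names from the list above), fetch_add built on compare_exchange
would look roughly like:

static inline uint32
pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add_)
{
    uint32 old = pg_atomic_load_u32(ptr);

    /* retry until we install old + add_; a failed CAS refreshes "old" */
    while (!pg_atomic_compare_exchange_u32(ptr, &old, old + add_))
        ;

    return old;
}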

Or if you like: no matter what you do, the first feature submission
that makes use of a given atomic op is going to suffer pain. I don't
want to still be suffering that pain two or three years out. The shorter
the list of allowed atomic ops, the sooner we'll be done with debugging
them.

Yea. As, partially, even evidenced by this discussion ;).

Which would you like to see go?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#11 Andres Freund
andres@2ndquadrant.com
In reply to: Andres Freund (#10)

Hi,

(Coming back to this now)

On 2013-10-28 21:55:22 +0100, Andres Freund wrote:

The list I have previously suggested was:

* uint32 pg_atomic_load_u32(uint32 *ptr)
* void pg_atomic_store_u32(uint32 *ptr, uint32 val)

To be able to write code without spreading volatiles all over while
making it very clear that that part of the code is worthy of attention.

* uint32 pg_atomic_exchange_u32(uint32 *ptr, uint32 val)
* bool pg_atomic_compare_exchange_u32(uint32 *ptr, uint32 *expected, uint32 newval)

So what I've previously proposed did use plain types for the
to-be-manipulated atomic integers. I am not sure about that anymore
though: C11 includes support for atomic operations, but it assumes that
the to-be-manipulated variable is of a special "atomic" type. While I'm
not proposing that we try to emulate C11's API completely (basically
impossible as it uses type-agnostic functions, and it also exposes too
much), it seems to make sense to allow PG's atomics to be implemented by
C11's.

The API is described at: http://en.cppreference.com/w/c/atomic

The fundamental difference to what I've suggested so far is that the
atomic variable isn't a plain integer/type but a distinct datatype that
can *only* be manipulated via the atomic operators. That has the nice
property that it cannot be accidentally manipulated via plain operators.

E.g. it wouldn't be
uint32 pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add_);
but
uint32 pg_atomic_fetch_add_u32(pg_atomic_uint32 *ptr, uint32 add_);

where, for now, pg_atomic_uint32 is something like:
typedef struct pg_atomic_uint32
{
    volatile uint32 value;
} pg_atomic_uint32;
a future C11 implementation would just typedef C11's respective atomic
type to pg_atomic_uint32.
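
For illustration (hypothetical variable name, and assuming an init
function along the lines of pg_atomic_init_u32()), callers would then
look like:

pg_atomic_uint32 nwaiters;

pg_atomic_init_u32(&nwaiters, 0);
(void) pg_atomic_fetch_add_u32(&nwaiters, 1);

nwaiters += 1;    /* does not compile: the struct can only be
                   * manipulated through the pg_atomic_* functions */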

Comments?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#12 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#11)

On Tue, Nov 5, 2013 at 10:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:

Hi,

(Coming back to this now)

On 2013-10-28 21:55:22 +0100, Andres Freund wrote:

The list I have previously suggested was:

* uint32 pg_atomic_load_u32(uint32 *ptr)
* void pg_atomic_store_u32(uint32 *ptr, uint32 val)

To be able to write code without spreading volatiles all over while
making it very clear that that part of the code is worthy of attention.

* uint32 pg_atomic_exchange_u32(uint32 *ptr, uint32 val)
* bool pg_atomic_compare_exchange_u32(uint32 *ptr, uint32 *expected, uint32 newval)

So what I've previously proposed did use plain types for the
to-be-manipulated atomic integers. I am not sure about that anymore
though: C11 includes support for atomic operations, but it assumes that
the to-be-manipulated variable is of a special "atomic" type. While I'm
not proposing that we try to emulate C11's API completely (basically
impossible as it uses type-agnostic functions, and it also exposes too
much), it seems to make sense to allow PG's atomics to be implemented by
C11's.

The API is described at: http://en.cppreference.com/w/c/atomic

The fundamental difference to what I've suggested so far is that the
atomic variable isn't a plain integer/type but a distinct datatype that
can *only* be manipulated via the atomic operators. That has the nice
property that it cannot be accidentally manipulated via plain operators.

E.g. it wouldn't be
uint32 pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add_);
but
uint32 pg_atomic_fetch_add_u32(pg_atomic_uint32 *ptr, uint32 add_);

where, for now, pg_atomic_uint32 is something like:
typedef struct pg_atomic_uint32
{
    volatile uint32 value;
} pg_atomic_uint32;
a future C11 implementation would just typedef C11's respective atomic
type to pg_atomic_uint32.

Comments?

Sadly, I don't have a clear feeling for how well or poorly that's
going to work out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#13 Ants Aasma
ants@cybertec.at
In reply to: Andres Freund (#11)

On Tue, Nov 5, 2013 at 5:51 PM, Andres Freund <andres@2ndquadrant.com> wrote:

So what I've previously proposed did use plain types for the
to-be-manipulated atomic integers. I am not sure about that anymore
though: C11 includes support for atomic operations, but it assumes that
the to-be-manipulated variable is of a special "atomic" type. While I'm
not proposing that we try to emulate C11's API completely (basically
impossible as it uses type-agnostic functions, and it also exposes too
much), it seems to make sense to allow PG's atomics to be implemented by
C11's.

The API is described at: http://en.cppreference.com/w/c/atomic

The fundamental difference to what I've suggested so far is that the
atomic variable isn't a plain integer/type but a distinct datatype that
can *only* be manipulated via the atomic operators. That has the nice
property that it cannot be accidentally manipulated via plain operators.

E.g. it wouldn't be
uint32 pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add_);
but
uint32 pg_atomic_fetch_add_u32(pg_atomic_uint32 *ptr, uint32 add_);

where, for now, pg_atomic_uint32 is something like:
typedef struct pg_atomic_uint32
{
    volatile uint32 value;
} pg_atomic_uint32;
a future C11 implementation would just typedef C11's respective atomic
type to pg_atomic_uint32.

Comments?

C11 atomics need to be initialized through atomic_init(), so a simple
typedef will not be correct here. Also, C11 atomics are designed to
have the compiler take care of memory barriers - so they might not
always be a perfect match to our API's, or any API implementable
without compiler support.
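
For reference, the C11 shape being discussed looks roughly like this
(plain <stdatomic.h>; names purely illustrative):

#include <stdatomic.h>
#include <stdint.h>

static atomic_uint_least32_t counter;

void
counter_setup(void)
{
    atomic_init(&counter, 0);    /* required before first use */
}

uint32_t
counter_bump(void)
{
    /* the compiler picks the barriers; the default ordering is seq_cst */
    return atomic_fetch_add(&counter, 1);
}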

However I'm mildly supportive on having a separate type for variables
accessed by atomics. It can result in some unnecessary code churn, but
in general if an atomic access to a variable is added, then all other
accesses to it need to be audited for memory barriers, even if they
were previously correctly synchronized by a lock.

I guess the best approach for deciding would be to try to convert a
couple of the existing unlocked accesses to the API and see what the
patch looks like.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


#14 Andres Freund
andres@2ndquadrant.com
In reply to: Ants Aasma (#13)

On 2013-11-06 16:48:17 +0200, Ants Aasma wrote:

C11 atomics need to be initialized through atomic_init(), so a simple
typedef will not be correct here. Also, C11 atomics are designed to
have the compiler take care of memory barriers - so they might not
always be a perfect match to our API's, or any API implementable
without compiler support.

Yea, I think we'll always need to have our layer above C11 atomics, but
it still will be useful to allow them to be used to implement our
atomics.

FWIW, I find the requirement for atomic_init() really, really
annoying. Not that that will change anything ;)

However I'm mildly supportive on having a separate type for variables
accessed by atomics. It can result in some unnecessary code churn, but
in general if an atomic access to a variable is added, then all other
accesses to it need to be audited for memory barriers, even if they
were previously correctly synchronized by a lock.

Ok, that's what I am writing right now.

I guess the best approach for deciding would be to try to convert a
couple of the existing unlocked accesses to the API and see what the
patch looks like.

I don't think there really exist any interesting ones? I am using my
lwlock reimplementation as a testbed so far.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#15 Ants Aasma
ants@cybertec.at
In reply to: Andres Freund (#14)

On Wed, Nov 6, 2013 at 4:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:

FWIW, I find the requirement for atomic_init() really, really
annoying. Not that that will change anything ;)

Me too, but I guess this allows them to emulate atomics on platforms
that don't have native support. Whether that is a good idea is a
separate question.

However I'm mildly supportive on having a separate type for variables
accessed by atomics. It can result in some unnecessary code churn, but
in general if an atomic access to a variable is added, then all other
accesses to it need to be audited for memory barriers, even if they
were previously correctly synchronized by a lock.

Ok, that's what I am writing right now.

I guess the best approach for deciding would be to try to convert a
couple of the existing unlocked accesses to the API and see what the
patch looks like.

I don't think there really exist any interesting ones? I am using my
lwlock reimplementation as a testbed so far.

BufferDesc management in src/backend/storage/buffer/bufmgr.c.
The seqlock-like thing used to provide consistent reads from
PgBackendStatus in src/backend/postmaster/pgstat.c (this code looks like
it's broken on weakly ordered machines)
The unlocked reads that are done from various procarray variables.
The XLogCtl accesses in xlog.c.

IMO the volatile keyword should be confined to the code actually
implementing atomics and locking. The "use volatile pointer to prevent
code rearrangement" comment is evil. If there are data races in the
code, they need comments pointing this out and probably memory
barriers too. If there are no data races, then the volatile
declaration is useless and a pessimization. Currently it's more of a
catch-all "here be dragons" declaration for data structures.
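
A minimal sketch of the kind of explicit annotation meant here
(hypothetical shared fields; pg_write_barrier/pg_read_barrier are the
existing barrier primitives):

/* writer */
shared->result = computed_value;
pg_write_barrier();        /* result must be visible before the flag */
shared->result_ready = true;

/* reader */
if (shared->result_ready)
{
    pg_read_barrier();     /* pairs with the write barrier above */
    use(shared->result);
}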

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


#16 Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#12)
1 attachment(s)

On 2013-11-06 08:15:59 -0500, Robert Haas wrote:

The API is described at: http://en.cppreference.com/w/c/atomic

The fundamental difference to what I've suggested so far is that the
atomic variable isn't a plain integer/type but a distinct datatype that
can *only* be manipulated via the atomic operators. That has the nice
property that it cannot be accidentally manipulated via plain operators.

E.g. it wouldn't be
uint32 pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add_);
but
uint32 pg_atomic_fetch_add_u32(pg_atomic_uint32 *ptr, uint32 add_);

where, for now, pg_atomic_uint32 is something like:
typedef struct pg_atomic_uint32
{
    volatile uint32 value;
} pg_atomic_uint32;
a future C11 implementation would just typedef C11's respective atomic
type to pg_atomic_uint32.

Comments?

Sadly, I don't have a clear feeling for how well or poorly that's
going to work out.

I've implemented it that way and it imo looks sensible. I dislike the
pg_atomic_init_(flag|u32|u64) calls that are forced on us by wanting to
be compatible with C11, but I think that doesn't end up being too bad.

So, attached is a very first draft of an atomics implementation. There's
lots to be done, but this is roughly how I think it should look and what
things we should implement in the beginning.
The API should be understandable from looking at
src/include/storage/atomics.h

I haven't tested the IBM xlc, HPUX IA64 acc and Sunpro fallbacks at all, but
the important part is that those provide the necessary tools to
implement everything. Also, even the gcc implementation falls back to
compare_and_swap() based implementations for the operations, but that
shouldn't matter for now. I haven't looked for other compilers yet.

Questions:
* Fundamental issues with the API?
* should we split off compiler/arch-specific parts into include/ports
or such? -0.5 from me.
* should we add a src/include/storage/atomics subdirectory? +0.5 from me.
* Do we really want to continue to cater to compilers not supporting
PG_USE_INLINE? I've tried to add support for them, but it does make
things considerably more #ifdef-y.
Afaik it's only HPUX's acc that has problems, and it seems to work but
generates warnings, so maybe that's ok?
* Which additional compilers do we want to support? icc (might be gcc
compatible)?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

0002-Very-basic-atomic-ops-implementation.patch (text/x-patch; charset=iso-8859-1)
From 04a8dc8516323eb404a7a3392f950763497a76ec Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 6 Nov 2013 10:32:53 +0100
Subject: [PATCH 2/3] Very basic atomic ops implementation

Only gcc has been tested, stub implementations for
* msvc
* HUPX acc
* IBM xlc
* Sun/Oracle Sunpro compiler
are included.

All implementations only provide the bare minimum of operations for now and
rely on emulating more complex operations via compare_and_swap().
---
 config/c-compiler.m4                         |  77 +++++
 configure.in                                 |  19 +-
 src/backend/storage/lmgr/Makefile            |   2 +-
 src/backend/storage/lmgr/atomics.c           |  23 ++
 src/include/c.h                              |   7 +
 src/include/pg_config.h.in                   |  11 +-
 src/include/storage/atomics-arch-arm.h       |  29 ++
 src/include/storage/atomics-generic-acc.h    |  99 +++++++
 src/include/storage/atomics-generic-gcc.h    | 154 ++++++++++
 src/include/storage/atomics-generic-msvc.h   |  58 ++++
 src/include/storage/atomics-generic-sunpro.h |  66 +++++
 src/include/storage/atomics-generic-xlc.h    |  84 ++++++
 src/include/storage/atomics-generic.h        | 290 +++++++++++++++++++
 src/include/storage/atomics.h                | 415 +++++++++++++++++++++++++++
 src/include/storage/s_lock.h                 |  10 +-
 15 files changed, 1325 insertions(+), 19 deletions(-)
 create mode 100644 src/backend/storage/lmgr/atomics.c
 create mode 100644 src/include/storage/atomics-arch-arm.h
 create mode 100644 src/include/storage/atomics-generic-acc.h
 create mode 100644 src/include/storage/atomics-generic-gcc.h
 create mode 100644 src/include/storage/atomics-generic-msvc.h
 create mode 100644 src/include/storage/atomics-generic-sunpro.h
 create mode 100644 src/include/storage/atomics-generic-xlc.h
 create mode 100644 src/include/storage/atomics-generic.h
 create mode 100644 src/include/storage/atomics.h

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 4ba3236..ae92281 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -289,3 +289,80 @@ if test x"$Ac_cachevar" = x"yes"; then
 fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_PROG_CC_LDFLAGS_OPT
+
+AC_DEFUN([PGAC_HAVE_GCC__SYNC_INT32_ATOMICS],
+[AC_CACHE_CHECK(for __builtin_constant_p, pgac_cv__builtin_constant_p,
+[AC_TRY_COMPILE([static int x; static int y[__builtin_constant_p(x) ? x : 1];],
+[],
+[pgac_cv__builtin_constant_p=yes],
+[pgac_cv__builtin_constant_p=no])])
+if test x"$pgac_cv__builtin_constant_p" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CONSTANT_P, 1,
+          [Define to 1 if your compiler understands __builtin_constant_p.])
+fi])# PGAC_C_BUILTIN_CONSTANT_P
+
+# PGAC_HAVE_GCC__SYNC_INT32_TAS
+# -------------------------
+# Check if the C compiler understands __sync_lock_test_and_set(),
+# and define HAVE_GCC__SYNC_INT32_TAS
+#
+# NB: There are platforms where test_and_set is available but compare_and_swap
+# is not, so test this separa
+AC_DEFUN([PGAC_HAVE_GCC__SYNC_INT32_TAS],
+[AC_CACHE_CHECK(for builtin locking functions, pgac_cv_gcc_sync_int32_tas,
+[AC_TRY_LINK([],
+  [int lock = 0;
+   __sync_lock_test_and_set(&lock, 1);
+   __sync_lock_release(&lock);],
+  [pgac_cv_gcc_sync_int32_tas="yes"],
+  [pgac_cv_gcc_sync_int32_tas="no"])])
+if test x"$pgac_cv_gcc_sync_int32_tas" = x"yes"; then
+  AC_DEFINE(HAVE_GCC__SYNC_INT32_TAS, 1, [Define to 1 if you have __sync_lock_test_and_set(int *) and friends.])
+fi])# PGAC_HAVE_GCC__SYNC_INT32_TAS
+
+# PGAC_HAVE_GCC__SYNC_INT32_CAS
+# -------------------------
+# Check if the C compiler understands __sync_compare_and_swap() for 32bit
+# types, and define HAVE_GCC__SYNC_INT32_CAS if so.
+AC_DEFUN([PGAC_HAVE_GCC__SYNC_INT32_CAS],
+[AC_CACHE_CHECK(for builtin __sync int32 atomic operations, pgac_cv_gcc_sync_int32_cas,
+[AC_TRY_LINK([],
+  [int val = 0;
+   __sync_val_compare_and_swap(&val, 0, 37);],
+  [pgac_cv_gcc_sync_int32_cas="yes"],
+  [pgac_cv_gcc_sync_int32_cas="no"])])
+if test x"$pgac_cv_gcc_sync_int32_cas" = x"yes"; then
+  AC_DEFINE(HAVE_GCC__SYNC_INT32_CAS, 1, [Define to 1 if you have __sync_compare_and_swap(int *, int, int).])
+fi])# PGAC_HAVE_GCC__SYNC_INT32_CAS
+
+# PGAC_HAVE_GCC__SYNC_INT32_CAS
+# -------------------------
+# Check if the C compiler understands __sync_compare_and_swap() for 64bit
+# types, and define HAVE_GCC__SYNC_INT64_CAS if so.
+AC_DEFUN([PGAC_HAVE_GCC__SYNC_INT64_CAS],
+[AC_CACHE_CHECK(for builtin __sync int64 atomic operations, pgac_cv_gcc_sync_int64_cas,
+[AC_TRY_LINK([],
+  [PG_INT64_TYPE lock = 0;
+   __sync_val_compare_and_swap(&lock, 0, (PG_INT64_TYPE) 37);],
+  [pgac_cv_gcc_sync_int64_cas="yes"],
+  [pgac_cv_gcc_sync_int64_cas="no"])])
+if test x"$pgac_cv_gcc_sync_int64_cas" = x"yes"; then
+  AC_DEFINE(HAVE_GCC__SYNC_INT64_CAS, 1, [Define to 1 if you have __sync_compare_and_swap(int64 *, int64, int64).])
+fi])# PGAC_HAVE_GCC__SYNC_INT64_CAS
+
+# PGAC_HAVE_GCC__ATOMIC_INT32_CAS
+# -------------------------
+
+# Check if the C compiler understands __atomic_compare_exchange_n() for 32bit
+# types, and define HAVE_GCC__ATOMIC_INT32_CAS if so.
+AC_DEFUN([PGAC_HAVE_GCC__ATOMIC_INT32_CAS],
+[AC_CACHE_CHECK(for builtin __atomic int32 atomic operations, pgac_cv_gcc_atomic_int32_cas,
+[AC_TRY_LINK([],
+  [int val = 0;
+   int expect = 0;
+   __atomic_compare_exchange_n(&val, &expect, 37, 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);],
+  [pgac_cv_gcc_atomic_int32_cas="yes"],
+  [pgac_cv_gcc_atomic_int32_cas="no"])])
+if test x"$pgac_cv_gcc_atomic_int32_cas" = x"yes"; then
+  AC_DEFINE(HAVE_GCC__ATOMIC_INT32_CAS, 1, [Define to 1 if you have __atomic_compare_exchange_n(int *, int *, int).])
+fi])# PGAC_HAVE_GCC__ATOMIC_INT32_CAS
diff --git a/configure.in b/configure.in
index a4baeaf..43c663a 100644
--- a/configure.in
+++ b/configure.in
@@ -19,7 +19,7 @@ m4_pattern_forbid(^PGAC_)dnl to catch undefined macros
 
 AC_INIT([PostgreSQL], [9.4devel], [pgsql-bugs@postgresql.org])
 
-m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.63], [], [m4_fatal([Autoconf version 2.63 is required.
+m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.69], [], [m4_fatal([Autoconf version 2.63 is required.
 Untested combinations of 'autoconf' and PostgreSQL versions are not
 recommended.  You can remove the check from 'configure.in' but it is then
 your responsibility whether the result works or not.])])
@@ -1453,17 +1453,6 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
-AC_CACHE_CHECK([for builtin locking functions], pgac_cv_gcc_int_atomics,
-[AC_TRY_LINK([],
-  [int lock = 0;
-   __sync_lock_test_and_set(&lock, 1);
-   __sync_lock_release(&lock);],
-  [pgac_cv_gcc_int_atomics="yes"],
-  [pgac_cv_gcc_int_atomics="no"])])
-if test x"$pgac_cv_gcc_int_atomics" = x"yes"; then
-  AC_DEFINE(HAVE_GCC_INT_ATOMICS, 1, [Define to 1 if you have __sync_lock_test_and_set(int *) and friends.])
-fi
-
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
@@ -1738,6 +1727,12 @@ AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
 # C, but is missing on some old platforms.
 AC_CHECK_TYPES(sig_atomic_t, [], [], [#include <signal.h>])
 
+# Check for various atomic operations now that we have checked how to declare
+# 64bit integers.
+PGAC_HAVE_GCC__SYNC_INT32_TAS
+PGAC_HAVE_GCC__SYNC_INT32_CAS
+PGAC_HAVE_GCC__SYNC_INT64_CAS
+PGAC_HAVE_GCC__ATOMIC_INT32_CAS
 
 if test "$PORTNAME" != "win32"
 then
diff --git a/src/backend/storage/lmgr/Makefile b/src/backend/storage/lmgr/Makefile
index e12a854..be95d50 100644
--- a/src/backend/storage/lmgr/Makefile
+++ b/src/backend/storage/lmgr/Makefile
@@ -12,7 +12,7 @@ subdir = src/backend/storage/lmgr
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = lmgr.o lock.o proc.o deadlock.o lwlock.o spin.o s_lock.o predicate.o
+OBJS = atomics.o lmgr.o lock.o proc.o deadlock.o lwlock.o spin.o s_lock.o predicate.o
 
 include $(top_srcdir)/src/backend/common.mk
 
diff --git a/src/backend/storage/lmgr/atomics.c b/src/backend/storage/lmgr/atomics.c
new file mode 100644
index 0000000..a00f29a
--- /dev/null
+++ b/src/backend/storage/lmgr/atomics.c
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics.c
+ *	   Non-Inline parts of the atomics implementation
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/storage/lmgr/atomics.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+/*
+ * We want the functions below to be inline; but if the compiler doesn't
+ * support that, fall back on providing them as regular functions.	See
+ * STATIC_IF_INLINE in c.h.
+ */
+#define ATOMICS_INCLUDE_DEFINITIONS
+
+#include "storage/atomics.h"
diff --git a/src/include/c.h b/src/include/c.h
index 6e19c6d..347f394 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -866,6 +866,13 @@ typedef NameData *Name;
 #define STATIC_IF_INLINE
 #endif   /* PG_USE_INLINE */
 
+#ifdef PG_USE_INLINE
+#define STATIC_IF_INLINE_DECLARE static inline
+#else
+#define STATIC_IF_INLINE_DECLARE extern
+#endif   /* PG_USE_INLINE */
+
+
 /* ----------------------------------------------------------------
  *				Section 8:	random stuff
  * ----------------------------------------------------------------
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 5eac52d..3dd76a2 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -173,8 +173,17 @@
 /* Define to 1 if your compiler understands __FUNCTION__. */
 #undef HAVE_FUNCNAME__FUNCTION
 
+/* Define to 1 if you have __atomic_compare_exchange_n(int *, int *, int). */
+#undef HAVE_GCC__ATOMIC_INT32_CAS
+
+/* Define to 1 if you have __sync_compare_and_swap(int *, int, int). */
+#undef HAVE_GCC__SYNC_INT32_CAS
+
 /* Define to 1 if you have __sync_lock_test_and_set(int *) and friends. */
-#undef HAVE_GCC_INT_ATOMICS
+#undef HAVE_GCC__SYNC_INT32_TAS
+
+/* Define to 1 if you have __sync_compare_and_swap(int64 *, int64, int64). */
+#undef HAVE_GCC__SYNC_INT64_CAS
 
 /* Define to 1 if you have the `getaddrinfo' function. */
 #undef HAVE_GETADDRINFO
diff --git a/src/include/storage/atomics-arch-arm.h b/src/include/storage/atomics-arch-arm.h
new file mode 100644
index 0000000..a00f8f5
--- /dev/null
+++ b/src/include/storage/atomics-arch-arm.h
@@ -0,0 +1,29 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-arch-arm.h
+ *	  Atomic operations considerations specific to ARM
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * NOTES:
+ *
+ * src/include/storage/atomics-arch-arm.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/* intentionally no include guards, should only be included by atomics.h */
+#ifndef INSIDE_ATOMICS_H
+#error "should be included via atomics.h"
+#endif
+
+
+/*
+ * FIXME: Disable 32bit atomics for ARMs before v6 or error out
+ */
+
+/*
+ * 64 bit atomics on arm are implemented using kernel fallbacks and might be
+ * slow, so disable entirely for now.
+ */
+#define PG_DISABLE_64_BIT_ATOMICS
diff --git a/src/include/storage/atomics-generic-acc.h b/src/include/storage/atomics-generic-acc.h
new file mode 100644
index 0000000..680d621
--- /dev/null
+++ b/src/include/storage/atomics-generic-acc.h
@@ -0,0 +1,99 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-generic-acc.h
+ *	  Atomic operations support when using HPs acc on HPUX
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * NOTES:
+ *
+ * Documentation:
+ * * inline assembly for Itanium-based HP-UX:
+ *   http://h21007.www2.hp.com/portal/download/files/unprot/Itanium/inline_assem_ERS.pdf
+ * * Implementing Spinlocks on the Intel ® Itanium ® Architecture and PA-RISC
+ *   http://h21007.www2.hp.com/portal/download/files/unprot/itanium/spinlocks.pdf
+ *
+ * Itanium only supports a small set of numbers (6, -8, -4, -1, 1, 4, 8, 16)
+ * for atomic add/sub, so we just implement everything but compare_exchange via
+ * the compare_exchange fallbacks in atomics-generic.h.
+ *
+ * src/include/storage/atomics-generic-acc.h
+ *
+ * -------------------------------------------------------------------------
+ */
+
+#include <machine/sys/inline.h>
+
+/* IA64 always has 32/64 bit atomics */
+
+typedef struct atomic_uint32
+{
+	volatile uint32 value;
+} atomic_uint32;
+
+typedef atomic_uint32 atomic_flag;
+
+typedef struct atomic_uint64
+{
+	volatile uint64 value;
+} atomic_uint64;
+
+/* declare outside the defined(PG_USE_INLINE) for visibility */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+
+
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+#define MINOR_FENCE (_Asm_fence) (_UP_CALL_FENCE | _UP_SYS_FENCE | \
+								 _DOWN_CALL_FENCE | _DOWN_SYS_FENCE )
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
+									uint32 *expected, uint32 newval)
+{
+	bool	ret;
+	uint32	current;
+
+	_Asm_mov_to_ar(_AREG_CCV, *expected, MINOR_FENCE);
+	/*
+	 * We want a barrier, not just release/acquire semantics.
+	 */
+	_Asm_mf();
+	/*
+	 * Notes:
+	 * DOWN_MEM_FENCE | _UP_MEM_FENCE prevents reordering by the compiler
+	 */
+	current =  _Asm_cmpxchg(_SZ_W, /* word */
+							_SEM_REL,
+							&ptr->value,
+							newval, _LDHINT_NONE,
+							_DOWN_MEM_FENCE | _UP_MEM_FENCE);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64_impl(volatile atomic_uint64 *ptr,
+									uint64 *expected, uint64 newval)
+{
+	bool	ret;
+	uint64	current;
+
+	_Asm_mov_to_ar(_AREG_CCV, *expected, MINOR_FENCE);
+	_Asm_mf();
+	current =  _Asm_cmpxchg(_SZ_D, /* doubleword */
+							_SEM_REL,
+							&ptr->value,
+							newval, _LDHINT_NONE,
+							_DOWN_MEM_FENCE | _UP_MEM_FENCE);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+#undef MINOR_FENCE
+
+#endif /* defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS) */
diff --git a/src/include/storage/atomics-generic-gcc.h b/src/include/storage/atomics-generic-gcc.h
new file mode 100644
index 0000000..323c385
--- /dev/null
+++ b/src/include/storage/atomics-generic-gcc.h
@@ -0,0 +1,154 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-generic-gcc.h
+ *	  Atomic operations, implemented using gcc (or compatible) intrinsics.
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * NOTES:
+ *
+ * Documentation:
+ * * Legacy __sync Built-in Functions for Atomic Memory Access
+ *   http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/_005f_005fsync-Builtins.html
+ * * Built-in functions for memory model aware atomic operations
+ *   http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/_005f_005fatomic-Builtins.html
+ *
+ * src/include/storage/atomics-generic-gcc.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/* intentionally no include guards, should only be included by atomics.h */
+#ifndef INSIDE_ATOMICS_H
+#error "should be included via atomics.h"
+#endif
+
+/* declare outside the defined(PG_USE_INLINE) for visibility */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+
+typedef struct atomic_uint32
+{
+	volatile uint32 value;
+} atomic_uint32;
+
+typedef atomic_uint32 atomic_flag;
+
+#ifdef HAVE_GCC__ATOMIC_INT64_CAS
+/* declare outside the defined(PG_USE_INLINE) for visibility */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+typedef struct atomic_uint64
+{
+	volatile uint64 value;
+} atomic_uint64;
+#endif
+
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+
+
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+#define PG_HAVE_ATOMIC_TEST_SET_FLAG
+STATIC_IF_INLINE bool
+pg_atomic_test_set_flag_impl(volatile atomic_flag *ptr)
+{
+	uint32 ret;
+	/* some platform only support a 1 here */
+	ret = __sync_lock_test_and_set(&ptr->value, 1);
+	/* XXX: only a acquire barrier, make it a full one for now */
+	pg_memory_barrier();
+	return ret != 0;
+}
+
+#define PG_HAVE_ATOMIC_CLEAR_FLAG
+STATIC_IF_INLINE void
+pg_atomic_clear_flag_impl(volatile atomic_flag *ptr)
+{
+	__sync_lock_release(&ptr->value);
+	/* XXX: only a release barrier, make it a full one for now */
+	pg_memory_barrier();
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_read_u32(volatile atomic_uint32 *ptr)
+{
+	return ptr->value;
+}
+
+STATIC_IF_INLINE void
+pg_atomic_write_u32(volatile atomic_uint32 *ptr, uint32 val)
+{
+	ptr->value = val;
+}
+
+#if defined(HAVE_GCC__ATOMIC_INT32_CAS)
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
+									uint32 *expected, uint32 newval)
+{
+	return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
+									   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+}
+
+#elif defined(HAVE_GCC__SYNC_INT32_CAS)
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
+									uint32 *expected, uint32 newval)
+{
+	bool	ret;
+	uint32	current;
+	current = __sync_val_compare_and_swap(&ptr->value, *expected, newval);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+#else
+#	error "strange configuration"
+#endif
+
+#ifdef HAVE_GCC__ATOMIC_INT64_CAS
+
+STATIC_IF_INLINE uint64
+pg_atomic_read_u64(volatile atomic_uint64 *ptr)
+{
+	return ptr->value;
+}
+
+STATIC_IF_INLINE void
+pg_atomic_write_u64(volatile atomic_uint64 *ptr, uint64 val)
+{
+	ptr->value = val;
+}
+
+/* if __atomic is supported for 32 it's also supported for 64bit */
+#if defined(HAVE_GCC__ATOMIC_INT32_CAS)
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64_impl(volatile atomic_uint64 *ptr,
+									uint64 *expected, uint64 newval)
+{
+	return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
+									   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+}
+
+#elif defined(HAVE_GCC__SYNC_INT64_CAS)
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64_impl(volatile atomic_uint64 *ptr,
+									uint64 *expected, uint64 newval)
+{
+	bool	ret;
+	uint64	current;
+	current = __sync_val_compare_and_swap(&ptr->value, *expected, newval);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+#else
+#	error "strange configuration"
+#endif
+
+#endif /* HAVE_GCC__ATOMIC_INT64_CAS */
+
+#endif /* defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS) */
diff --git a/src/include/storage/atomics-generic-msvc.h b/src/include/storage/atomics-generic-msvc.h
new file mode 100644
index 0000000..1fa17f3
--- /dev/null
+++ b/src/include/storage/atomics-generic-msvc.h
@@ -0,0 +1,58 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-generic-msvc.h
+ *	  Atomic operations support when using MSVC
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * NOTES:
+ *
+ * Documentation:
+ * * Interlocked Variable Access
+ *   http://msdn.microsoft.com/en-us/library/ms684122%28VS.85%29.aspx
+ *
+ * src/include/storage/atomics-generic-msvc.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/* intentionally no include guards, should only be included by atomics.h */
+#ifndef INSIDE_ATOMICS_H
+#error "should be included via atomics.h"
+#endif
+
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
+									uint32 *expected, uint32 newval)
+{
+	bool	ret;
+	uint32	current;
+	current = InterlockedCompareExchange(&ptr->value, newval, expected);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+/*
+ * FIXME: Only available on Vista/2003 onwards, add test for that.
+ *
+ * Only supported on >486, but we require XP as a minimum baseline, which
+ * doesn't support the 486, so we don't need to add checks for that case.
+ */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64_impl(volatile atomic_uint64 *ptr,
+									uint64 *expected, uint64 newval)
+{
+	bool	ret;
+	uint32	current;
+	current = InterlockedCompareExchange64(&ptr->value, newval, expected);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+#endif /* defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS) */
diff --git a/src/include/storage/atomics-generic-sunpro.h b/src/include/storage/atomics-generic-sunpro.h
new file mode 100644
index 0000000..c71014b
--- /dev/null
+++ b/src/include/storage/atomics-generic-sunpro.h
@@ -0,0 +1,66 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-generic-sunpro.h
+ *	  Atomic operations for solaris' CC
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * NOTES:
+ *
+ * Documentation:
+ * * manpage for atomic_cas(3C)
+ *   http://www.unix.com/man-page/opensolaris/3c/atomic_cas/
+ *   http://docs.oracle.com/cd/E23824_01/html/821-1465/atomic-cas-3c.html
+ *
+ * src/include/storage/atomics-generic-sunpro.h
+ *
+ * -------------------------------------------------------------------------
+ */
+
+#include <atomic.h>
+
+typedef struct atomic_uint32
+{
+	volatile uint32 value;
+} atomic_uint32;
+
+typedef atomic_uint32 atomic_flag;
+
+typedef struct atomic_uint64
+{
+	volatile uint64 value;
+} atomic_uint64;
+
+/* declare outside the defined(PG_USE_INLINE) for visibility */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
+									uint32 *expected, uint32 newval)
+{
+	bool	ret;
+	uint64	current;
+
+	current = atomic_cas_32(&ptr->value, *expected, newval);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64_impl(volatile atomic_uint64 *ptr,
+									uint64 *expected, uint64 newval)
+{
+	bool	ret;
+	uint64	current;
+
+	current = atomic_cas_64(&ptr->value, *expected, newval);
+	ret = current == *expected;
+	*expected = current;
+	return ret;
+}
+
+#endif
diff --git a/src/include/storage/atomics-generic-xlc.h b/src/include/storage/atomics-generic-xlc.h
new file mode 100644
index 0000000..c3b751c
--- /dev/null
+++ b/src/include/storage/atomics-generic-xlc.h
@@ -0,0 +1,84 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-generic-xlc.h
+ *	  Atomic operations for IBM's CC
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * NOTES:
+ *
+ * Documentation:
+ * * Synchronization and atomic built-in functions
+ *   http://publib.boulder.ibm.com/infocenter/lnxpcomp/v8v101/topic/com.ibm.xlcpp8l.doc/compiler/ref/bif_sync.htm
+ *
+ * src/include/storage/atomics-generic-xlc.h
+ *
+ * -------------------------------------------------------------------------
+ */
+
+#include <atomic.h>
+
+typedef struct atomic_uint32
+{
+	volatile uint32 value;
+} atomic_uint32;
+
+typedef atomic_uint32 atomic_flag;
+
+typedef struct atomic_uint64
+{
+	volatile uint64 value;
+} atomic_uint64;
+
+/* declare outside the defined(PG_USE_INLINE) for visibility */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+
+/* FIXME: check for 64bit mode, only supported there */
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
+									uint32 *expected, uint32 newval)
+{
+	bool	ret;
+
+	/*
+	 * xlc's documentation tells us:
+	 * "If __compare_and_swap is used as a locking primitive, insert a call to
+	 * the __isync built-in function at the start of any critical sections."
+	 */
+	__isync();
+
+	/*
+	 * XXX: __compare_and_swap is defined to take signed parameters, but that
+	 * shouldn't matter since we don't perform any arithmetic operations.
+	 *
+	 * __compare_and_swap returns whether the swap was performed; on failure
+	 * it stores the current value into *expected, which is what we need.
+	 */
+	ret = __compare_and_swap((volatile int*)&ptr->value,
+							 (int *)expected, (int)newval);
+	return ret;
+}
+
+#ifdef PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64_impl(volatile atomic_uint64 *ptr,
+									uint64 *expected, uint64 newval)
+{
+	bool	ret;
+
+	__isync();
+
+	ret = __compare_and_swaplp((volatile long*)&ptr->value,
+							   (long *)expected, (long)newval);
+	return ret;
+}
+#endif /* defined(PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64) */
+
+#endif
diff --git a/src/include/storage/atomics-generic.h b/src/include/storage/atomics-generic.h
new file mode 100644
index 0000000..687dca6
--- /dev/null
+++ b/src/include/storage/atomics-generic.h
@@ -0,0 +1,290 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics-generic.h
+ *	  Atomic operations.
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * src/include/storage/atomics-generic.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/* intentionally no include guards, should only be included by atomics.h */
+#ifndef INSIDE_ATOMICS_H
+#error "should be included via atomics.h"
+#endif
+
+#ifndef PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+#error "postgres requires 32bit CAS to be provided"
+#endif
+
+/* provide fallback */
+#ifndef PG_HAVE_ATOMIC_TEST_SET_FLAG
+typedef atomic_uint32 atomic_flag;
+#endif
+
+#if !defined(PG_DISABLE_64_BIT_ATOMICS) && defined(PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64)
+#define PG_HAVE_64_BIT_ATOMICS
+#endif
+
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+#ifndef PG_HAVE_ATOMIC_TEST_SET_FLAG
+/* provide compare_exchange based TAS */
+#define PG_HAVE_ATOMIC_TEST_SET_FLAG
+
+STATIC_IF_INLINE bool
+pg_atomic_test_set_flag_impl(volatile atomic_flag *ptr)
+{
+	uint32 value = 0;
+	pg_atomic_compare_exchange_u32_impl(ptr, &value, 1);
+	return value == 0;
+}
+
+#define PG_HAVE_ATOMIC_CLEAR_FLAG
+STATIC_IF_INLINE void
+pg_atomic_clear_flag_impl(volatile atomic_flag *ptr)
+{
+	/* need release semantics: barrier before the flag is actually cleared */
+	pg_memory_barrier(); /* FIXME: recursive if no barriers provided */
+	pg_atomic_write_u32_impl(ptr, 0);
+}
+#endif /* !defined(PG_HAVE_ATOMIC_TEST_SET_FLAG) */
+
+#ifndef PG_HAVE_ATOMIC_INIT_FLAG
+#define PG_HAVE_ATOMIC_INIT_FLAG
+STATIC_IF_INLINE void
+pg_atomic_init_flag_impl(volatile atomic_flag *ptr)
+{
+	pg_atomic_clear_flag_impl(ptr);
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_INIT_U32
+#define PG_HAS_ATOMIC_INIT_U32
+STATIC_IF_INLINE void
+pg_atomic_init_u32_impl(volatile atomic_uint32 *ptr, uint32 val_)
+{
+	pg_atomic_write_u32(ptr, val_);
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_EXCHANGE_U32
+#define PG_HAS_ATOMIC_EXCHANGE_U32
+STATIC_IF_INLINE uint32
+pg_atomic_exchange_u32_impl(volatile atomic_uint32 *ptr, uint32 xchg_)
+{
+	uint32 old;
+	while (true)
+	{
+		old = pg_atomic_read_u32(ptr);
+		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, xchg_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_ADD_U32
+#define PG_HAS_ATOMIC_FETCH_ADD_U32
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_add_u32_impl(volatile atomic_uint32 *ptr, uint32 add_)
+{
+	uint32 old;
+	while (true)
+	{
+		old = pg_atomic_read_u32(ptr);
+		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old + add_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_SUB_U32
+#define PG_HAS_ATOMIC_FETCH_SUB_U32
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_sub_u32_impl(volatile atomic_uint32 *ptr, uint32 sub_)
+{
+	uint32 old;
+	while (true)
+	{
+		old = pg_atomic_read_u32(ptr);
+		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old - sub_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_AND_U32
+#define PG_HAS_ATOMIC_FETCH_AND_U32
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_and_u32_impl(volatile atomic_uint32 *ptr, uint32 and_)
+{
+	uint32 old;
+	while (true)
+	{
+		old = pg_atomic_read_u32(ptr);
+		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old & and_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_OR_U32
+#define PG_HAS_ATOMIC_FETCH_OR_U32
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_or_u32_impl(volatile atomic_uint32 *ptr, uint32 or_)
+{
+	uint32 old;
+	while (true)
+	{
+		old = pg_atomic_read_u32(ptr);
+		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old | or_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_ADD_FETCH_U32
+#define PG_HAS_ATOMIC_ADD_FETCH_U32
+STATIC_IF_INLINE uint32
+pg_atomic_add_fetch_u32_impl(volatile atomic_uint32 *ptr, uint32 add_)
+{
+	return pg_atomic_fetch_add_u32_impl(ptr, add_) + add_;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_SUB_FETCH_U32
+#define PG_HAS_ATOMIC_SUB_FETCH_U32
+STATIC_IF_INLINE uint32
+pg_atomic_sub_fetch_u32_impl(volatile atomic_uint32 *ptr, uint32 sub_)
+{
+	return pg_atomic_fetch_sub_u32_impl(ptr, sub_) - sub_;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_ADD_UNTIL_U32
+#define PG_HAS_ATOMIC_FETCH_ADD_UNTIL_U32
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_add_until_u32_impl(volatile atomic_uint32 *ptr, uint32 add_, uint32 until)
+{
+	uint32 old;
+	while (true)
+	{
+		uint32 new_val;
+		old = pg_atomic_read_u32(ptr);
+		if (old >= until)
+			break;
+
+		new_val = old + add_;
+		if (new_val > until)
+			new_val = until;
+
+		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, new_val))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifdef PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+
+#ifndef PG_HAS_ATOMIC_INIT_U64
+#define PG_HAS_ATOMIC_INIT_U64
+STATIC_IF_INLINE void
+pg_atomic_init_u64_impl(volatile atomic_uint64 *ptr, uint64 val_)
+{
+	pg_atomic_write_u64(ptr, val_);
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_ADD_U64
+#define PG_HAS_ATOMIC_FETCH_ADD_U64
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_add_u64_impl(volatile atomic_uint64 *ptr, uint64 add_)
+{
+	uint64 old;
+	while (true)
+	{
+		old = pg_atomic_read_u64(ptr);
+		if (pg_atomic_compare_exchange_u64_impl(ptr, &old, old + add_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_SUB_U64
+#define PG_HAS_ATOMIC_FETCH_SUB_U64
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_sub_u64_impl(volatile atomic_uint64 *ptr, uint64 sub_)
+{
+	uint64 old;
+	while (true)
+	{
+		old = pg_atomic_read_u64(ptr);
+		if (pg_atomic_compare_exchange_u64_impl(ptr, &old, old - sub_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_AND_U64
+#define PG_HAS_ATOMIC_FETCH_AND_U64
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_and_u64_impl(volatile atomic_uint64 *ptr, uint64 and_)
+{
+	uint64 old;
+	while (true)
+	{
+		old = pg_atomic_read_u64(ptr);
+		if (pg_atomic_compare_exchange_u64_impl(ptr, &old, old & and_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_FETCH_OR_U64
+#define PG_HAS_ATOMIC_FETCH_OR_U64
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_or_u64_impl(volatile atomic_uint64 *ptr, uint64 or_)
+{
+	uint64 old;
+	while (true)
+	{
+		old = pg_atomic_read_u64(ptr);
+		if (pg_atomic_compare_exchange_u64_impl(ptr, &old, old | or_))
+			break;
+	}
+	return old;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_ADD_FETCH_U64
+#define PG_HAS_ATOMIC_ADD_FETCH_U64
+STATIC_IF_INLINE uint64
+pg_atomic_add_fetch_u64_impl(volatile atomic_uint64 *ptr, uint64 add_)
+{
+	return pg_atomic_fetch_add_u64_impl(ptr, add_) + add_;
+}
+#endif
+
+#ifndef PG_HAS_ATOMIC_SUB_FETCH_U64
+#define PG_HAS_ATOMIC_SUB_FETCH_U64
+STATIC_IF_INLINE uint64
+pg_atomic_sub_fetch_u64_impl(volatile atomic_uint64 *ptr, uint64 sub_)
+{
+	return pg_atomic_fetch_sub_u64_impl(ptr, sub_) - sub_;
+}
+#endif
+
+#endif /* PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64 */
+
+#endif /* defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS) */
diff --git a/src/include/storage/atomics.h b/src/include/storage/atomics.h
new file mode 100644
index 0000000..574fc50c
--- /dev/null
+++ b/src/include/storage/atomics.h
@@ -0,0 +1,415 @@
+/*-------------------------------------------------------------------------
+ *
+ * atomics.h
+ *	  Atomic operations.
+ *
+ * Portions Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * src/include/storage/atomics.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef ATOMICS_H
+#define ATOMICS_H
+
+#include "storage/barrier.h"
+
+#define INSIDE_ATOMICS_H
+
+/*
+ * Architecture specific implementations/considerations.
+ */
+#if defined(__arm__) || defined(__arm) || \
+	defined(__aarch64__) || defined(__aarch64)
+#	include "storage/atomics-arch-arm.h"
+#endif
+
+/*
+ * Compiler specific, but architecture independent implementations
+ */
+/* gcc or compatible */
+#if defined(HAVE_GCC__ATOMIC_INT32_CAS) || defined(HAVE_GCC__SYNC_INT32_CAS)
+#	include "storage/atomics-generic-gcc.h"
+/* msvc FIXME: check against mingw */
+#elif defined(WIN32)
+#	include "storage/atomics-generic-msvc.h"
+#elif defined(__hpux) && defined(__ia64) && !defined(__GNUC__)
+#	include "storage/atomics-generic-acc.h"
+#elif defined(__SUNPRO_C) && !defined(__GNUC__)
+#	include "storage/atomics-generic-sunpro.h"
+#endif
+
+/*
+ * Provide fallback implementations for operations that aren't provided if
+ * possible.
+ */
+#include "storage/atomics-generic.h"
+
+/*
+ * pg_atomic_init_flag - initialize lock
+ *
+ * No barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_init_flag(volatile atomic_flag *ptr);
+
+/*
+ * pg_atomic_test_and_set_flag - TAS()
+ *
+ * Full barrier semantics. (XXX? Only acquire?)
+ */
+STATIC_IF_INLINE_DECLARE bool
+pg_atomic_test_set_flag(volatile atomic_flag *ptr);
+
+/*
+ * pg_atomic_clear_flag - release lock set by TAS()
+ *
+ * Full barrier semantics. (XXX? Only release?)
+ */
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_clear_flag(volatile atomic_flag *ptr);
+
+/*
+ * pg_atomic_init_u32 - initialize atomic variable
+ *
+ * Has to be done before any usage except when the atomic variable is declared
+ * statically in which case the variable is initialized to 0.
+ *
+ * No barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_init_u32(volatile atomic_uint32 *ptr, uint32 val);
+
+/*
+ * pg_atomic_read_u32 - read from atomic variable
+ *
+ * No barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_read_u32(volatile atomic_uint32 *ptr);
+
+/*
+ * pg_atomic_write_u32 - write to atomic variable
+ *
+ * No barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_write_u32(volatile atomic_uint32 *ptr, uint32 val);
+
+/*
+ * pg_atomic_exchange_u32 - exchange newval with current value
+ *
+ * Returns the old value of 'ptr' before the swap.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_exchange_u32(volatile atomic_uint32 *ptr, uint32 newval);
+
+/*
+ * pg_atomic_compare_exchange_u32 - CAS operation
+ *
+ * Atomically compare the current value of *ptr with *expected and store newval
+ * iff *ptr and *expected have the same value. The current value of *ptr will
+ * always be stored in *expected.
+ *
+ * Return whether the values have been exchanged.
+ *
+ * Full barrier semantics.
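+ *
+ * A typical retry loop, as an illustrative sketch only (UPDATE_FLAG and
+ * 'var' are made-up names, not part of this patch); on failure 'expected'
+ * has already been refreshed with the current value, so the loop can simply
+ * retry:
+ *
+ *		uint32 expected = pg_atomic_read_u32(&var);
+ *		while (!pg_atomic_compare_exchange_u32(&var, &expected,
+ *											   expected | UPDATE_FLAG))
+ *			;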
+ */
+STATIC_IF_INLINE_DECLARE bool
+pg_atomic_compare_exchange_u32(volatile atomic_uint32 *ptr,
+							   uint32 *expected, uint32 newval);
+
+/*
+ * pg_atomic_fetch_add_u32 - atomically add to variable
+ *
+ * Returns the value of ptr before the arithmetic operation.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_fetch_add_u32(volatile atomic_uint32 *ptr, uint32 add_);
+
+/*
+ * pg_atomic_fetch_sub_u32 - atomically subtract from variable
+ *
+ * Returns the value of ptr before the arithmetic operation.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_fetch_sub_u32(volatile atomic_uint32 *ptr, uint32 sub_);
+
+/*
+ * pg_atomic_fetch_and_u32 - atomically bit-and and_ with variable
+ *
+ * Returns the value of ptr before the arithmetic operation.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_fetch_and_u32(volatile atomic_uint32 *ptr, uint32 and_);
+
+/*
+ * pg_atomic_fetch_or_u32 - atomically bit-or or_ with variable
+ *
+ * Returns the value of ptr before the arithmetic operation.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_fetch_or_u32(volatile atomic_uint32 *ptr, uint32 or_);
+
+/*
+ * pg_atomic_add_fetch_u32 - atomically add to variable
+ *
+ * Returns the value of ptr after the arithmetic operation.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_add_fetch_u32(volatile atomic_uint32 *ptr, uint32 add_);
+
+/*
+ * pg_atomic_sub_fetch_u32 - atomically subtract from variable
+ *
+ * Returns the value of ptr after the arithmetic operation.
+ *
+ * Full barrier semantics.
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_sub_fetch_u32(volatile atomic_uint32 *ptr, uint32 sub_);
+
+/*
+ * pg_atomic_fetch_add_until_u32 - saturated addition to variable
+ *
+ * Returns the value of ptr before the arithmetic operation.
+ *
+ * Full barrier semantics.
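+ *
+ * For example (illustrative only): pg_atomic_fetch_add_until_u32(&v, 10, 100)
+ * adds 10 to v, but never advances it past 100; once v has reached 100 the
+ * call leaves it unchanged.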
+ */
+STATIC_IF_INLINE_DECLARE uint32
+pg_atomic_fetch_add_until_u32(volatile atomic_uint32 *ptr, uint32 add_,
+							  uint32 until);
+
+/* ----
+ * The 64 bit operations have the same semantics as their 32bit counterparts if
+ * they are available.
+ * ---
+ */
+#ifdef PG_HAVE_64_BIT_ATOMICS
+
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_init_u64(volatile atomic_uint64 *ptr, uint64 val_);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_read_u64(volatile atomic_uint64 *ptr);
+
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_write_u64(volatile atomic_uint64 *ptr, uint64 val);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_exchange_u64(volatile atomic_uint64 *ptr, uint64 newval);
+
+STATIC_IF_INLINE_DECLARE bool
+pg_atomic_compare_exchange_u64(volatile atomic_uint64 *ptr,
+							   uint64 *expected, uint64 newval);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_fetch_add_u64(volatile atomic_uint64 *ptr, uint64 add_);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_fetch_sub_u64(volatile atomic_uint64 *ptr, uint64 sub_);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_fetch_and_u64(volatile atomic_uint64 *ptr, uint64 and_);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_fetch_or_u64(volatile atomic_uint64 *ptr, uint64 or_);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_add_fetch_u64(volatile atomic_uint64 *ptr, uint64 add_);
+
+STATIC_IF_INLINE_DECLARE uint64
+pg_atomic_sub_fetch_u64(volatile atomic_uint64 *ptr, uint64 sub_);
+
+#endif /* PG_HAVE_64_BIT_ATOMICS */
+
+/*
+ * The following functions are wrapper functions around the platform specific
+ * implementation of the atomic operations performing common checks.
+ */
+#if defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS)
+
+#define CHECK_POINTER_ALIGNMENT(ptr, bndr) \
+	Assert(TYPEALIGN(bndr, (uintptr_t)(ptr)) == (uintptr_t)(ptr))
+
+STATIC_IF_INLINE_DECLARE bool
+pg_atomic_test_set_flag(volatile atomic_flag *ptr)
+{
+	return pg_atomic_test_set_flag_impl(ptr);
+}
+
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_clear_flag(volatile atomic_flag *ptr)
+{
+	pg_atomic_clear_flag_impl(ptr);
+}
+
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_init_flag(volatile atomic_flag *ptr)
+{
+	pg_atomic_init_flag_impl(ptr);
+}
+
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_init_u32(volatile atomic_uint32 *ptr, uint32 val)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+
+	pg_atomic_init_u32_impl(ptr, val);
+}
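+
+/*
+ * Plain read/write wrappers for u32, mirroring the u64 variants further
+ * down: simple loads/stores of the underlying value, no barrier semantics.
+ */
+STATIC_IF_INLINE uint32
+pg_atomic_read_u32(volatile atomic_uint32 *ptr)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return ptr->value;
+}
+
+STATIC_IF_INLINE void
+pg_atomic_write_u32(volatile atomic_uint32 *ptr, uint32 val)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	ptr->value = val;
+}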
+
+STATIC_IF_INLINE uint32
+pg_atomic_exchange_u32(volatile atomic_uint32 *ptr, uint32 newval)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+
+	return pg_atomic_exchange_u32_impl(ptr, newval);
+}
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u32(volatile atomic_uint32 *ptr,
+							   uint32 *expected, uint32 newval)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	CHECK_POINTER_ALIGNMENT(expected, 4);
+
+	return pg_atomic_compare_exchange_u32_impl(ptr, expected, newval);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_add_u32(volatile atomic_uint32 *ptr, uint32 add_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_fetch_add_u32_impl(ptr, add_);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_sub_u32(volatile atomic_uint32 *ptr, uint32 sub_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_fetch_sub_u32_impl(ptr, sub_);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_and_u32(volatile atomic_uint32 *ptr, uint32 and_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_fetch_and_u32_impl(ptr, and_);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_or_u32(volatile atomic_uint32 *ptr, uint32 or_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_fetch_or_u32_impl(ptr, or_);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_add_fetch_u32(volatile atomic_uint32 *ptr, uint32 add_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_add_fetch_u32_impl(ptr, add_);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_sub_fetch_u32(volatile atomic_uint32 *ptr, uint32 sub_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_sub_fetch_u32_impl(ptr, sub_);
+}
+
+STATIC_IF_INLINE uint32
+pg_atomic_fetch_add_until_u32(volatile atomic_uint32 *ptr, uint32 add_,
+							  uint32 until)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 4);
+	return pg_atomic_fetch_add_until_u32_impl(ptr, add_, until);
+}
+
+#ifdef PG_HAVE_64_BIT_ATOMICS
+
+STATIC_IF_INLINE_DECLARE void
+pg_atomic_init_u64(volatile atomic_uint64 *ptr, uint64 val)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+
+	pg_atomic_init_u64_impl(ptr, val);
+}
+
+STATIC_IF_INLINE uint64
+pg_atomic_read_u64(volatile atomic_uint64 *ptr)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	return ptr->value;
+}
+
+STATIC_IF_INLINE void
+pg_atomic_write_u64(volatile atomic_uint64 *ptr, uint64 val)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	ptr->value = val;
+}
+
+STATIC_IF_INLINE uint64
+pg_atomic_exchange_u64(volatile atomic_uint64 *ptr, uint64 newval)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+
+	return pg_atomic_exchange_u64_impl(ptr, newval);
+}
+
+STATIC_IF_INLINE bool
+pg_atomic_compare_exchange_u64(volatile atomic_uint64 *ptr,
+							   uint64 *expected, uint64 newval)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	CHECK_POINTER_ALIGNMENT(expected, 8);
+	return pg_atomic_compare_exchange_u64_impl(ptr, expected, newval);
+}
+
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_add_u64(volatile atomic_uint64 *ptr, uint64 add_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	return pg_atomic_fetch_add_u64_impl(ptr, add_);
+}
+
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_sub_u64(volatile atomic_uint64 *ptr, uint64 sub_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	return pg_atomic_fetch_sub_u64_impl(ptr, sub_);
+}
+
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_and_u64(volatile atomic_uint64 *ptr, uint64 and_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	return pg_atomic_fetch_and_u64_impl(ptr, and_);
+}
+
+STATIC_IF_INLINE uint64
+pg_atomic_fetch_or_u64(volatile atomic_uint64 *ptr, uint64 or_)
+{
+	CHECK_POINTER_ALIGNMENT(ptr, 8);
+	return pg_atomic_fetch_or_u64_impl(ptr, or_);
+}
+#endif /* PG_HAVE_64_BIT_ATOMICS */
+
+#endif /* defined(PG_USE_INLINE) || defined(ATOMICS_INCLUDE_DEFINITIONS) */
+
+#undef INSIDE_ATOMICS_H
+
+#endif /* ATOMICS_H */
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 7dcd5d9..203e1f6 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -315,7 +315,7 @@ tas(volatile slock_t *lock)
 
 #define TAS(lock) tas(lock)
 
-#ifdef HAVE_GCC_INT_ATOMICS
+#ifdef HAVE_GCC__SYNC_INT32_TAS
 
 typedef int slock_t;
 
@@ -327,7 +327,7 @@ tas(volatile slock_t *lock)
 
 #define S_UNLOCK(lock) __sync_lock_release(lock)
 
-#else /* !HAVE_GCC_INT_ATOMICS */
+#else /* !HAVE_GCC__SYNC_INT32_TAS */
 
 typedef unsigned char slock_t;
 
@@ -344,7 +344,7 @@ tas(volatile slock_t *lock)
 	return (int) _res;
 }
 
-#endif	 /* HAVE_GCC_INT_ATOMICS */
+#endif	 /* HAVE_GCC__SYNC_INT32_TAS */
 #endif	 /* __arm__ */
 
 
@@ -352,7 +352,7 @@ tas(volatile slock_t *lock)
  * On ARM64, we use __sync_lock_test_and_set(int *, int) if available.
  */
 #if defined(__aarch64__) || defined(__aarch64)
-#ifdef HAVE_GCC_INT_ATOMICS
+#ifdef HAVE_GCC__SYNC_INT32_TAS
 #define HAS_TEST_AND_SET
 
 #define TAS(lock) tas(lock)
@@ -367,7 +367,7 @@ tas(volatile slock_t *lock)
 
 #define S_UNLOCK(lock) __sync_lock_release(lock)
 
-#endif	 /* HAVE_GCC_INT_ATOMICS */
+#endif	 /* HAVE_GCC__SYNC_INT32_TAS */
 #endif	 /* __aarch64__ */
 
 
-- 
1.8.3.251.g1462b67

#17Andres Freund
andres@2ndquadrant.com
In reply to: Ants Aasma (#15)

On 2013-11-06 17:47:45 +0200, Ants Aasma wrote:

On Wed, Nov 6, 2013 at 4:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:

FWIW, I find the requirement for atomic_init() really, really
annoying. Not that that will change anything ;)

Me too, but I guess this allows them to emulate atomics on platforms
that don't have native support. Whether that is a good idea is a
separate question.

Since static initialization is supported that would require quite some
dirty hacks, but well, we're talking about compiler makers ;)

I don't think there really exist any interesting ones? I am using my
lwlock reimplementation as a testbed so far.

BufferDesc management in src/backend/storage/buffer/bufmgr.c.
The unlocked reads that are done from various procarray variables.
The XLogCtl accesses in xlog.c.

Those are actually only modified under spinlocks afair, so there isn't
too much interesting happening. But yes, it'd be better if we used more
explicit means to manipulate/query them.

The seqlock like thing used to provide consistent reads from
PgBackendStatus src/backend/postmaster/pgstat.c (this code looks like
it's broken on weakly ordered machines)

Whoa. Never noticed that bit. The consequences of races are quite low,
but still...
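
For reference, the barriers that seem to be missing would have to go roughly
like this - just a sketch, not actual pgstat.c code; st_changecount is the
existing PgBackendStatus field, pg_read_barrier()/pg_write_barrier() are the
ones from barrier.h:

/* writer */
beentry->st_changecount++;
pg_write_barrier();		/* order the opening bump before the payload writes */
/* ... update the PgBackendStatus fields ... */
pg_write_barrier();		/* order the payload writes before the closing bump */
beentry->st_changecount++;

/* reader */
PgBackendStatus local;
int			before, after;

do
{
	before = beentry->st_changecount;
	pg_read_barrier();
	memcpy(&local, (const void *) beentry, sizeof(local));
	pg_read_barrier();
	after = beentry->st_changecount;
} while (before != after || (before & 1));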

IMO the volatile keyword should be confined to the code actually
implementing atomics, locking. The "use volatile pointer to prevent
code rearrangement" comment is evil. If there are data races in the
code, they need comments pointing this out and probably memory
barriers too. If there are no data races, then the volatile
declaration is useless and a pessimization. Currently it's more of a
catch all "here be dragons" declaration for data structures.

Agreed. But I don't think we can replace them all at once. Or well, I
won't ;)
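
Just to illustrate the direction - a made-up sketch, not code from the tree
(SomeShared and 'flag' are invented names):

/* today: volatile-qualified pointer as a blunt reordering deterrent */
volatile SomeShared *vptr = shared;
vptr->flag = true;

/* instead: plain accesses, plus an explicit barrier where ordering matters */
shared->flag = true;
pg_write_barrier();		/* publish the flag before any subsequent action */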

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Andres Freund
andres@2ndquadrant.com
In reply to: Andres Freund (#16)
1 attachment(s)
Re: better atomics - require PG_USE_INLINE support?

On 2013-11-06 17:24:57 +0100, Andres Freund wrote:

Questions:
* Do we really want to continue to cater to compilers not supporting
PG_USE_INLINE? I've tried to add support for them, but it does make
things considerably more #ifdef-y.
Afaik it's only HPUX's acc that has problems, and it seems to work but
generates warnings, so maybe that's ok?

Maybe we can simply silence that specific warning for HPUX when using
aC++ like in the attached patch? It's not like somebody really looks at
those anyway given how bleaty it is.
Or we can just generally remove the "quiet inline" check.

I haven't tested the patch since I don't have a HPUX machine but that's the option to use according to
http://h21007.www2.hp.com/portal/download/files/unprot/aCxx/Online_Help/options.htm#opt+Wargs

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

hpux-warnings.difftext/x-diff; charset=us-asciiDownload
diff --git a/src/template/hpux b/src/template/hpux
index ce4d93c..6349ae5 100644
--- a/src/template/hpux
+++ b/src/template/hpux
@@ -3,7 +3,9 @@
 CPPFLAGS="$CPPFLAGS -D_XOPEN_SOURCE_EXTENDED"
 
 if test "$GCC" != yes ; then
-  CC="$CC -Ae"
+  # -Ae enables C89/C99
+  # +W2177 should disable warnings about unused static inlines
+  CC="$CC -Ae +W2177"
   CFLAGS="+O2"
 fi
 
#19Andres Freund
andres@2ndquadrant.com
In reply to: Andres Freund (#16)
Re: better atomics - spinlock fallback?

Hi,

Instead of de-supporting platforms that don't have CAS support or
providing parallel implementations we could relatively easily build a
spinlock based fallback using the already existing requirement for
tas().
Something like an array of 16 spinlocks, indexed by a more advanced
version of ((char *)(&atomics) >> sizeof(char *)) % 16. The platforms
that would fall back aren't that likely to be used under heavy
concurrency, so the price for that shouldn't be too high.
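
Something like this, as a rough sketch under those assumptions - names are
made up, SpinLockInit() of the lock array at startup is elided, and the
existing slock_t/SpinLockAcquire()/SpinLockRelease() machinery is reused:

#define NUM_ATOMICS_EMULATION_LOCKS 16

static slock_t atomics_emulation_locks[NUM_ATOMICS_EMULATION_LOCKS];

#define ATOMIC_EMULATION_LOCK(ptr) \
	(&atomics_emulation_locks[((uintptr_t) (ptr) / sizeof(void *)) \
							  % NUM_ATOMICS_EMULATION_LOCKS])

bool
pg_atomic_compare_exchange_u32_impl(volatile atomic_uint32 *ptr,
									uint32 *expected, uint32 newval)
{
	bool		ret;
	slock_t    *lock = ATOMIC_EMULATION_LOCK(ptr);

	SpinLockAcquire(lock);
	if (ptr->value == *expected)
	{
		ptr->value = newval;
		ret = true;
	}
	else
	{
		*expected = ptr->value;
		ret = false;
	}
	SpinLockRelease(lock);
	return ret;
}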

The only real problem with that would be that we'd need to remove the
spinlock fallback for barriers, but that seems to be pretty much
disliked.

Thoughts?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#19)
Re: better atomics - spinlock fallback?

On Mon, Nov 11, 2013 at 1:57 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Instead of de-supporting platforms that don't have CAS support or
providing parallel implementations we could relatively easily build a
spinlock based fallback using the already existing requirement for
tas().
Something like an array of 16 spinlocks, indexed by a more advanced
version of ((char *)(&atomics) >> sizeof(char *)) % 16. The platforms
that would fall back aren't that likely to be used under heavy
concurrency, so the price for that shouldn't be too high.

The only real problem with that would be that we'd need to remove the
spinlock fallback for barriers, but that seems to be pretty much
disliked.

I think this is worth considering. I'm not too clear what to do about
the barriers problem, though. I feel like we've dug ourselves into a
bit of a hole, there, and I'm not sure I understand the issues well
enough to dig us back out of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#20)
Re: better atomics - spinlock fallback?

On 2013-11-12 13:21:30 -0500, Robert Haas wrote:

The only real problem with that would be that we'd need to remove the
spinnlock fallback for barriers, but that seems to be pretty much
disliked.

I think this is worth considering.

Ok, cool. The prototype patch I have for that is pretty small, so it doesn't
look too bad.

What currently scares me is the amount of code I have to write that I can't
test... I really can't see me being able to provide a patch that doesn't
require some buildfarm cycles to really work on all platforms.

I'm not too clear what to do about
the barriers problem, though. I feel like we've dug ourselves into a
bit of a hole, there, and I'm not sure I understand the issues well
enough to dig us back out of it.

I think any platform where we aren't able to provide a proper compiler/memory
barrier will also have broken spinlock release semantics (as in missing release
memory barrier). So arguably removing the fallback is a good idea anyway.
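
I.e. on a weakly ordered machine a correct release needs something along the
lines of the following sketch - not what s_lock.h's default does today:

#define S_UNLOCK(lock) \
	do { pg_memory_barrier(); *((volatile slock_t *) (lock)) = 0; } while (0)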

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers