dynamic shared memory and locks
One of the things that you might want to do with dynamic shared memory
is store a lock in it. In fact, my bet is that almost everything that
uses dynamic shared memory will want to do precisely that, because, of
course, it's dynamic *shared* memory, which means that it is
concurrently accessed by multiple processes, which tends to require
locking. Typically, what you're going to want are either spinlocks
(for very short critical sections) or lwlocks (for longer ones). It
doesn't really make sense to talk about storing heavyweight locks in
dynamic shared memory, because we're talking about storing locks with
the data structures that they protect, and heavyweight locks are used
to protect database or shared objects, not shared memory structures.
Of course, someone might think of trying to provide a mechanism for
the heavyweight lock manager to overflow to dynamic shared memory, but
that's a different thing altogether and not what I'm talking about
here.
Right now, storing spinlocks in dynamic shared memory *almost* works,
but there are problems with --disable-spinlocks. In that
configuration, we use semaphores to simulate spinlocks. Every time
someone calls SpinLockInit(), it's going to allocate a new semaphore
which will never be returned to the operating system, so you're pretty
quickly going to run out. There are a couple of things we could do
about this:
1. Decide we don't care. If you compile with --disable-spinlocks, and
then you try to use dynamic shared memory, it's going to leak
semaphores until none remain, and then start failing from there until
the postmaster is restarted. If you don't like that, provide a
working spinlock implementation for your platform.
2. Forbid the use of dynamic shared memory when compiling with
--disable-spinlocks. This is a more polite version of #1. It seems
likely to me that nearly every piece of code that uses dynamic shared
memory will require locking. Instead of letting people allocate
dynamic shared memory segments anyway and then having them start
failing shortly after postmaster startup, we could just head the
problem off at the pass by denying the request for dynamic shared
memory in the first place. Dynamic shared memory allocation can
always fail (e.g. because we're out of memory) and also has an
explicit off switch that will make all requests fail
(dynamic_shared_memory_type=none), so any code that uses dynamic
shared memory has to be prepared for a failure at that point, whereas
a failure in SpinLockInit() might be more surprising.
3. Provide an inverse for SpinLockInit, say SpinLockDestroy, and
require all code written for dynamic shared memory to invoke this
function on every spinlock before the shared memory segment is
destroyed. I initially thought that this could be done using the
on_dsm_detach infrastructure, but it turns out that doesn't really
work. The on_dsm_detach infrastructure is designed to make sure that
you *release* all of your locks when detaching - i.e. those hooks get
invoked for each process that detaches. For this, you'd need an
on_dsm_final_detach callback that gets called only for the very last
detach (and after prohibiting any other processes from attaching). I
can certainly engineer all that, but it's a decent amount of extra
work for everyone who wants to use dynamic shared memory to write the
appropriate callback, and because few people actually use
--disable-spinlocks, I think those callbacks will tend to be rather
lightly tested and thus a breeding ground for marginal bugs that
nobody's terribly excited about fixing.
4. Drop support for --disable-spinlocks.
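(To make the failure mode concrete before weighing those options, here is
a purely illustrative sketch of how the semaphore fallback behaves. The
helper name and counter are invented, and this is not the real spin.c
code, but it shows why every SpinLockInit() permanently consumes a
semaphore:)

typedef int slock_t;            /* under the fallback, names a semaphore */

static int semaphores_used = 0; /* stand-in for the OS semaphore pool */

static int
AllocateNewSemaphore(void)      /* invented name; real code would call
                                 * something like PGSemaphoreCreate() */
{
    return semaphores_used++;
}

void
SpinLockInit(volatile slock_t *lock)
{
    /* Every init consumes a fresh semaphore ... */
    *lock = AllocateNewSemaphore();
    /* ... and with no SpinLockDestroy(), it is never given back, so a
     * dsm segment created and destroyed repeatedly leaks one semaphore
     * per spinlock per cycle until the system-wide limit is hit. */
}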
For what it's worth, my vote is currently for #2. I can't think of
many interesting things to do with dynamic shared memory without having at
least spinlocks, so I don't think we'd be losing much. #1 seems
needlessly unfriendly, #3 seems like a lot of work for not much, and
#4 seems excessive at least as a solution to this particular problem,
though there may be other arguments for it. What do others think?
I think we're also going to want to be able to create LWLocks in
dynamic shared memory: some critical sections won't be short enough or
safe enough to be protected by spinlocks. At some level this seems
easy: change LWLockAcquire and friends to accept an LWLock * rather
than an LWLockId, and similarly change held_lwlocks[] to hold LWLock
pointers rather than LWLockIds. One tricky point is that you'd better
not detach a shared memory segment while you're still holding lwlocks
inside it - in other words, release all lwlocks in a segment before
detaching it - but I think just making that a coding rule shouldn't
cause any great problem. Error paths seem mostly OK as well: throwing
an error will call LWLockReleaseAll before doing the resource manager
cleanups that unmap the dynamic shared memory segment. There may be
corner cases I haven't thought about, though. A bigger problem is that I think we
want to avoid having a large amount of notational churn. The obvious
way to do that is to get rid of the LWLockId array and instead declare
each fixed LWLock separately as e.g. LWLock *ProcArrayLock. However,
creating a large number of new globals that will need to be
initialized in every new EXEC_BACKEND process seems irritating. So
maybe a better idea is to do something like this:
#define BufFreelistLock (&fixedlwlocks[0])
#define ShmemIndexLock (&fixedlwlocks[1])
...
#define AutoFileLock (&fixedlwlocks[36])
#define NUM_FIXED_LWLOCKS 37
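(To make that concrete, here is roughly what I have in mind; these
declarations are only a sketch of the proposal, not the final patch:)

/* Sketch only - pointer-taking signatures, not the committed interface. */
extern void LWLockAcquire(LWLock *lock, LWLockMode mode);
extern bool LWLockConditionalAcquire(LWLock *lock, LWLockMode mode);
extern void LWLockRelease(LWLock *lock);

/* held_lwlocks[] tracks pointers instead of LWLockIds. */
static LWLock *held_lwlocks[MAX_SIMUL_LWLOCKS];

/* A call site is unchanged, because ProcArrayLock just expands to
 * &fixedlwlocks[n]:  LWLockAcquire(ProcArrayLock, LW_SHARED); */

The point of the macro trick is that nothing outside lwlock.c needs to
know whether a fixed lock lives in the main array or somewhere else.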
Comments, suggestions?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2014-01-05 12:56:05 -0500, Robert Haas wrote:
Right now, storing spinlocks in dynamic shared memory *almost* works,
but there are problems with --disable-spinlocks. In that
configuration, we use semaphores to simulate spinlocks. Every time
someone calls SpinLockInit(), it's going to allocate a new semaphore
which will never be returned to the operating system, so you're pretty
quickly going to run out. There are a couple of things we could do
about this:
4. Drop support for --disable-spinlocks.
I very strongly vote 4). I think we're going to hit this more and more
often, and it's a facility that benefits almost nobody. Just about every
new platform is (or will be) on gcc or clang, and you can just duplicate
the compiler-provided generic implementation we already have for arm.
The atomics implementation makes this an automatic fallback if there's
no compiler-specific variant around.
I think we're also going to want to be able to create LWLocks in
dynamic shared memory: some critical sections won't be short enough or
safe enough to be protected by spinlocks.
Agreed.
At some level this seems easy: change LWLockAcquire and friends to
accept an LWLock * rather than an LWLockId, and similarly change
held_lwlocks[] to hold LWLock pointers rather than LWLockIds.
My primary reason isn't dsm TBH but wanting to embed the buffer lwlocks
in the bufferdesc, on the same cacheline as the buffer header's
spinlock. All the embedded ones can be allocated without padding, while
the relatively low number of non-embedded ones can be padded to the full
cacheline size.
A bigger problem is that I think we
want to avoid having a large amount of notational churn. The obvious
way to do that is to get rid of the LWLockId array and instead declare
each fixed LWLock separately as e.g. LWLock *ProcArrayLock. However,
creating a large number of new globals that will need to be
initialized in every new EXEC_BACKEND process seems irritating. So
maybe a better idea is to do something like this:
#define BufFreelistLock (&fixedlwlocks[0])
#define ShmemIndexLock (&fixedlwlocks[1])
...
#define AutoFileLock (&fixedlwlocks[36])
#define NUM_FIXED_LWLOCKS 37
Comments, suggestions?
My idea here was to just have two APIs, a legacy one that works like the
current one, and a new one that locks the lwlocks passed in by a
pointer. After all, LWLockAssign(), which extensions currently use,
returns an LWLockId. Seems ugly to turn that into a pointer.
But perhaps your idea is better anyway, no matter the hackery of turning
LWLockId into a pointer.
Greetings,
Andres Freund
Robert Haas <robertmhaas@gmail.com> writes:
For what it's worth, my vote is currently for #2. I can't think of
many interesting things to do with dynamic shared memory without having at
least spinlocks, so I don't think we'd be losing much. #1 seems
needlessly unfriendly, #3 seems like a lot of work for not much, and
#4 seems excessive at least as a solution to this particular problem,
though there may be other arguments for it. What do others think?
I agree with this position. There may be some good reason to drop
--disable-spinlocks altogether in future, but DSM isn't a sufficient
excuse.
I think we're also going to want to be able to create LWLocks in
dynamic shared memory: some critical sections won't be short enough or
safe enough to be protected by spinlocks. At some level this seems
easy: change LWLockAcquire and friends to accept an LWLock * rather
than an LWLockId, and similarly change held_lwlocks[] to hold LWLock
pointers rather than LWLockIds.
I seem to recall that there was some good reason for keeping all the
LWLocks in an array, back when the facility was first designed.
I'm too lazy to research the point right now, but you might want to
go back and look at the archives around when lwlock.c was written.
creating a large number of new globals that will need to be
initialized in every new EXEC_BACKEND process seems irritating.
This might've been the good reason, but not sure --- I think LWLocks
predate our Windows support.
In the end, though, that decision was taken before we were concerned
about being able to add new LWLocks on the fly, which is what this is
really about (whether they're stored in DSM or not is a secondary point).
The pressure for that has gotten strong enough that it may be time to
change the tradeoff.
regards, tom lane
On 2014-01-05 14:06:52 -0500, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
For what it's worth, my vote is currently for #2. I can't think of
many interesting things to do with dynamic shared memory without having at
least spinlocks, so I don't think we'd be losing much. #1 seems
needlessly unfriendly, #3 seems like a lot of work for not much, and
#4 seems excessive at least as a solution to this particular problem,
though there may be other arguments for it. What do others think?
I agree with this position. There may be some good reason to drop
--disable-spinlocks altogether in future, but DSM isn't a sufficient
excuse.
Agreed that DSM isn't sufficient cause. The reasons I see for removing
it in the future are:
* It's not tested at all and it has been partially broken for
significant periods of time. Afair Heikki just noticed the breakage
accidentally when adding enough new spinlocks recently.
* It's shown up as problematic in a couple of patches while adding not
much value (at least dsm, atomic ops, afair some others)
* It's slow enough that it's not of practical value.
* Implementing simple support for spinlocks on realistic platforms isn't
hard these days due to compiler intrinsics.
* The platforms that don't have a barrier implementation will rely on
spinlocks. And for correctness those spinlocks should employ
barriers. That might be more of an argument for getting rid of that
fallback tho.
I think we're also going to want to be able to create LWLocks in
dynamic shared memory: some critical sections won't be short enough or
safe enough to be protected by spinlocks. At some level this seems
easy: change LWLockAcquire and friends to accept an LWLock * rather
than an LWLockId, and similarly change held_lwlocks[] to hold LWLock
pointers rather than LWLockIds.
I seem to recall that there was some good reason for keeping all the
LWLocks in an array, back when the facility was first designed.
I'm too lazy to research the point right now, but you might want to
go back and look at the archives around when lwlock.c was written.
Your proposal is at
/messages/by-id/1054.1001520600@sss.pgh.pa.us -
afaics there hasn't been much discussion about implementation details at
that level.
In the end, though, that decision was taken before we were concerned
about being able to add new LWLocks on the fly, which is what this is
really about (whether they're stored in DSM or not is a secondary point).
The pressure for that has gotten strong enough that it may be time to
change the tradeoff.
I personally find the sharing of a cacheline between a buffer header's
spinlock and the lwlock much more interesting than DSM...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Jan 5, 2014 at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I seem to recall that there was some good reason for keeping all the
LWLocks in an array, back when the facility was first designed.
I'm too lazy to research the point right now, but you might want to
go back and look at the archives around when lwlock.c was written.
To some extent it's an orthogonal question. It's true that there
isn't much point in using LWLock * rather than LWLockId to refer to
LWLocks unless we wish to be able to store them outside the shared
memory segment, but the reverse is not true: just because we have the
ability to move things outside the main array in the general case
doesn't make it a good idea in any particular case. Andres's email
seems to indicate that he sees performance advantages in moving buffer
locks elsewhere, and I have a sneaking suspicion that we'll find that
it's more convenient to move some other things around as well, but
that's policy, not mechanism. Very little of the core LWLock
machinery actually cares how the locks are stored, so the attached
patch to make it use LWLock * rather than LWLockId as a handle is
pretty straightforward.
The only real problem I see here is that we occasionally *print out*
LWLockIds as a way of identifying lwlocks. This is not a great
system, but printing out pointers is almost certainly much worse (e.g.
because of ASLR). The cases where this is currently an issue are:
- You try to release a lwlock you haven't acquired. We elog(ERROR) the ID.
- You define LWLOCK_STATS. The resulting reports print the lock ID.
- You define LOCK_DEBUG and set trace_lwlocks=true. We print the lock
ID in the trace messages.
- You compile with --enable-dtrace. We pass the lock IDs to the dtrace probes.
In the attached patch I handled the first case by printing the pointer
(which I don't feel too bad about) and the remaining three cases by
leaving them broken. I wouldn't shed a tear about ripping out
trace_lwlocks, but LWLOCK_STATS is useful and I bet somebody is using
--enable-dtrace, so we probably need to fix those cases at least. I
suppose one option is to make LWLOCK_STATS and the dtrace probes only
look at locks in the main array and just ignore everything else. But
that's kind of crappy, especially if we're going to soon move buffer
locks out of the main array.
Another idea is to include some identifying information in the lwlock.
For example, each lwlock could have a char *name in it, and we could
print the name. In theory, this could be a big step forward in terms
of usability, because it'd spare us all needing to remember that 4 ==
ProcArrayLock. But it's awkward for buffer locks, of which there
might be a great many, and we surely don't want to allocate a
dynamically-generated string in shared memory for each one. You could
do a bit better by making the identifying information a string plus an
integer, because then all the buffer locks could set the string to a
static constant like "buffer content lock" and the integer to the
buffer number, and similarly for lock manager partition locks and so
on. This is appealing, but would increase the size of LWLockPadded
from 16 bytes to 32 on 64-bit platforms where slock_t is four bytes or
less, which I'm not that excited about even though it would reduce
cache line contention in some cases.
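(A sketch of what that might look like - the field names are invented
for illustration, and the existing lwlock fields are elided:)

typedef struct LWLock
{
    slock_t     mutex;          /* protects the lwlock's own state */
    /* ... existing shared/exclusive counters and wait queue ... */
    const char *tag;            /* static string, e.g. "buffer content lock" */
    int         instance;       /* e.g. buffer number; -1 where meaningless */
} LWLock;

With something like that, the error, trace, and dtrace sites could all
emit lock->tag and lock->instance instead of an array index, at the cost
of the extra pointer and integer per lock noted above.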
Yet a third idea is to try to reverse-engineer a name for a given
lwlock from the pointer address. If it's an offset into the main
array, this is easy enough to do, and even if we ended up with several
arrays (like one for bufmgr locks) it wouldn't be too hard to write
code to figure out which array contains it and emit the appropriate
string. The only real problem that I see with this is that it might
cause a performance hit. A performance hit when running with
trace_lwlocks or LWLOCK_STATS is not really a problem, but people
won't like it if --enable-dtrace slows things down.
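(Sketch of that approach - the function name and the name array are
hypothetical, and fixedlwlocks is the array from the macros upthread:)

static const char *
IdentifyLWLock(LWLock *lock)
{
    /* Is it one of the fixed locks in the main array? */
    if (lock >= fixedlwlocks && lock < fixedlwlocks + NUM_FIXED_LWLOCKS)
        return FixedLWLockNames[lock - fixedlwlocks];

    /* Similar range checks would follow for a bufmgr lock array, etc. */
    return "dynamically allocated lwlock";
}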
Preferences, other ideas?
None of these ideas are a complete solution for LWLOCK_STATS. In the
other three cases noted above, we only need an identifier for the lock
"instantaneously", so that we can pass it off to the logger or dtrace
or whatever. But LWLOCK_STATS wants to hold on to data about the
locks that were visited until the end of the session, and it does that
using an array that is *indexed* by lwlockid. I guess we could
replace that with a hash table. Ugh. Any suggestions?
(Incidentally, while developing this patch I found a bug in the
current code: lock.c iterates over all PGPROCs from 0 to
ProcGlobal->allProcCount and takes the backendLock for each, but if
max_prepared_transactions>0 then the PGPROCs for prepared transactions
do not have a backendLock and so we take and release BufFreelistLock -
i.e. 0 - a number of times equal to max_prepared_transactions. That's
not cool.)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
lwlock-pointers.patch (text/x-patch; +261/-189)
On 01/05/2014 07:56 PM, Robert Haas wrote:
Right now, storing spinlocks in dynamic shared memory *almost* works,
but there are problems with --disable-spinlocks. In that
configuration, we use semaphores to simulate spinlocks. Every time
someone calls SpinLockInit(), it's going to allocate a new semaphore
which will never be returned to the operating system, so you're pretty
quickly going to run out. There are a couple of things we could do
about this:
5. Allocate a fixed number of semaphores, and multiplex them to emulate
any number of spinlocks. For example, allocate 100 semaphores, and use
the address of the spinlock modulo 100 to calculate which semaphore to
use to emulate that spinlock.
That assumes that you never hold more than one spinlock at a time,
otherwise you can get deadlocks. I think that assumption holds
currently, because acquiring two spinlocks at a time would be bad on
performance grounds anyway.
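(Roughly along these lines - the function name is invented and the
constant 100 just matches the example above:)

#define NUM_SPINLOCK_SEMAPHORES 100

static int
SemaphoreIndexForSpinlock(volatile slock_t *lock)
{
    /*
     * Map the spinlock's address onto one of a fixed pool of semaphores
     * allocated at postmaster startup, instead of one semaphore per
     * spinlock.  Two distinct spinlocks may share a semaphore, which is
     * why holding two spinlocks at once could self-deadlock.
     */
    return (int) (((uintptr_t) lock) % NUM_SPINLOCK_SEMAPHORES);
}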
- Heikki
On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote:
On 01/05/2014 07:56 PM, Robert Haas wrote:
Right now, storing spinlocks in dynamic shared memory *almost* works,
but there are problems with --disable-spinlocks. In that
configuration, we use semaphores to simulate spinlocks. Every time
someone calls SpinLockInit(), it's going to allocate a new semaphore
which will never be returned to the operating system, so you're pretty
quickly going to run out. There are a couple of things we could do
about this:
5. Allocate a fixed number of semaphores, and multiplex them to emulate any
number of spinlocks. For example, allocate 100 semaphores, and use the
address of the spinlock modulo 100 to calculate which semaphore to use to
emulate that spinlock.
That assumes that you never hold more than one spinlock at a time, otherwise
you can get deadlocks. I think that assumption holds currently, because
acquiring two spinlocks at a time would be bad on performance grounds
anyway.
I think that's a pretty dangerous assumption - and one which would
break without just about anybody noticing, because nobody tests
--disable-spinlocks builds regularly. We could improve on that by adding
cassert checks against nested spinlocks, but that'd be far too high a
price for little benefit imo.
I am not convinced by the performance argument - there are lots of
spinlocks that are rarely if ever contended. If you lock two such locks
in a consistent nested fashion there won't be a performance problem.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
* Robert Haas (robertmhaas@gmail.com) wrote:
Another idea is to include some identifying information in the lwlock.
That was my immediate reaction to this issue...
For example, each lwlock could have a char *name in it, and we could
print the name. In theory, this could be a big step forward in terms
of usability, because it'd spare us all needing to remember that 4 ==
ProcArrayLock. But it's awkward for buffer locks, of which there
might be a great many, and we surely don't want to allocate a
dynamically-generated string in shared memory for each one. You could
do a bit better by making the identifying information a string plus an
integer, because then all the buffer locks could set the string to a
static constant like "buffer content lock" and the integer to the
buffer number, and similarly for lock manager partition locks and so
on. This is appealing, but would increase the size of LWLockPadded
from 16 bytes to 32 on 64-bit platforms where slock_t is four bytes or
less, which I'm not that excited about even though it would reduce
cache line contention in some cases.
I'm not thrilled with this either but at the same time, it may well be
worth it.
Yet a third idea is to try to reverse-engineer a name for a given
lwlock from the pointer address. If it's an offset into the main
array, this is easy enough to do, and even if we ended up with several
arrays (like one for bufmgr locks) it wouldn't be too hard to write
code to figure out which array contains it and emit the appropriate
string. The only real problem that I see with this is that it might
cause a performance hit. A performance hit when running with
trace_lwlocks or LWLOCK_STATS is not really a problem, but people
won't like if --enable-dtrace slow things up.
This seems like an interesting idea - but this and my other thought
(having a defined array of strings) seem like they might fall over in
the face of DSMs.
Preferences, other ideas?
My preference would generally be "use more memory rather than CPU time";
so I'd vote for adding identifying info rather than having the code hunt
through arrays to try and find it.
None of these ideas are a complete solution for LWLOCK_STATS. In the
other three cases noted above, we only need an identifier for the lock
"instantaneously", so that we can pass it off to the logger or dtrace
or whatever. But LWLOCK_STATS wants to hold on to data about the
locks that were visited until the end of the session, and it does that
using an array that is *indexed* by lwlockid. I guess we could
replace that with a hash table. Ugh. Any suggestions?
Yeah, that's not fun. No good suggestions here offhand.
Thanks,
Stephen
Andres Freund <andres@2ndquadrant.com> writes:
On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote:
That assumes that you never hold more than one spinlock at a time, otherwise
you can get deadlocks. I think that assumption holds currently, because
acquiring two spinlocks at a time would be bad on performance grounds
anyway.
I think that's a pretty dangerous assumption
I think it's a fine assumption. Code that could possibly do that should
never get within hailing distance of being committed, because you are only
supposed to have short straight-line bits of code under a spinlock.
Taking another spinlock breaks both the "straight line code" and the "no
loops" aspects of that coding rule.
regards, tom lane
On 2014-01-06 09:59:49 -0500, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote:
That assumes that you never hold more than one spinlock at a time, otherwise
you can get deadlocks. I think that assumption holds currently, because
acquiring two spinlocks at a time would be bad on performance grounds
anyway.
I think that's a pretty dangerous assumption
I think it's a fine assumption. Code that could possibly do that should
never get within hailing distance of being committed, because you are only
supposed to have short straight-line bits of code under a spinlock.
Taking another spinlock breaks both the "straight line code" and the "no
loops" aspects of that coding rule.
I think it's fair to say that no such code should be accepted - but I
personally don't trust the review process to prevent such cases, and not
all spinlock usages are obvious (e.g. LockBufHdr). And having rules
that only cause breakage in configurations that nobody ever uses doesn't
inspire much trust in that configuration.
If we go that way, we should add an Assert preventing recursive spinlock
acquisition...
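(Something like the following in assert-enabled builds, perhaps; the
counter name is invented, and this is only a sketch of the idea:)

#ifdef USE_ASSERT_CHECKING
extern int  nspinlocks_held;    /* per-backend, should only ever be 0 or 1 */

#define SpinLockAcquire(lock) \
    do { Assert(nspinlocks_held == 0); nspinlocks_held++; S_LOCK(lock); } while (0)
#define SpinLockRelease(lock) \
    do { S_UNLOCK(lock); nspinlocks_held--; Assert(nspinlocks_held == 0); } while (0)
#else
#define SpinLockAcquire(lock)   S_LOCK(lock)
#define SpinLockRelease(lock)   S_UNLOCK(lock)
#endif

(Robert's patch downthread does something similar with an
any_spinlock_held check.)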
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jan 6, 2014 at 9:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:
On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote:
That assumes that you never hold more than one spinlock at a time, otherwise
you can get deadlocks. I think that assumption holds currently, because
acquiring two spinlocks at a time would be bad on performance grounds
anyway.
I think that's a pretty dangerous assumption
I think it's a fine assumption. Code that could possibly do that should
never get within hailing distance of being committed, because you are only
supposed to have short straight-line bits of code under a spinlock.
Taking another spinlock breaks both the "straight line code" and the "no
loops" aspects of that coding rule.
OK, so we could do this. But should we? If we're going to get rid of
--disable-spinlocks for other reasons, then there's not much point in
spending the time to make it work with dynamic shared memory segments
that contain locks.
In fact, even if we *aren't* going to get rid of it, it's not 100%
clear to me that there's much point in supporting that intersection: a
big part of the point of dsm is to enable parallelism, and the point
of parallelism is to be fast, and if you want things to be fast, you
should probably have working spinlocks. Of course, there are some
other applications for dynamic shared memory as well, and in a quick
test on my laptop the performance of --disable-spinlocks wasn't
horrible, so this argument isn't watertight.
I guess the question boils down to: why are we keeping
--disable-spinlocks around? If we're expecting that people might
really use it for serious work, then it needs to remain and it needs
to work with dynamic shared memory. If we're expecting that people
might use it while porting PostgreSQL to a new platform but only as an
aid to get things up and running, then it needs to remain, but it's
probably OK for dynamic shared memory not to work when this option is
in use. If nobody uses it at all, then we can just forget the whole
thing.
I guess my bottom line here is that if --disable-spinlocks is
important to somebody, then I'm willing to expend some effort on
keeping it working. But if it's just inertia, then I'd rather not
spend a lot of time on it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
I guess the question boils down to: why are we keeping
--disable-spinlocks around? If we're expecting that people might
really use it for serious work, then it needs to remain and it needs
to work with dynamic shared memory. If we're expecting that people
might use it while porting PostgreSQL to a new platform but only as an
aid to get things up and running, then it needs to remain, but it's
probably OK for dynamic shared memory not to work when this option is
in use. If nobody uses it at all, then we can just forget the whole
thing.
I think we can eliminate the first of those. Semaphores for spinlocks
were a performance disaster fifteen years ago, and the situation has
surely only gotten worse since then. I do, however, believe that
--disable-spinlocks has some use while porting to a new platform.
(And I don't believe the argument advanced elsewhere in this thread
that everybody uses gcc; much less that gcc's atomic intrinsics are
of uniformly high quality even on oddball architectures.)
Heikki's idea has some attraction for porting work whether or not
you believe that DSM needs to work during initial porting. By
default, PG will try to create upwards of ten thousand spinlocks
just for buffer headers. It's likely that that will fail unless
you whack around the kernel's SysV parameters. Being able to
constrain the number of semaphores to something sane would probably
be useful.
Having said all that, I'm not personally going to take the time to
implement it, and I don't think the DSM patch should be required to
either. Somebody who actually needs it can go make it work.
But I think Heikki's idea points the way forward, so I'm now against
having the DSM code reject --disable-spinlocks. I think benign
neglect is the way for now.
regards, tom lane
On Mon, Jan 6, 2014 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think we can eliminate the first of those. Semaphores for spinlocks
were a performance disaster fifteen years ago, and the situation has
surely only gotten worse since then. I do, however, believe that
--disable-spinlocks has some use while porting to a new platform.
(And I don't believe the argument advanced elsewhere in this thread
that everybody uses gcc; much less that gcc's atomic intrinsics are
of uniformly high quality even on oddball architectures.)
Heikki's idea has some attraction for porting work whether or not
you believe that DSM needs to work during initial porting. By
default, PG will try to create upwards of ten thousand spinlocks
just for buffer headers. It's likely that that will fail unless
you whack around the kernel's SysV parameters. Being able to
constrain the number of semaphores to something sane would probably
be useful.
Having said all that, I'm not personally going to take the time to
implement it, and I don't think the DSM patch should be required to
either. Somebody who actually needs it can go make it work.
Well, I took a look at this and it turns out not to be very hard, so
here's a patch. Currently, we allocate 3 semaphores per shared buffer
and a bunch of others, but the 3 per shared buffer dominates, so you
end up with ~49k spinlocks for the default of 128MB shared_buffers
(128MB / 8kB = 16384 buffers, times three). I
chose to peg the number of semaphores at 1024, which is quite small
compared to the current allocation, but the number of spinlocks
that can actually be held at any given time is limited by
paradox, that should be enough to run at least a few dozen backends
without suffering serious problems due to the multiplexing -
especially because in real workloads, contention is usually
concentrated around a small number of spinlocks that are unlikely to
all be mapped to the same underlying semaphore.
I'm happy enough with this way forward. Objections?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
spin-multiplex.patch (text/x-patch; +98/-28)
Andres Freund <andres@2ndquadrant.com> writes:
On 2014-01-05 14:06:52 -0500, Tom Lane wrote:
I seem to recall that there was some good reason for keeping all the
LWLocks in an array, back when the facility was first designed.
I'm too lazy to research the point right now, but you might want to
go back and look at the archives around when lwlock.c was written.
Your proposal is at
/messages/by-id/1054.1001520600@sss.pgh.pa.us -
afaics there hasn't been much discussion about implementation details at
that level.
Yeah, there wasn't :-(. On reflection I believe that my reason for
building it like this was partially to reduce the number of pointer
variables needed, and partially to avoid exposing the contents of the
LWLock struct type to the rest of the backend.
Robert's idea of continuing to keep the "fixed" LWLocks in an array
would solve the first problem, but I don't think there's any way to
avoid exposing the struct type if we want to embed the things into
other structs, or even just rely on array indexing.
OTOH, the LWLock mechanism has been stable for long enough now that
we can probably suppose this struct is no more subject to churn than
any other widely-known one, so maybe that consideration is no longer
significant.
Another nice benefit of switching to separate allocations is that we could
get rid of the horrid kluge that lwlock.h is the place that defines
NUM_BUFFER_PARTITIONS and some other constants that don't belong there.
So on the whole I'm in favor of this, as long as we can avoid any
notational churn at most call sites.
Should we get rid of RequestAddinLWLocks() and LWLockAssign(), in
favor of telling extensions to just allocate some space and do
LWLockInit() on it?
BTW, NumLWLocks() also becomes a big problem if lwlock creation
gets decentralized. This is only used by SpinlockSemas() ...
so maybe we need to use Heikki's modulo idea to get rid of the
need for that to know the number of LWLocks, independently of DSM.
regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
Well, I took a look at this and it turns out not to be very hard, so
here's a patch. Currently, we allocate 3 semaphores per shared buffer
and a bunch of others, but the 3 per shared buffer dominates, so you
end up with ~49k spinlocks for the default of 128MB shared_buffers. I
chose to peg the number of semaphores at 1024, which is quite small
compared to the current allocation, but the number of spinlocks
that can actually be held at any given time is limited by
the number of running backends. Even allowing for the birthday
paradox, that should be enough to run at least a few dozen backends
without suffering serious problems due to the multiplexing -
especially because in real workloads, contention is usually
concentrated around a small number of spinlocks that are unlikely to
all be mapped to the same underlying semaphore.
I'm happy enough with this way forward. Objections?
-1 for the any_spinlock_held business (useless overhead IMO, as it doesn't
have anything whatsoever to do with enforcing the actual coding rule).
And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h,
and maybe dropping SpinlockSemas() altogether in favor of just referencing
the constant. Otherwise this seems reasonable.
regards, tom lane
On Mon, Jan 6, 2014 at 2:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Well, I took a look at this and it turns out not to be very hard, so
here's a patch. Currently, we allocate 3 semaphores per shared buffer
and a bunch of others, but the 3 per shared buffer dominates, so you
end up with ~49k spinlocks for the default of 128MB shared_buffers. I
chose to peg the number of semaphores at 1024, which is quite small
compared to the current allocation, but the number of spinlocks
that can actually be held at any given time is limited by
the number of running backends. Even allowing for the birthday
paradox, that should be enough to run at least a few dozen backends
without suffering serious problems due to the multiplexing -
especially because in real workloads, contention is usually
concentrated around a small number of spinlocks that are unlikely to
all be mapped to the same underlying semaphore.
I'm happy enough with this way forward. Objections?
-1 for the any_spinlock_held business (useless overhead IMO, as it doesn't
have anything whatsoever to do with enforcing the actual coding rule).
Hmm. I thought that was a pretty well-aimed bullet myself; why do you
think that it isn't? I don't particularly mind ripping it out, but it
seemed like a good automated test to me.
And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h,
and maybe dropping SpinlockSemas() altogether in favor of just referencing
the constant. Otherwise this seems reasonable.
As far as pg_config_manual.h is concerned, is this the sort of thing
you have in mind?
#ifndef HAVE_SPINLOCKS
#define NUM_SPINLOCK_SEMAPHORES 1024
#endif
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jan 6, 2014 at 1:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
OTOH, the LWLock mechanism has been stable for long enough now that
we can probably suppose this struct is no more subject to churn than
any other widely-known one, so maybe that consideration is no longer
significant.
On the whole, I'd say it's been more stable than most. But even if we
do decide to change it, I'm not sure that really matters very much.
We whack widely-used data structures around fairly regularly, and I
haven't observed that to cause many problems. Just as one concrete
example, PGPROC changes pretty regularly, and I'm not aware of much
code that's been broken by that.
Should we get rid of RequestAddinLWLocks() and LWLockAssign(), in
favor of telling extensions to just allocate some space and do
LWLockInit() on it?
I don't really see any reason to deprecate that mechanism - it'll just
make it harder for people who want to write code that works on
multiple PostgreSQL releases to do so easily, and it's my belief that
there's a decent amount of third-party code that uses those APIs.
EnterpriseDB has some, for example. Broadly, I don't see any problem
with presuming that most code that uses lwlocks will be happy to keep
those lwlocks in the main array while allowing them to be stored
elsewhere in those cases where that's not convenient. Perhaps in the
long run that won't be true, but then again perhaps it will. Either
way, I don't see a real reason to rush into a change in policy.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jan 6, 2014 at 2:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
-1 for the any_spinlock_held business (useless overhead IMO, as it doesn't
have anything whatsoever to do with enforcing the actual coding rule).
Hmm. I thought that was a pretty well-aimed bullet myself; why do you
think that it isn't? I don't particularly mind ripping it out, but it
seemed like a good automated test to me.
The coding rule is "only short straight-line code while holding a
spinlock". That could be violated in any number of nasty ways without
trying to take another spinlock. And since the whole point of the rule is
that spinlock-holding code segments should be quick, adding more overhead
to them doesn't seem very nice, even if it is only done in Assert builds.
I agree it'd be nicer if we had some better way than mere manual
inspection to enforce proper use of spinlocks; but this change doesn't
seem to me to move the ball downfield by any meaningful distance.
And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h,
and maybe dropping SpinlockSemas() altogether in favor of just referencing
the constant. Otherwise this seems reasonable.
As far as pg_config_manual.h is concerned, is this the sort of thing
you have in mind?
#ifndef HAVE_SPINLOCKS
#define NUM_SPINLOCK_SEMAPHORES 1024
#endif
I think we can just define it unconditionally, don't you? It shouldn't
get referenced in HAVE_SPINLOCKS builds. Or is the point that you want
to enforce that?
regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jan 6, 2014 at 1:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
OTOH, the LWLock mechanism has been stable for long enough now that
we can probably suppose this struct is no more subject to churn than
any other widely-known one, so maybe that consideration is no longer
significant.
On the whole, I'd say it's been more stable than most. But even if we
do decide to change it, I'm not sure that really matters very much.
Actually, the real value of a module-local struct definition is that you
can be pretty darn sure that nothing except the code in that file is
manipulating the struct contents. I would've preferred that we expose
only an abstract struct definition, but don't quite see how to do that
if we're going to embed the things in buffer headers.
regards, tom lane
On Mon, Jan 6, 2014 at 3:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jan 6, 2014 at 2:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
-1 for the any_spinlock_held business (useless overhead IMO, as it doesn't
have anything whatsoever to do with enforcing the actual coding rule).
Hmm. I thought that was a pretty well-aimed bullet myself; why do you
think that it isn't? I don't particularly mind ripping it out, but it
seemed like a good automated test to me.
The coding rule is "only short straight-line code while holding a
spinlock". That could be violated in any number of nasty ways without
trying to take another spinlock. And since the whole point of the rule is
that spinlock-holding code segments should be quick, adding more overhead
to them doesn't seem very nice, even if it is only done in Assert builds.
I agree it'd be nicer if we had some better way than mere manual
inspection to enforce proper use of spinlocks; but this change doesn't
seem to me to move the ball downfield by any meaningful distance.
Well, my thought was mainly that, while it may be a bad idea to take
another spinlock while holding a spinlock under any circumstances,
somebody might do it and it might appear to work just fine. The most
likely sequence seems to me to be something like SpinLockAcquire(...)
followed by LWLockConditionalAcquire(), thinking that things are OK
because the lock acquisition is conditional - but in fact the
conditional acquire still takes the spinlock unconditionally. Even
so, without --disable-spinlocks, this will probably work just fine.
Most likely there won't even be a performance problem, because a
lwlock has to be *very* hot for the spinlock acquisitions to become a
problem. But with --disable-spinlocks, it will unpredictably
self-deadlock. And since few developers are likely to test with
--disable-spinlocks, the problem most likely won't be noticed. When
someone does then try to use --disable-spinlocks to port to a new
platform and starts hitting the deadlocks, they may not know enough to
attribute them to the real cause. All in all it doesn't seem like
unduly expensive insurance, but I can remove it if the consensus is
against it.
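(To spell out the scenario I mean - illustrative pseudo-code only, with
made-up lock and function names:)

/* Looks harmless with real spinlocks, but can self-deadlock once
 * spinlocks are multiplexed onto a small pool of semaphores. */
SpinLockAcquire(&buf->header_spinlock);        /* made-up spinlock */

/* "Conditional" sounds safe, but LWLockConditionalAcquire() still takes
 * the lwlock's internal spinlock unconditionally ... */
if (LWLockConditionalAcquire(SomeLock, LW_EXCLUSIVE))
{
    /* ... and if that internal spinlock maps to the same semaphore as
     * header_spinlock, the backend waits on a semaphore it already
     * holds and never returns. */
    do_something();                            /* made-up */
    LWLockRelease(SomeLock);
}

SpinLockRelease(&buf->header_spinlock);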
And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h,
and maybe dropping SpinlockSemas() altogether in favor of just referencing
the constant. Otherwise this seems reasonable.
As far as pg_config_manual.h is concerned, is this the sort of thing
you have in mind?
#ifndef HAVE_SPINLOCKS
#define NUM_SPINLOCK_SEMAPHORES 1024
#endif
I think we can just define it unconditionally, don't you? It shouldn't
get referenced in HAVE_SPINLOCKS builds. Or is the point that you want
to enforce that?
I don't think it really matters much; seems like a style question to
me. That's why I asked.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company