CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Started by Robert Haasover 15 years ago93 messageshackers
Jump to latest
#1Robert Haas
robertmhaas@gmail.com

Kevin didn't send out an official gavel-banging announcement of the
end of CommitFest 2009-07 (possibly because I neglected until today to
give him privileges to actually change it in the web application), but
I'd just like to take a minute to thank him publicly for his efforts.
We started this CommitFest with something like 60 patches, which is
definitely on the larger side for a CommitFest, and Kevin did a great
job staying on top of what was going on with all of them and, I felt,
really helped keep us on track. At the same time, I felt he did this
with a very light touch that made the whole thing go very smoothly.
So -- thanks, Kevin!

I also appreciate the efforts of all those who reviewed. Good reviews
are really critical to keep the burden from building up on committers,
and I appreciate the efforts of everyone who contributed, in many
cases probably on their own time. I'm particularly grateful to the
people who were vigilant about spelling, grammar, coding style,
whitespace, and other nitpicky little issues that are not much fun,
but which at least for me are a major time sink if they're still
lingering when it comes time to do the actual commit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#1)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Robert Haas <robertmhaas@gmail.com> wrote:

I'd just like to take a minute to thank him publicly for his
efforts. We started this CommitFest with something like 60
patches, which is definitely on the larger side for a CommitFest,
and Kevin did a great job staying on top of what was going on with
all of them and, I felt, really helped keep us on track. At the
same time, I felt he did this with a very light touch that made
the whole thing go very smoothly. So -- thanks, Kevin!

You're welcome. It was educational for me. I don't think I want to
try to handle two in a row, and I think your style is better suited
than mine to the final CF for a release, but I might be able to take
on the 2010-11 CF if people want that.

My hand was not always so light behind the scenes, though -- I sent
or received about 100 off-list emails to try to keep things moving.
Hopefully nobody was too offended by my nagging. :-)

Oh, and thanks for putting together the CF web application. Without
that, I couldn't have done half as well as I did.

I also appreciate the efforts of all those who reviewed.

Yes, I'll second that! I've always been impressed with the
PostgreSQL community, and managing this CF gave me new insights and
appreciation for the intelligence, professionalism, and community
spirit of its members -- authors, reviewers, and committers.

-Kevin

#3Greg Smith
gsmith@gregsmith.com
In reply to: Kevin Grittner (#2)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Kevin Grittner wrote:

I don't think I want to try to handle two in a row, and I think your style is better suited
than mine to the final CF for a release, but I might be able to take on the 2010-11 CF if people want that

Ha, you just put yourself right back on the hook with that comment, and
Robert does seem like the right guy for CF-4 @ 2011-01. Leaving the
question of what's going to happen with CF-2 next month.

I think the crucial thing with the 2010-09 CF is that we have to get
serious progress made sorting out all the sync rep ideas before/during
that one. The review Yeb did and subsequent discussion was really
helpful, but the scope on that needs to actually get nailed down to
*something* concrete if it's going to get built early enough in the 9.1
release to be properly reviewed and tested for more than one round.
Parts of the design and scope still feel like they're expanding to me,
and I think having someone heavily involved in the next CF who is
willing to push on nailing down that particular area is pretty
important. Will volunteer myself if I can stay on schedule to make it
past the major time commitment sink I've had so far this year by then.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us

#4Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#3)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On Wed, Aug 18, 2010 at 7:46 PM, Greg Smith <greg@2ndquadrant.com> wrote:

Kevin Grittner wrote:

I don't think I want to try to handle two in a row, and I think your style
is better suited
than mine to the final CF for a release, but I might be able to take on
the 2010-11 CF if people want that

Ha, you just put yourself right back on the hook with that comment, and
Robert does seem like the right guy for CF-4 @ 2011-01.  Leaving the
question of what's going to happen with CF-2 next month.

My reputation precedes me, apparently. Although I appreciate everyone
so far being willing to avoid mentioning exactly what that reputation
might be. :-)

I think the crucial thing with the 2010-09 CF is that we have to get serious
progress made sorting out all the sync rep ideas before/during that one.
 The review Yeb did and subsequent discussion was really helpful, but the
scope on that needs to actually get nailed down to *something* concrete if
it's going to get built early enough in the 9.1 release to be properly
reviewed and tested for more than one round.  Parts of the design and scope
still feel like they're expanding to me, and I think having someone heavily
involved in the next CF who is willing to push on nailing down that
particular area is pretty important.  Will volunteer myself if I can stay on
schedule to make it past the major time commitment sink I've had so far this
year by then.

Sitting on Sync Rep is a job and a half by itself, without adding all
the other CF work on top of it. Maybe we should try to find two
vi^Holunteers: a CommitFest Manager (CFM) and a Major Feature
Babysitter (MBS). At any rate, we should definitely NOT wait another
month to start thinking about Sync Rep again. I haven't actually
looked at any of the Sync Rep code AT ALL but IIRC Heikki expressed
the view that the biggest thing standing in the way of a halfway
decent Sync Rep implementation was a number of polling loops that
needed to be replaced with something that wouldn't introduce
up-to-100ms delays. And so far we haven't seen a patch for that.
Somebody write one. And then let's get it reviewed and committed RSN.
It may seem like we're early in the release cycle yet, but for a
feature of this magnitude we are not. We committed way too much big
stuff at the very end of the last release cycle; Hot Standby was still
being cleaned up in May after commit in November. We'll be lucky to
commit sync rep that early.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#5Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#4)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 19/08/10 04:46, Robert Haas wrote:

At any rate, we should definitely NOT wait another
month to start thinking about Sync Rep again.

Agreed. EnterpriseDB is interested in having that feature, so I'm on the
hook to spend time on it regardless of commitfests.

I haven't actually
looked at any of the Sync Rep code AT ALL but IIRC Heikki expressed
the view that the biggest thing standing in the way of a halfway
decent Sync Rep implementation was a number of polling loops that
needed to be replaced with something that wouldn't introduce
up-to-100ms delays.

Well, that's the only uncontroversial thing about it that doesn't
require any fighting over the UI or desired behavior. That's why I've
focused on that first, and also because it's useful regardless of
synchronous replication. But once that's done, we'll have to nail down
how synchronous replication is supposed to behave, and how to configure it.

And so far we haven't seen a patch for that.
Somebody write one. And then let's get it reviewed and committed RSN.

Fujii is on vacation, but I've started working on it. The two issues
with Fujii's latest patch are that it would not respond promptly on
platforms where signals don't interrupt sleep, and it suffers the
classic race condition that pselect() was invented for. I'm going to
replace pg_usleep() with select(), and use the so called "self-pipe
trick" to get over the race condition. I have that written up but I want
to do some testing and cleanup before posting the patch.

It may seem like we're early in the release cycle yet, but for a
feature of this magnitude we are not. We committed way too much big
stuff at the very end of the last release cycle; Hot Standby was still
being cleaned up in May after commit in November. We'll be lucky to
commit sync rep that early.

Agreed. We need to decide the scope and minimum set of features real
soon to get something concrete finished.

BTW, on what platforms signals don't interrupt sleep? Although that
issue has been discussed many times before, I couldn't find any
reference to a real platform in the archives.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#5)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

BTW, on what platforms signals don't interrupt sleep? Although that
issue has been discussed many times before, I couldn't find any
reference to a real platform in the archives.

I've got one in captivity (my old HPUX box). Happy to test whatever you
come up with.

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?

regards, tom lane

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Heikki Linnakangas (#5)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010:

On 19/08/10 04:46, Robert Haas wrote:

And so far we haven't seen a patch for that.
Somebody write one. And then let's get it reviewed and committed RSN.

Fujii is on vacation, but I've started working on it. The two issues
with Fujii's latest patch are that it would not respond promptly on
platforms where signals don't interrupt sleep, and it suffers the
classic race condition that pselect() was invented for. I'm going to
replace pg_usleep() with select(), and use the so called "self-pipe
trick" to get over the race condition. I have that written up but I want
to do some testing and cleanup before posting the patch.

Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because
select() doesn't handle pipes, only sockets. You may need some extra
hack to make it work there.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#8Magnus Hagander
magnus@hagander.net
In reply to: Alvaro Herrera (#7)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On Thu, Aug 19, 2010 at 17:08, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010:

On 19/08/10 04:46, Robert Haas wrote:

 And so far we haven't seen a patch for that.
Somebody write one.  And then let's get it reviewed and committed RSN.

Fujii is on vacation, but I've started working on it. The two issues
with Fujii's latest patch are that it would not respond promptly on
platforms where signals don't interrupt sleep, and it suffers the
classic race condition that pselect() was invented for. I'm going to
replace pg_usleep() with select(), and use the so called "self-pipe
trick" to get over the race condition. I have that written up but I want
to do some testing and cleanup before posting the patch.

Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because
select() doesn't handle pipes, only sockets.  You may need some extra
hack to make it work there.

We have a pipe implementation using sockets that is used on Windows
for just this reason, IIRC.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#9Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alvaro Herrera (#7)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 19/08/10 18:08, Alvaro Herrera wrote:

Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010:

On 19/08/10 04:46, Robert Haas wrote:

And so far we haven't seen a patch for that.
Somebody write one. And then let's get it reviewed and committed RSN.

Fujii is on vacation, but I've started working on it. The two issues
with Fujii's latest patch are that it would not respond promptly on
platforms where signals don't interrupt sleep, and it suffers the
classic race condition that pselect() was invented for. I'm going to
replace pg_usleep() with select(), and use the so called "self-pipe
trick" to get over the race condition. I have that written up but I want
to do some testing and cleanup before posting the patch.

Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because
select() doesn't handle pipes, only sockets. You may need some extra
hack to make it work there.

That's fine, we can do the naive set-a-flag implementation on Windows
because on Windows our "signals" are only delivered at
CHECK_FOR_INTERRUPTS(), so we don't have the race condition to begin with.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#6)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 19/08/10 16:38, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

BTW, on what platforms signals don't interrupt sleep? Although that
issue has been discussed many times before, I couldn't find any
reference to a real platform in the archives.

I've got one in captivity (my old HPUX box). Happy to test whatever you
come up with.

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?

Instead of using pg_usleep(), call select() directly, waiting not only
for the timeout, but also for data to arrive on the "self-pipe". The
signal handler writes a byte to the self-pipe, waking up the select().
That way the select() is interupted by the signal arriving, even if
signals per se don't interrupt it. And it closes the race condition
involved with setting a flag in the signal handler and checking that in
the main loop.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#10)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 19/08/10 16:38, Tom Lane wrote:

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?

Instead of using pg_usleep(), call select() directly, waiting not only
for the timeout, but also for data to arrive on the "self-pipe". The
signal handler writes a byte to the self-pipe, waking up the select().
That way the select() is interupted by the signal arriving, even if
signals per se don't interrupt it. And it closes the race condition
involved with setting a flag in the signal handler and checking that in
the main loop.

Hmm, but couldn't you still do that inside pg_usleep? Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.

regards, tom lane

#12Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#11)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 19/08/10 19:57, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 19/08/10 16:38, Tom Lane wrote:

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?

Instead of using pg_usleep(), call select() directly, waiting not only
for the timeout, but also for data to arrive on the "self-pipe". The
signal handler writes a byte to the self-pipe, waking up the select().
That way the select() is interupted by the signal arriving, even if
signals per se don't interrupt it. And it closes the race condition
involved with setting a flag in the signal handler and checking that in
the main loop.

Hmm, but couldn't you still do that inside pg_usleep? Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.

I don't understand, do what inside pg_usleep()?

We only need to respond quickly to one signal, the one that tells
walsender "there's some new WAL that you should send". We can rely on
polling for all the other signals like SIGHUP for config reload or
shutdown request, like we do today.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#12)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 19/08/10 19:57, Tom Lane wrote:

Hmm, but couldn't you still do that inside pg_usleep? Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.

I don't understand, do what inside pg_usleep()?

I was envisioning still using pg_usleep, and having this interaction
between signal handlers and the delay be private to pg_usleep, rather
than putting such ugly code into forty-nine different places. There
are a *lot* of places where we have loops that break down delays
into at-most-one-second pg_usleep calls, and if we're going to have a
hack like this we should fix them all, not just two or three that SR
cares about.

But I'm still not seeing how this self-pipe hack avoids a race
condition. If the signal handler is sending a byte whenever it
executes, then you could have bytes already stacked up in the pipe
from previous interrupts that didn't happen to come while inside
pg_usleep. If you clear those before sleeping, you have a race
condition, and if you don't, then you fail to sleep the intended
amount of time even though there was no interrupt this time.

regards, tom lane

#14Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#13)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 19/08/10 20:18, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 19/08/10 19:57, Tom Lane wrote:

Hmm, but couldn't you still do that inside pg_usleep? Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.

I don't understand, do what inside pg_usleep()?

I was envisioning still using pg_usleep, and having this interaction
between signal handlers and the delay be private to pg_usleep, rather
than putting such ugly code into forty-nine different places. There
are a *lot* of places where we have loops that break down delays
into at-most-one-second pg_usleep calls, and if we're going to have a
hack like this we should fix them all, not just two or three that SR
cares about.

Hmm, will need to think about a suitable API for that. The nice thing
would be that we could implement it using pselect() where available.
(And reliable - the Linux select() man page says that glibc's pselect()
is emulated using select(), and suffers from the very same race
condition pselect() was invented to solve. How awful is that!?)

But I'm still not seeing how this self-pipe hack avoids a race
condition. If the signal handler is sending a byte whenever it
executes, then you could have bytes already stacked up in the pipe
from previous interrupts that didn't happen to come while inside
pg_usleep. If you clear those before sleeping, you have a race
condition, and if you don't, then you fail to sleep the intended
amount of time even though there was no interrupt this time.

You clear the pipe after waking up. Before sending all the pending WAL
and deciding that you're fully caught up again:

for(;;)
{
1. select()
2. clear pipe
3. send any pending WAL
}

There's more information on the self-pipe trick at e.g
http://lwn.net/Articles/177897/

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#14)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 19/08/10 20:18, Tom Lane wrote:

But I'm still not seeing how this self-pipe hack avoids a race
condition. If the signal handler is sending a byte whenever it
executes, then you could have bytes already stacked up in the pipe
from previous interrupts that didn't happen to come while inside
pg_usleep. If you clear those before sleeping, you have a race
condition, and if you don't, then you fail to sleep the intended
amount of time even though there was no interrupt this time.

You clear the pipe after waking up.

Hmm ... that doesn't answer my second objection about failing to sleep
the expected amount of time, but on reflection I guess that can't be
done: we have to be able to cope with interrupts occurring just before
the sleep actually begins, and there's no way to define "just before"
except by reference to the calling code having done whatever it might
need to do before/after sleeping.

Hmm, will need to think about a suitable API for that.

Offhand I'd suggest something like

SetSleepInterrupt() -- called by signal handlers, writes pipe
ClearSleepInterrupt() -- called by sleep-and-do-something loops, clears pipe

pg_usleep() itself remains the same, but it is now guaranteed to return
immediately if SetSleepInterrupt is called, or has been called since the
last ClearSleepInterrupt.

The nice thing
would be that we could implement it using pselect() where available.
(And reliable - the Linux select() man page says that glibc's pselect()
is emulated using select(), and suffers from the very same race
condition pselect() was invented to solve. How awful is that!?)

Ick. So how would you tell if it's trustworthy?

regards, tom lane

#16tomas@tuxteam.de
tomas@tuxteam.de
In reply to: Heikki Linnakangas (#14)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Aug 19, 2010 at 08:36:09PM +0300, Heikki Linnakangas wrote:

[...]

Hmm, will need to think about a suitable API for that. The nice thing would
be that we could implement it using pselect() where available. (And
reliable - the Linux select() man page says that glibc's pselect() is
emulated using select(), and suffers from the very same race condition
pselect() was invented to solve. How awful is that!?)

It is indeed. It seems, though, that from Linux kernel 2.6.16 and glibc
2.4 on, things look better [1]<http://lwn.net/Articles/176911/&gt;. As a reference, Debian stable (not known
to adventure too far into the present ;-) is libc 2.7 on kernel 2.6.26.

Of course, "enterprise" GNU/Linux distros are said to be even more
conservative...

[1]: <http://lwn.net/Articles/176911/&gt;

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFMbiauBcgs9XrR2kYRAhiwAJ41f29jSIy409epTH0eJRXW17oByACeIkRo
CRg2BCw8tn3PkdnNR1i/MCY=
=GVMT
-----END PGP SIGNATURE-----

#17Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#15)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 19/08/10 20:59, Tom Lane wrote:

Offhand I'd suggest something like

SetSleepInterrupt() -- called by signal handlers, writes pipe
ClearSleepInterrupt() -- called by sleep-and-do-something loops, clears pipe

pg_usleep() itself remains the same, but it is now guaranteed to return
immediately if SetSleepInterrupt is called, or has been called since the
last ClearSleepInterrupt.

Hmm, we have pg_usleep() calls in some fairly low-level functions, like
mdunlink() and s_lock(). If someone has called SetSleepInterrupt(), we
don't want those pg_usleep()s to return immediately. And pg_usleep() is
used in some client code too. I think we need a separate sleep function
for this.

Another idea is to not use unix signals at all, but ProcSendSignal() and
ProcWaitForSignal(). We would not need the signal handler at all.
Walsender would use ProcWaitForSignal() instead of pg_usleep() and
backends that want to wake it up would use ProcSendSignal(). The problem
is that there is currently no way to specify a timeout, but I presume
the underlying semaphore operations have that capability, and we could
expose it.

Actually ProcSendSignal()/ProcWaitForSignal() won't work as is, because
walsender doesn't have a PGPROC entry, but you could easily build a
similar mechanism, using PGSemaphoreLock/Unlock like
ProcSendSignal()/WaitForSignal() does.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#17)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Hmm, we have pg_usleep() calls in some fairly low-level functions, like
mdunlink() and s_lock(). If someone has called SetSleepInterrupt(), we
don't want those pg_usleep()s to return immediately. And pg_usleep() is
used in some client code too. I think we need a separate sleep function
for this.

Well, we'd need some careful thought about which sleeps need what, but I
don't necessarily have an objection to a separate interruptable sleep
function.

Another idea is to not use unix signals at all, but ProcSendSignal() and
ProcWaitForSignal(). We would not need the signal handler at all.
Walsender would use ProcWaitForSignal() instead of pg_usleep() and
backends that want to wake it up would use ProcSendSignal().

You keep on proposing solutions that only work for walsender :-(.
Most of the other places where we have pg_usleep actually do want
a timed sleep, I believe. It's also unclear that we can always expect
ProcSendSignal to be usable --- for example, stuff like SIGHUP would
be sent by processes that might not be connected to shared memory.

The problem
is that there is currently no way to specify a timeout, but I presume
the underlying semaphore operations have that capability, and we could
expose it.

They don't, or at least the semop-based implementation doesn't.

regards, tom lane

#19Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#18)
Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

On 20/08/10 16:24, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

Hmm, we have pg_usleep() calls in some fairly low-level functions, like
mdunlink() and s_lock(). If someone has called SetSleepInterrupt(), we
don't want those pg_usleep()s to return immediately. And pg_usleep() is
used in some client code too. I think we need a separate sleep function
for this.

Well, we'd need some careful thought about which sleeps need what, but I
don't necessarily have an objection to a separate interruptable sleep
function.

If we have to, we could also support multiple interrupts with multiple
self-pipes, so that you can choose at pg_usleep() which ones to wake up on.

Another idea is to not use unix signals at all, but ProcSendSignal() and
ProcWaitForSignal(). We would not need the signal handler at all.
Walsender would use ProcWaitForSignal() instead of pg_usleep() and
backends that want to wake it up would use ProcSendSignal().

You keep on proposing solutions that only work for walsender :-(.

Well yes, the other places where we use pg_usleep() are not really a
problem as is. If as a side-effect we can make them respond more quickly
to signals, with small changes, that's good, but walsender is the only
one that's performance critical.

That said, a select() based solution is my current favorite.

Most of the other places where we have pg_usleep actually do want
a timed sleep, I believe. It's also unclear that we can always expect
ProcSendSignal to be usable --- for example, stuff like SIGHUP would
be sent by processes that might not be connected to shared memory.

The problem
is that there is currently no way to specify a timeout, but I presume
the underlying semaphore operations have that capability, and we could
expose it.

They don't, or at least the semop-based implementation doesn't.

There's semtimedop(). I don't know how portable it is, but it seems to
exist at least on Linux, Solaris, HPUX and AIX. On what platforms do we
use sysv semaphores?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#19)
Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

[ It's way past time to change the thread title ]

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 20/08/10 16:24, Tom Lane wrote:

You keep on proposing solutions that only work for walsender :-(.

Well yes, the other places where we use pg_usleep() are not really a
problem as is.

Well, yes they are. They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle. See for example a
longstanding complaint here:
https://bugzilla.redhat.com/show_bug.cgi?id=252129

If we're going to go to the trouble of having a mechanism like this,
I'd like it to fix that problem so I can close out that bug.

There's semtimedop(). I don't know how portable it is, but it seems to
exist at least on Linux, Solaris, HPUX and AIX.

It's not on my HPUX, and I don't see it in the Single Unix Spec.

On what platforms do we use sysv semaphores?

AFAIK, everything except Windows and extremely old versions of OS X.

regards, tom lane

#21Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#21)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#22)
#24Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alvaro Herrera (#23)
#25Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#22)
#26Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#25)
#27Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#26)
#28Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#20)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#29)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#28)
#32Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#31)
#33Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#31)
#34Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#27)
#35Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#34)
#36Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#26)
#37Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#31)
#38Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#36)
#39Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#38)
#40Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#40)
#42Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#41)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#42)
#44Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#43)
#45Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#41)
#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#44)
#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#45)
#48Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#46)
#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#48)
#50Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#49)
#51Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#47)
#52Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#51)
#53Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#52)
#54Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#53)
#55Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Tom Lane (#49)
#56Robert Haas
robertmhaas@gmail.com
In reply to: Ron Mayer (#55)
#57Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#50)
#58Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#57)
#59Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#57)
#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#58)
#61Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#60)
#62Bruce Momjian
bruce@momjian.us
In reply to: Heikki Linnakangas (#57)
#63Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#60)
#64Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#54)
#65Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#64)
#66Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#65)
#67Markus Wanner
markus@bluegap.ch
In reply to: Heikki Linnakangas (#66)
#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Wanner (#67)
#69Markus Wanner
markus@bluegap.ch
In reply to: Tom Lane (#68)
#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Wanner (#69)
#71Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Markus Wanner (#69)
#72Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Markus Wanner (#67)
#73Markus Wanner
markus@bluegap.ch
In reply to: Heikki Linnakangas (#72)
#74Markus Wanner
markus@bluegap.ch
In reply to: Tom Lane (#70)
#75Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Markus Wanner (#74)
#76Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#75)
#77Martijn van Oosterhout
kleptog@svana.org
In reply to: Markus Wanner (#67)
#78Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#77)
#79Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Martijn van Oosterhout (#77)
#80Markus Wanner
markus@bluegap.ch
In reply to: Tom Lane (#76)
#81Markus Wanner
markus@bluegap.ch
In reply to: Heikki Linnakangas (#75)
#82Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#66)
#83Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#82)
#84Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#83)
#85Jeff Davis
pgsql@j-davis.com
In reply to: Heikki Linnakangas (#84)
#86Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#85)
#87Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#86)
#88Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#87)
#89Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#87)
#90Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#89)
#91Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jeff Davis (#87)
#92Jeff Davis
pgsql@j-davis.com
In reply to: Heikki Linnakangas (#91)
#93Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jeff Davis (#92)