Streaming replication and postmaster signaling
Looking at the latest streaming replication patch, I don't much like the
signaling between WAL sender and postmaster. It seems complicated, and
as a rule of thumb postmaster shouldn't be accessing shared memory. The
current signaling is:
1. A new connection arrives. A new backend process is forked, just as
for a normal connection.
2. When the new process is done with the initialization, it allocates
itself a slot from WalSndCtlData shared memory array. It marks its pid
there, sets registered = false, and signals postmaster with
PMSIGNAL_REGISTER_WALSENDER.
3. Upon receiving that signal, postmaster scans the WalSndCtlData array
looking for entries with registered==false. For such entries, it scans
the postmaster-private backend list for a matching entry with the same
pid, marks the entry in the list as a walsender, and sets
registered=true in the shared memory entry.
This way postmaster knows which child processes are walsenders, when
it's time to signal them.
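The three steps above can be sketched roughly as follows. This is only an illustrative model, not the code from the patch: the struct layouts, array sizes, and the function names walsender_claim_slot and postmaster_register_walsenders are assumptions made for the example, and plain static arrays stand in for shared memory and for the postmaster-private backend list.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WALSENDERS 4   /* hypothetical sizes, for illustration only */
#define MAX_BACKENDS   8

/* Stand-in for one WalSndCtlData entry; pid 0 means the slot is free. */
typedef struct { int pid; bool registered; } WalSndSlot;
/* Stand-in for one postmaster-private backend list entry. */
typedef struct { int pid; bool is_walsender; } BackendEntry;

static WalSndSlot   wal_snd_ctl[MAX_WALSENDERS];   /* models shared memory */
static BackendEntry backend_list[MAX_BACKENDS];    /* models postmaster state */

/* Step 2 (child side): claim a free slot and mark it unregistered.
 * The real process would then signal postmaster. Returns -1 if full. */
static int walsender_claim_slot(int pid)
{
    assert(pid > 0);
    for (int i = 0; i < MAX_WALSENDERS; i++) {
        if (wal_snd_ctl[i].pid == 0) {
            wal_snd_ctl[i].pid = pid;
            wal_snd_ctl[i].registered = false;
            return i;
        }
    }
    return -1;
}

/* Step 3 (postmaster side): scan for unregistered slots, match each one
 * by pid against the backend list, and mark both sides. Returns the
 * number of walsenders newly registered. */
static int postmaster_register_walsenders(void)
{
    int marked = 0;
    for (int i = 0; i < MAX_WALSENDERS; i++) {
        if (wal_snd_ctl[i].pid == 0 || wal_snd_ctl[i].registered)
            continue;
        for (int j = 0; j < MAX_BACKENDS; j++) {
            if (backend_list[j].pid == wal_snd_ctl[i].pid) {
                backend_list[j].is_walsender = true;
                wal_snd_ctl[i].registered = true;
                marked++;
                break;
            }
        }
    }
    return marked;
}
```

Note the nested scan in the postmaster-side function: postmaster both reads and writes the shared array, which is the part I find uncomfortable.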
I think it would be better to utilize the existing array of child
processes in pmsignal.c. Instead of having postmaster peek into
WalSndCtlData, let's add a new state to PMChildFlags,
PM_CHILD_WALSENDER, which is just like PM_CHILD_ACTIVE but tells
postmaster that the child is not a normal backend but a walsender.
I've done that in my git branch.
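For readers without the branch at hand, here is a minimal sketch of the idea. It is not the code in the branch: only the names PMChildFlags, PM_CHILD_ACTIVE and PM_CHILD_WALSENDER come from the description above, while the other two states, the slot count, and all function names are assumptions made for illustration. The per-child flags are modeled as a plain static array.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

#define NUM_CHILD_SLOTS 8   /* hypothetical size, for illustration only */

/* Per-child states. PM_CHILD_UNUSED and PM_CHILD_ASSIGNED are assumed
 * here; PM_CHILD_WALSENDER is the proposed new state. */
typedef enum {
    PM_CHILD_UNUSED = 0,
    PM_CHILD_ASSIGNED,
    PM_CHILD_ACTIVE,
    PM_CHILD_WALSENDER
} ChildSlotState;

/* Models the existing per-child flag array kept in pmsignal.c. */
static volatile sig_atomic_t PMChildFlags[NUM_CHILD_SLOTS];

/* Postmaster: reserve a slot before forking. Returns -1 if none is free. */
static int assign_child_slot(void)
{
    for (int i = 0; i < NUM_CHILD_SLOTS; i++) {
        if (PMChildFlags[i] == PM_CHILD_UNUSED) {
            PMChildFlags[i] = PM_CHILD_ASSIGNED;
            return i;
        }
    }
    return -1;
}

/* Child: announce that it has attached and is running. */
static void mark_child_active(int slot)
{
    assert(slot >= 0 && slot < NUM_CHILD_SLOTS);
    assert(PMChildFlags[slot] == PM_CHILD_ASSIGNED);
    PMChildFlags[slot] = PM_CHILD_ACTIVE;
}

/* Child: announce that it is a walsender rather than a normal backend. */
static void mark_child_walsender(int slot)
{
    assert(slot >= 0 && slot < NUM_CHILD_SLOTS);
    assert(PMChildFlags[slot] == PM_CHILD_ACTIVE);
    PMChildFlags[slot] = PM_CHILD_WALSENDER;
}

/* Postmaster: a single flag read per child, with no scan of
 * WalSndCtlData and no pid matching. */
static bool child_is_walsender(int slot)
{
    assert(slot >= 0 && slot < NUM_CHILD_SLOTS);
    return PMChildFlags[slot] == PM_CHILD_WALSENDER;
}
```

In this model the child is the only writer of its own flag after startup, and postmaster only reads it when deciding whom to signal.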
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote:
Looking at the latest streaming replication patch, I don't much like the
signaling between WAL sender and postmaster. It seems complicated, and
as a rule of thumb postmaster shouldn't be accessing shared memory. The
current signaling is:
1. A new connection arrives. A new backend process is forked, just as
for a normal connection.
This was probably discussed to death earlier, but: why was it decided to
not simply use a different port for listening for walsender
connections?
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Tue, Jan 5, 2010 at 11:07 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
I think it would be better to utilize the existing array of child
processes in pmsignal.c. Instead of having postmaster peek into
WalSndCtlData, let's add a new state to PMChildFlags,
PM_CHILD_WALSENDER, which is just like PM_CHILD_ACTIVE but tells
postmaster that the child is not a normal backend but a walsender.
Seems good.
I've done that in my git branch.
Could you push that git branch to a public place?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, Jan 5, 2010 at 11:29 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
This was probably discussed to death earlier, but: why was it decided to
not simply use a different port for listening for walsender
connections?
I believe that using a different port would make the setup
of replication messier; look for the unused port number,
open that port for replication in the firewall, etc.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao wrote:
I've done that in my git branch.
Could you push that git branch to a public place?
Ahh, sorry, forgot that again. It's there now, at
git://git.postgresql.org/git/users/heikki/postgres.git, branch
'replication'.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Fujii Masao wrote:
On Tue, Jan 5, 2010 at 11:29 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
This was probably discussed to death earlier, but: why was it decided to
not simply use a different port for listening for walsender
connections?
I believe that using a different port would make the setup
of replication messier; look for the unused port number,
open that port for replication in the firewall, etc.
Actually, being able to firewall walsender traffic separately might be
rather handy.
Having to assign a different port wouldn't be fun for packagers, though,
especially those (like the Debian-derived Linux distros) who already try
to support more than one Pg version installed in parallel.
--
Craig Ringer
Craig Ringer <craig@postnewspapers.com.au> writes:
Fujii Masao wrote:
On Tue, Jan 5, 2010 at 11:29 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
This was probably discussed to death earlier, but: why was it decided to
not simply use a different port for listening for walsender
connections?
I believe that using a different port would make the setup
of replication messier; look for the unused port number,
open that port for replication in the firewall, etc.
Actually, being able to firewall walsender traffic separately might be
rather handy.
Having to assign a different port wouldn't be fun for packagers, though,
Well, we'd have to get a port number officially assigned by IANA.
I tend to agree that the management overhead of a second port isn't
worth it.
regards, tom lane
On Wed, Jan 6, 2010 at 3:03 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Fujii Masao wrote:
I've done that in my git branch.
Could you push that git branch to a public place?
Ahh, sorry, forgot that again. It's there now, at
git://git.postgresql.org/git/users/heikki/postgres.git, branch
'replication'.
I'm feeling like we're running out of time to get this committed.
Committing large patches late in the release cycle is a recipe for a
buggy beta, possibly a long beta, and a buggy release, and we're now
down to 8 days before the start of the final CommitFest, after which
our schedule indicates that we expect to put out an alpha and a beta
relatively quickly. If this isn't ready to go, maybe we need to
postpone it to 8.6. We've already had a bunch of bug reports (some of
which have been fixed) as a result of HS, and I don't see any reason
to believe that this isn't going to have the same problem.
Personally, I would rather have a release without SR in June or July
than a release with SR in August or September. We already have too
many good features in the tree to hold up the whole process for
patches that aren't ready yet - though like everyone else, I think
this is a killer feature.
Thoughts?
...Robert
On Thu, 2010-01-07 at 11:55 -0500, Robert Haas wrote:
Personally, I would rather have a release without SR in June or July
than a release with SR in August or September.
If SR will be ready by then, I'd like to see a release in September
which has SR in it. We already postponed SR a lot. Many advocacy
people, including me, have already mentioned SR, and many people are
looking forward to it. BTW, July probably won't be a good time for a new
release, because of people's holidays.
...and maybe then we can start 8.5 -> 9.0 thread.
Regards,
--
Devrim GÜNDÜZ, RHCE
Command Prompt - http://www.CommandPrompt.com
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
2010/1/7 Devrim GÜNDÜZ <devrim@gunduz.org>:
On Thu, 2010-01-07 at 11:55 -0500, Robert Haas wrote:
Personally, I would rather have a release without SR in June or July
than a release with SR in August or September.
June, yes. July, frankly, no, because July == September, when it comes
to any such scheduling. At least in the countries where my clients are
:)
If SR will be ready by then, I'd like to see a release in September
which has SR in it. We already postponed SR a lot. Many advocacy
people, including me, have already mentioned SR, and many people are
looking forward to it. BTW, July probably won't be a good time for a new
release, because of people's holidays.
-1. Frankly, if advocacy people said it would be there, they didn't
tell the truth, and that's their problem. If they said "hopefully it
will be there, but we don't know yet", then they don't have a problem
either way.
Not having our release schedule driven by marketing is a *strength* of
our project!
We made the mistake last time to delay the release significantly for a
single feature. It turned out said feature didn't make it *anyway*.
Let's not repeat that mistake.
...and maybe then we can start 8.5 -> 9.0 thread.
....
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Thursday 07 January 2010 18:10:43 Magnus Hagander wrote:
Not having our release schedule driven by marketing is a *strength* of
our project!
Yes.
We made the mistake last time to delay the release significantly for a
single feature. It turned out said feature didn't make it *anyway*.
Let's not repeat that mistake.
I would consider SR to be significantly less complex than HS though.
What about giving it two weeks from now to be in a committable state? Last
time the main discussion started a good while *after* the last commitfest...
Andres
2010/1/7 Magnus Hagander <magnus@hagander.net>:
2010/1/7 Devrim GÜNDÜZ <devrim@gunduz.org>:
On Thu, 2010-01-07 at 11:55 -0500, Robert Haas wrote:
Personally, I would rather have a release without SR in June or July
than a release with SR in August or September.
June, yes. July, frankly, no, because July == September, when it comes
to any such scheduling. At least in the countries where my clients are
:)
In terms of when the release comes out, maybe. In terms of the NEXT
release, it still matters. If the release is delayed, the first
CommitFest of the next release will be that much later. If we put out
a release by July 1 of this year, we can repeat the same schedule for
the next release that we are using for this release and I will be
happy with that. If we don't put out a release until September, our
first CommitFest will be at least 2 months later than it was for the
last one, which means that (1) we will have a gap of 8 months without
a CommitFest and (2) 8.6 will have no chance of coming out before
September 2011, and may end up being more like Thanksgiving if that
one also slips.
I really don't want to go 8 months with no CommitFest. That leads to
too many patches in the queue, too many merge conflicts, too many
patch authors who just plain give up, and no feedback to anyone for a
very, very long time.
If SR will be ready by then, I'd like to see a release in September
which has SR in it. We already postponed SR a lot. Many advocacy
people, including me, have already mentioned SR, and many people are
looking forward to it. BTW, July probably won't be a good time for a new
release, because of people's holidays.
-1. Frankly, if advocacy people said it would be there, they didn't
tell the truth, and that's their problem. If they said "hopefully it
will be there, but we don't know yet", then they don't have a problem
either way.
Not having our release schedule driven by marketing is a *strength* of
our project!
We made the mistake last time to delay the release significantly for a
single feature. It turned out said feature didn't make it *anyway*.
Let's not repeat that mistake.
Indeed.
...Robert
Magnus Hagander <magnus@hagander.net> writes:
We made the mistake last time to delay the release significantly for a
single feature. It turned out said feature didn't make it *anyway*.
Let's not repeat that mistake.
Yeah, we've certainly learned that lesson often enough, or should I say
failed to learn that lesson?
However, HS is already in the tree, and HS without SR is a whole lot
less compelling than HS with SR. So it's going to be pretty
unsatisfying if we can't get SR in there.
I read Robert's original question not so much as a proposal to slip the
schedule to accommodate SR as a question about whether SR could still
meet the current schedule. I think we ought to get that answered before
we start debating schedule changes.
regards, tom lane
However, HS is already in the tree, and HS without SR is a whole lot
less compelling than HS with SR. So it's going to be pretty
unsatisfying if we can't get SR in there.
I don't think that's the case. Having HS alone would be a huge win,
and the sooner we can get it out there the better. Those that are
waiting for SR might have to wait one more version, but my intuition
tells me that's a small minority compared to those waiting for HS.
--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 201001071231
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Greg Sabino Mullane wrote:
However, HS is already in the tree, and HS without SR is a whole lot
less compelling than HS with SR. So it's going to be pretty
unsatisfying if we can't get SR in there.
I don't think that's the case. Having HS alone would be a huge win,
and the sooner we can get it out there the better. Those that are
waiting for SR might have to wait one more version, but my intuition
tells me that's a small minority compared to those waiting for HS.
While I agree that HS is very useful without SR, I think that it's
mostly the well-known power users in the community who are actively waiting
for HS and not so much for SR. For the typical user outside of -hackers
or even -general I'm not so sure about that...
Stefan
On Thu, Jan 7, 2010 at 12:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
We made the mistake last time to delay the release significantly for a
single feature. It turned out said feature didn't make it *anyway*.
Let's not repeat that mistake.
Yeah, we've certainly learned that lesson often enough, or should I say
failed to learn that lesson?
I think the latter phrasing is more accurate.
However, HS is already in the tree, and HS without SR is a whole lot
less compelling than HS with SR. So it's going to be pretty
unsatisfying if we can't get SR in there.
I read Robert's original question not so much as a proposal to slip the
schedule to accommodate SR as a question about whether SR could still
meet the current schedule. I think we ought to get that answered before
we start debating schedule changes.
Unfortunately, we've also discovered from hard experience that the
timing of commits is difficult to predict unless the answer is
something like "today" or "tomorrow". I'm not terribly interested in
an estimate of when this will be committed if it's much more distant
than that because experience indicates that such estimates are
typically inaccurate, usually on the optimistic side. I seem to
recall Heikki estimating two weeks for SR about this time last year,
and of course it took a lot longer than that, even if you subtract out
the breaks in the action. That's not because Heikki is a bad
estimator; it's just that estimating how long a particular piece of
code will take to finish is extremely difficult and almost no one can
do it with any degree of accuracy. It is the things the programmer
can't foresee that push out the end date, and of course you can't know
how many of those there will be.
I like Andres' suggestion upthread of setting a deadline and
determining to bounce the patch if it's not committed by that date.
If it turns out we have to bounce it, that stinks, but I don't think
it makes sense to go to beta with a huge, barely-tested pile of code
in the tree. Not that the testing Heikki and Fujii Masao have been
doing until now hasn't been good, but it's not nearly as rigorous as
what we will get when all of our users start banging on it.
The problem with even TALKING about changing the schedule is that we
will have no idea what to change it TO. If we add two months to the
schedule today, that will probably increase the chances of SR getting
committed within that time frame (unless, of course, Heikki's employer
uses that as an excuse to take him off the project for two months...)
but we don't know how much because we can't predict how long it's
going to take to be ready. If someone could show us a curve with
probability on one axis and commit date on the other axis we could
probably make a good decision about where to slice it off, but that
isn't possible.
...Robert
While I agree that HS is very useful without SR, I think that it's
mostly the well-known power users in the community who are actively waiting
for HS and not so much for SR. For the typical user outside of -hackers
or even -general I'm not so sure about that...
Well, I can state that we have plenty of clients that would be very
interested in HS, but none that would really care if it came without
SR. This power user knows a lot of people outside of -hackers and
-general and they are what I'm basing my opinion on. :)
--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201001071303
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Robert Haas <robertmhaas@gmail.com> writes:
I like Andres' suggestion upthread of setting a deadline and
determining to bounce the patch if it's not committed by that date.
If it turns out we have to bounce it, that stinks, but I don't think
it makes sense to go to beta with a huge, barely-tested pile of code
in the tree. Not that the testing Heikki and Fujii Masao have been
doing until now hasn't been good, but it's not nearly as rigorous as
what we will get when all of our users start banging on it.
This argument would hold more water if there weren't *already* a huge,
barely-tested pile of code in the tree, namely HS. If you think that's
anywhere near ready to go to beta, I'm afraid I'd better disillusion
you immediately.
regards, tom lane
On Thu, Jan 7, 2010 at 1:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I like Andres' suggestion upthread of setting a deadline and
determining to bounce the patch if it's not committed by that date.
If it turns out we have to bounce it, that stinks, but I don't think
it makes sense to go to beta with a huge, barely-tested pile of code
in the tree. Not that the testing Heikki and Fujii Masao have been
doing until now hasn't been good, but it's not nearly as rigorous as
what we will get when all of our users start banging on it.
This argument would hold more water if there weren't *already* a huge,
barely-tested pile of code in the tree, namely HS. If you think that's
anywhere near ready to go to beta, I'm afraid I'd better disillusion
you immediately.
That may well be so, but adding another one is not going to improve
the situation even a little bit. I don't think what you're saying
weakens in the slightest the argument that I was making, namely, that
if this isn't committed RSN it should be postponed to 8.6. Do you
disagree?
...Robert
Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I like Andres' suggestion upthread of setting a deadline and
determining to bounce the patch if it's not committed by that date.
If it turns out we have to bounce it, that stinks, but I don't think
it makes sense to go to beta with a huge, barely-tested pile of code
in the tree. Not that the testing Heikki and Fujii Masao have been
doing until now hasn't been good, but it's not nearly as rigorous as
what we will get when all of our users start banging on it.
This argument would hold more water if there weren't *already* a huge,
barely-tested pile of code in the tree, namely HS. If you think that's
anywhere near ready to go to beta, I'm afraid I'd better disillusion
you immediately.
I agree with Tom's analysis. HS is very complex, while SR is more
mechanical. We might find that in the end SR was stable before HS.
I think we should stay on course and see where we are when Heikki is
ready for a commit of SR.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +