is sync rep stalled?
So we've got two patches that implement synchronous replication, and
no agreement on which one, if either, should be committed. We have no
agreement on how synchronous replication should be configured, and at
most a tenuous agreement that it should involve standby registration.
This is bad.
This feature is important, and we need to get it done. How do we get
the ball rolling again?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
So we've got two patches that implement synchronous replication, and
no agreement on which one, if either, should be committed. We have no
agreement on how synchronous replication should be configured, and at
most a tenuous agreement that it should involve standby registration.
This is bad.
This feature is important, and we need to get it done. How do we get
the ball rolling again?
ISTM that it still takes long to make consensus on standby registration.
So, how about putting the per-standby parameters in recovery.conf, and
focusing on the basic features in synchronous replication at first?
During that time, we can deepen discussion on standby registration, and
then we can implement that.
The basic features that I mean is for most basic use case, that is, one
master and one synchronous standby case. In detail,
* Support multiple standbys with various synchronization levels.
Not required for that case.
* What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.
The wait-forever option is not required for that case. Let's implement
the return-immediately at first.
* Per-transaction control. Some transactions are important, others are not.
Not required for that case.
* Quorum commit. Wait until n standbys acknowledge. n=1 and n=all servers can be seen as important special cases of this.
Not required for that case.
* async, recv, fsync and replay levels of synchronization.
At least one of three synchronous levels should be included in the first
commit. I think that either recv or fsync is suitable for first try
because those don't require wake-up signaling from startup process to
walreceiver and are relatively easy to implement.
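As a rough illustration of the four levels named in the last item, here is a minimal Python sketch of when the master could let a commit return under each level. The names `SyncLevel`, `StandbyProgress` and `commit_may_return`, and the use of plain integers for WAL positions, are assumptions made for this example; they are not taken from either patch.

```python
from dataclasses import dataclass
from enum import IntEnum


class SyncLevel(IntEnum):
    """Hypothetical names for the four levels discussed in the thread."""
    ASYNC = 0   # master does not wait for the standby at all
    RECV = 1    # standby has received the WAL
    FSYNC = 2   # standby has flushed the WAL to disk
    REPLAY = 3  # standby has applied (replayed) the WAL


@dataclass
class StandbyProgress:
    """WAL positions reported by one standby (illustrative integers)."""
    received: int = 0
    fsynced: int = 0
    replayed: int = 0


def commit_may_return(level, commit_lsn, progress):
    """Return True if a commit at commit_lsn may be acknowledged to the client."""
    if level == SyncLevel.ASYNC:
        return True
    if level == SyncLevel.RECV:
        return progress.received >= commit_lsn
    if level == SyncLevel.FSYNC:
        return progress.fsynced >= commit_lsn
    # REPLAY: depends on a position that only the replaying process knows
    return progress.replayed >= commit_lsn
```

In this sketch only the replay level depends on a position advanced by the process that applies WAL, which matches the point above that recv and fsync need no extra signaling from the startup process.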
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Sep 29, 2010 at 3:56 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
So we've got two patches that implement synchronous replication, and
no agreement on which one, if either, should be committed. We have no
agreement on how synchronous replication should be configured, and at
most a tenuous agreement that it should involve standby registration.
This is bad.
This feature is important, and we need to get it done. How do we get
the ball rolling again?
ISTM that it still takes long to make consensus on standby registration.
So, how about putting the per-standby parameters in recovery.conf, and
focusing on the basic features in synchronous replication at first?
During that time, we can deepen discussion on standby registration, and
then we can implement that.
The basic features that I mean is for most basic use case, that is, one
master and one synchronous standby case. In detail,
* Support multiple standbys with various synchronization levels.
Not required for that case.
* What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.
The wait-forever option is not required for that case. Let's implement
the return-immediately at first.
* Per-transaction control. Some transactions are important, others are not.
Not required for that case.
* Quorum commit. Wait until n standbys acknowledge. n=1 and n=all servers can be seen as important special cases of this.
Not required for that case.
* async, recv, fsync and replay levels of synchronization.
At least one of three synchronous levels should be included in the first
commit. I think that either recv or fsync is suitable for first try
because those don't require wake-up signaling from startup process to
walreceiver and are relatively easy to implement.
I'm not sure this really gets us anywhere. We already have two
patches; writing a third one won't fix anything. We need to decide
which patch can be the basis for future work. According to my
understanding, the most significant difference between the patches is
the way that ACKs get sent from standby to master. Whose idea is
better, yours or Simon's? And why? Are there other reasons to prefer
one patch to the other?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On 29.09.2010 10:56, Fujii Masao wrote:
On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas<robertmhaas@gmail.com> wrote:
So we've got two patches that implement synchronous replication, and
no agreement on which one, if either, should be committed. We have no
agreement on how synchronous replication should be configured, and at
most a tenuous agreement that it should involve standby registration.
This is bad.
This feature is important, and we need to get it done. How do we get
the ball rolling again?
Agreed. Actually, given the lack of people jumping in and telling us
what they'd like to do with the feature, maybe it's not that important
after all.
ISTM that it still takes long to make consensus on standby registration.
So, how about putting the per-standby parameters in recovery.conf, and
focusing on the basic features in synchronous replication at first?
During that time, we can deepen discussion on standby registration, and
then we can implement that.
The basic features that I mean is for most basic use case, that is, one
master and one synchronous standby case. In detail,
ISTM the problem is exactly that there is no consensus on what the basic
use case is. I'm sure there's several things you can accomplish with
synchronous replication, perhaps you could describe what the important
use case for you is?
* Support multiple standbys with various synchronization levels.
Not required for that case.
IMHO at least we'll still need to support asynchronous standbys in the
same mix, that's an existing feature.
* What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.
The wait-forever option is not required for that case. Let's implement
the return-immediately at first.
...
* async, recv, fsync and replay levels of synchronization.
At least one of three synchronous levels should be included in the first
commit. I think that either recv or fsync is suitable for first try
because those don't require wake-up signaling from startup process to
walreceiver and are relatively easy to implement.
What is the use case for that combination? For zero data loss, you
*must* wait forever if a standby isn't connected. For keeping a hot
standby server up-to-date so that you can freely query the standby
instead of the master, you need replay level synchronization.
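The trade-off between the two disconnect behaviours can be made concrete with a small Python sketch. It is illustrative only: `DisconnectPolicy`, `commit_outcome` and the three outcome strings are hypothetical names, and the model ignores timeouts and multiple standbys.

```python
from enum import Enum


class DisconnectPolicy(Enum):
    """Hypothetical names for the two behaviours discussed above."""
    RETURN_IMMEDIATELY = "return-immediately"
    WAIT_FOREVER = "wait-forever"


def commit_outcome(policy, standby_connected, standby_acked):
    """Classify one commit on the master.

    Returns one of:
      "committed"             - the standby acknowledged the commit record
      "committed-unprotected" - client sees success, but only the master has it
      "blocked"               - the commit keeps waiting for an acknowledgement
    """
    if standby_connected and standby_acked:
        return "committed"
    if not standby_connected and policy is DisconnectPolicy.RETURN_IMMEDIATELY:
        return "committed-unprotected"
    return "blocked"
```

Under these assumptions, the "committed-unprotected" outcome can only occur with return-immediately, which is why that policy cannot by itself guarantee zero data loss.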
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, 2010-09-30 at 09:09 +0300, Heikki Linnakangas wrote:
On 29.09.2010 10:56, Fujii Masao wrote:
On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas<robertmhaas@gmail.com> wrote:
This feature is important, and we need to get it done. How do we get
the ball rolling again?
Agreed. Actually, given the lack of people jumping in and telling us
what they'd like to do with the feature, maybe it's not that important
after all.
I don't see anything has stalled. I've been busy for a few days, so
haven't had a chance to follow up on the use cases, as suggested. I'm
busy again today, so cannot reply further. Anyway, taking a few days to
let us think some more about the technical comments is no bad thing.
I think we need to relax about this feature some more because trying to
get something actually done when basic issues need analysis is hard and
that creates tension. Between us we can work out the code in a few days,
once we know which code to write.
What we actually need to do is talk and listen. I'd like to suggest that
we have an online "focus day" (onlist) on Sync Rep on Oct 5 and maybe 6
as well? Meeting in person is possible, but probably impractical. But a
design sprint, not a code sprint.
This is important and I'm sure we'll work something out.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
On Thu, 2010-09-30 at 09:09 +0300, Heikki Linnakangas wrote:
On 29.09.2010 10:56, Fujii Masao wrote:
On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas<robertmhaas@gmail.com> wrote:
This feature is important, and we need to get it done. How do
we get the ball rolling again?
Agreed. Actually, given the lack of people jumping in and telling
us what they'd like to do with the feature, maybe it's not that
important after all.
I don't see anything has stalled.
I do. We're half way through this commitfest, so if no one's actually
ready to commit one of the patches, I kinda have to bounce them both,
at least to the next CF.
The very likely outcome of that, given that it's a pretty enormous
feature that involves even more enormous amounts of testing on various
hardware, networks, etc., is that we don't get SR in 9.1, and you
among others will be very unhappy.
So yes, it is stalled, and yes, there's a real urgency to actually
getting a baseline something in there in the next couple of weeks.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes:
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
I don't see anything has stalled.
I do. We're half way through this commitfest, so if no one's actually
ready to commit one of the patches, I kinda have to bounce them both,
at least to the next CF.
[ raised eyebrow ] You seem to be in an awfully big hurry to bounce
stuff. The CF end is still two weeks away.
But while I'm thinking about that...
The actual facts on the ground are that practically no CF work has
gotten done yet (at least not in my house) due to the git move and the
9.0.0 release and the upcoming back-branch releases. Maybe we shouldn't
have started the CF while all that was going on, but that's water over
the dam now. What we can do is rethink the scheduled end date. IMHO
we should push out the end date by at least a week to reflect the lack
of time spent on the CF so far.
regards, tom lane
On Thu, Sep 30, 2010 at 2:09 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Agreed. Actually, given the lack of people jumping in and telling us what
they'd like to do with the feature, maybe it's not that important after all.
The basic features that I mean is for most basic use case, that is, one
master and one synchronous standby case. In detail,
ISTM the problem is exactly that there is no consensus on what the basic use
case is. I'm sure there's several things you can accomplish with synchronous
replication, perhaps you could describe what the important use case for you
is?
OK, So I'll throw in my ideal use case. I'm starting to play with
Magnus's "streaming -> archive".
*that's* what I want, with synchronous. Yes, again, I'm looking for
"data durability", not "server query-ability", and I'd like to rely
on the PG user-space side of things instead of praying that replicated
block-devices hold together....
If my master flips out, I'm quite happy to do a normal archive
restore. Except I don't want that last 16MB (or archive timeout) of
transactions lost. The streaming -> archive in its current state
gets me pretty close, but I'd love to be able to guarantee that my
recovery from that archive has *every* transaction that the master
committed...
a.
On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
David Fetter <david@fetter.org> writes:
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
I don't see anything has stalled.
I do. We're half way through this commitfest, so if no one's
actually ready to commit one of the patches, I kinda have to
bounce them both, at least to the next CF.
[ raised eyebrow ] You seem to be in an awfully big hurry to bounce
stuff. The CF end is still two weeks away.
If people are still wrangling over the design, I'd say two weeks is
a ludicrously short time, not a long one.
But while I'm thinking about that...
The actual facts on the ground are that practically no CF work has
gotten done yet (at least not in my house)
Your non-involvement in the first half or more--I'd say maybe 3 weeks
or so--is precisely what commitfests are for. The point is that
people who are *not* committers need to do a bunch of QA on patches,
review them, get or create new patches as needed. Only then should a
committer get involved.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Aidan Van Dyk <aidan@highrise.ca> wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
I'm sure there's several things you can accomplish with
synchronous replication, perhaps you could describe what the
important use case for you is?
I'm looking for "data durability", not "server query-ability"
Same here. If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database. We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.
-Kevin
On 30.09.2010 17:09, Kevin Grittner wrote:
Aidan Van Dyk<aidan@highrise.ca> wrote:
Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote:
I'm sure there's several things you can accomplish with
synchronous replication, perhaps you could describe what the
important use case for you is?
I'm looking for "data durability", not "server query-ability"
Same here. If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database. We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.
You do realize that to be able to guarantee zero data loss, the master
will have to stop committing new transactions if the streaming stops for
any reason, like a network glitch. Maybe that's a tradeoff you want, but
I'm asking because that point isn't clear to many people.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
You do realize that to be able to guarantee zero data loss, the
master will have to stop committing new transactions if the
streaming stops for any reason, like a network glitch. Maybe
that's a tradeoff you want, but I'm asking because that point
isn't clear to many people.
Yeah, I get that. I do think the quorum approach or some simplified
special case of it would be important for us -- possibly even a
requirement -- for that reason.
-Kevin
Heikki Linnakangas wrote:
On 30.09.2010 17:09, Kevin Grittner wrote:
Aidan Van Dyk<aidan@highrise.ca> wrote:
Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote:
I'm sure there's several things you can accomplish with
synchronous replication, perhaps you could describe what the
important use case for you is?
I'm looking for "data durability", not "server query-ability"
Same here. If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database. We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.
You do realize that to be able to guarantee zero data loss, the master
will have to stop committing new transactions if the streaming stops
for any reason, like a network glitch. Maybe that's a tradeoff you
want, but I'm asking because that point isn't clear to many people.
If there's a network glitch, it'd probably affect networked client
connections as well, so it would mean no extra degradation of service.
-- Yeb
On Thu, 2010-09-30 at 07:06 -0700, David Fetter wrote:
On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
David Fetter <david@fetter.org> writes:
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
I don't see anything has stalled.
I do. We're half way through this commitfest, so if no one's
actually ready to commit one of the patches, I kinda have to
bounce them both, at least to the next CF.
[ raised eyebrow ] You seem to be in an awfully big hurry to bounce
stuff. The CF end is still two weeks away.
If people are still wrangling over the design, I'd say two weeks is
a ludicrously short time, not a long one.
Yes, there is design work still to do.
What purpose would be served by "bouncing" these patches?
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
On Thu, Sep 30, 2010 at 12:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, 2010-09-30 at 07:06 -0700, David Fetter wrote:
On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
David Fetter <david@fetter.org> writes:
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
I don't see anything has stalled.
I do. We're half way through this commitfest, so if no one's
actually ready to commit one of the patches, I kinda have to
bounce them both, at least to the next CF.
[ raised eyebrow ] You seem to be in an awfully big hurry to bounce
stuff. The CF end is still two weeks away.
If people are still wrangling over the design, I'd say two weeks is
a ludicrously short time, not a long one.
Yes, there is design work still to do.
What purpose would be served by "bouncing" these patches?
None whatsoever, IMHO. That having been said, I would like to see us
make some forward progress. I'm open to your ideas expressed
up-thread, but I'm not sure whether they'll be sufficient to resolve
the problem. Seems worth a try, though.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Thu, Sep 30, 2010 at 3:09 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
* Support multiple standbys with various synchronization levels.
Not required for that case.
IMHO at least we'll still need to support asynchronous standbys in the same
mix, that's an existing feature.
My intention is to commit the core part of synchronous replication (which
would be used for every use case) at first. Then we can implement the
feature for each use case.
I agree that 9.1 should support asynchronous standbys in the same mix, but
this seems to be an extended feature rather than part of the core.
* What happens if a synchronous standby isn't connected at the moment?
Return immediately vs. wait forever.
The wait-forever option is not required for that case. Let's implement
the return-immediately at first.
...
* async, recv, fsync and replay levels of synchronization.
At least one of three synchronous levels should be included in the first
commit. I think that either recv or fsync is suitable for first try
because those don't require wake-up signaling from startup process to
walreceiver and are relatively easy to implement.
What is the use case for that combination? For zero data loss, you *must*
wait forever if a standby isn't connected. For keeping a hot standby server
up-to-date so that you can freely query the standby instead of the master,
you need replay level synchronization.
For high availability, and zero data loss unless the disk on one of master
and standby gets corrupted after the other goes down. It's the same use case
that cluster with shared disk covers.
I proposed to implement the "return-immediately" at first because it doesn't
require standby registration. But if many people think that the "wait-forever"
is the core rather than the "return-immediately", I'll follow them. We can
implement the "return-immediately" after that.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Fri, Oct 01, 2010 at 07:48:25PM +0900, Fujii Masao wrote:
I proposed to implement the "return-immediately" at first because it
doesn't require standby registration. But if many people think that
the "wait-forever" is the core rather than the "return-immediately",
I'll follow them. We can implement the "return-immediately" after
that.
In my experience, most people who want "synchronous" behavior are
willing to put up with "wait forever," especially when asynchronous
behavior is already available.
In short, +1 for "push 'wait forever' soonest."
Anybody who's got a Secret Base, Hidden in a Hollowed-Out Mountain,
Making Grand Plans While Stroking a Long-Haired Cat[1], should please
update their public repository, or create a public repository if it
doesn't already exist, and in either case keep it current.
Cheers,
David
[1]: While the Hollowed-Out Mountain trick worked back in the 60s,
it's gotten a little trite. The cool kids are keeping things pretty
public these days when they plan to go public.
--
David Fetter <david@fetter.org> http://fetter.org/
Fujii Masao <masao.fujii@gmail.com> writes:
I proposed to implement the "return-immediately" at first because it doesn't
require standby registration. But if many people think that the "wait-forever"
is the core rather than the "return-immediately", I'll follow them. We can
implement the "return-immediately" after that.
Wait forever can be done without standby registration, with quorum commit.
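One way to read this: with quorum commit the master only has to count acknowledgements from whichever standbys are connected, not know them by name. A minimal Python sketch of that check follows; `quorum_reached` and its parameters are hypothetical, and WAL positions are modelled as plain integers.

```python
def quorum_reached(acked_lsns, commit_lsn, quorum):
    """Return True once at least `quorum` standbys have acknowledged commit_lsn.

    acked_lsns holds the latest WAL position acknowledged by each standby
    that is currently connected; no list of registered standbys is needed.
    """
    acks = sum(1 for lsn in acked_lsns if lsn >= commit_lsn)
    return acks >= quorum
```

In this sketch, with `quorum=1` and no standby connected the check never succeeds, so a commit that loops on it waits forever without the master holding any registry of standbys.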
--
dim
On 09/30/2010 10:52 PM, Tom Lane wrote:
IMHO
we should push out the end date by at least a week to reflect the lack
of time spent on the CF so far.
I agree that we should postpone the end of the CF by one week to deal
with the distractions people have had.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
What we actually need to do is talk and listen. I'd like to suggest that
we have an online "focus day" (onlist) on Sync Rep on Oct 5 and maybe 6
as well? Meeting in person is possible, but probably impractical. But a
design sprint, not a code sprint.
I'd suggest something even simpler:
(1) Create a wiki page which lists all of the design decisions we need
to make in order to finish the specification for synch rep.
(2) Link each item to any prior discussion we've had about the item.
(3) Invite people to comment on the wiki by leaving per-item comments
and suggestions with their own names.
I believe that right now only a handful of people (Simon, Heikki, Fujii,
Zoltan) are really acquainted with all of the decisions which need to be
made. No wonder the rest of us fly off on minutiae like file formats; we
really have no sense of scope.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com