Re: standby registration (was: is sync rep stalled?)

Started by Robert Haas over 15 years ago | 46 messages | hackers
#1Robert Haas
robertmhaas@gmail.com

On Mon, Oct 4, 2010 at 3:08 AM, Markus Wanner <markus@bluegap.ch> wrote:

On 10/01/2010 05:06 PM, Dimitri Fontaine wrote:

Wait forever can be done without standby registration, with quorum commit.

Yeah, I also think the only reason for standby registration is ease of
configuration (if at all). There's no technical requirement for standby
registration, AFAICS. Or does anybody know of a realistic use case
that's possible with standby registration, but not with quorum commit?

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

The use case is something like "we want to make sure we've replicated
to at least one of the two servers in the Berlin datacenter and at
least one of the two servers in the Hong Kong datacenter".
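To illustrate the claim: the rule needs A+C and B+D to each reach the threshold while A+B and C+D each stay below it, and both pairs add up to the same total, so no weights can work. The brute-force check below is only a sketch of that argument. The function names and the weight range 0..5 are illustrative assumptions, not part of any proposal in this thread.

```python
from itertools import product

STANDBYS = ("A", "B", "C", "D")

def rule(acked):
    """The requirement from the message: (A || B) && (C || D)."""
    return ("A" in acked or "B" in acked) and ("C" in acked or "D" in acked)

def all_subsets(names):
    """Yield every subset of names as a frozenset."""
    for mask in product((False, True), repeat=len(names)):
        yield frozenset(n for n, keep in zip(names, mask) if keep)

def quorum_matches_rule(weights, threshold):
    """True if 'sum of acked weights >= threshold' agrees with rule() on every subset."""
    return all(
        (sum(weights[n] for n in acked) >= threshold) == rule(acked)
        for acked in all_subsets(STANDBYS)
    )

def search(max_weight=5):
    """Return every (weights, threshold) pair that reproduces the rule exactly."""
    hits = []
    for ws in product(range(max_weight + 1), repeat=len(STANDBYS)):
        weights = dict(zip(STANDBYS, ws))
        for threshold in range(1, sum(ws) + 2):
            if quorum_matches_rule(weights, threshold):
                hits.append((weights, threshold))
    return hits

print(len(search()))  # prints 0: no weighted quorum in this range matches the rule
```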

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#2Markus Wanner
markus@bluegap.ch
In reply to: Robert Haas (#1)

On 10/04/2010 05:20 PM, Robert Haas wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Can the proposed standby registration configuration format cover such a
requirement?

Regards

Markus Wanner

#3Robert Haas
robertmhaas@gmail.com
In reply to: Markus Wanner (#2)

On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus@bluegap.ch> wrote:

On 10/04/2010 05:20 PM, Robert Haas wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Can the proposed standby registration configuration format cover such a
requirement?

Well, if you can name the standbys, there's no reason there couldn't
be a parameter that takes a string that looks pretty much like the
above. There are, of course, some situations that could be handled
more elegantly by quorum commit ("any 3 of 5 available standbys") but
the above is more general and not unreasonably longwinded for
reasonable numbers of standbys.
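As a rough sketch of what evaluating such a string could look like, assuming standbys report a name and the master tracks which names have acknowledged: the grammar (names, `&&`, `||`, parentheses) and the function names here are hypothetical and not a proposed parameter format.

```python
import re

# A token is a standby name, '&&', '||', or a parenthesis.
TOKEN = re.compile(r"\s*(\|\||&&|\(|\)|[A-Za-z_][A-Za-z0-9_]*)")

def tokenize(text):
    """Split a rule string into a list of tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}: {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def evaluate(rule, acked):
    """Evaluate a rule string against the set of standby names that have acked."""
    tokens = tokenize(rule)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "||":
            take("||")
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_atom()
        while peek() == "&&":
            take("&&")
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            take(")")
            return value
        if tok in ("||", "&&", ")"):
            raise ValueError(f"unexpected {tok!r}")
        return tok in acked  # a bare name is true once that standby has acked

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"trailing input: {peek()!r}")
    return result

print(evaluate("(A || B) && (C || D)", {"A", "C"}))  # True
print(evaluate("(A || B) && (C || D)", {"A", "B"}))  # False
```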

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#4David Christensen
david@endpoint.com
In reply to: Robert Haas (#3)

On Oct 4, 2010, at 2:02 PM, Robert Haas wrote:

On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus@bluegap.ch> wrote:

On 10/04/2010 05:20 PM, Robert Haas wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Can the proposed standby registration configuration format cover such a
requirement?

Well, if you can name the standbys, there's no reason there couldn't
be a parameter that takes a string that looks pretty much like the
above. There are, of course, some situations that could be handled
more elegantly by quorum commit ("any 3 of 5 available standbys") but
the above is more general and not unreasonably longwinded for
reasonable numbers of standbys.

Is there any benefit to be had from having standby roles instead of individual names? For instance, you could integrate this into quorum commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and 1 "tokyo" standby from a group of multiple per data center, or even just utilize role sizes of 1 if you wanted individual standbys to be "named" in this fashion. This role could be provided on connect of the standby, and is more-or-less tangential to the specific registration issue.
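A minimal sketch of the per-role counting this suggests, assuming each standby reports a single role when it connects. The role names come from the example above; the `REQUIRED` mapping and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical requirement: how many acks are needed from each role.
REQUIRED = {"reporting": 3, "berlin": 1, "tokyo": 1}

def commit_satisfied(required, acked_roles):
    """acked_roles holds one role name per standby that has acknowledged."""
    counts = Counter(acked_roles)
    return all(counts[role] >= need for role, need in required.items())

acks = ["reporting", "reporting", "reporting", "berlin", "tokyo"]
print(commit_satisfied(REQUIRED, acks))       # True
print(commit_satisfied(REQUIRED, acks[1:]))   # False: only two "reporting" acks

# A role of size 1 behaves like an individually named standby.
print(commit_satisfied({"standby_a": 1}, ["standby_a"]))  # True
```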

Regards,

David
--
David Christensen
End Point Corporation
david@endpoint.com

#5Mike Rylander
mrylander@gmail.com
In reply to: David Christensen (#4)

On Mon, Oct 4, 2010 at 3:25 PM, David Christensen <david@endpoint.com> wrote:

On Oct 4, 2010, at 2:02 PM, Robert Haas wrote:

On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus@bluegap.ch> wrote:

On 10/04/2010 05:20 PM, Robert Haas wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Can the proposed standby registration configuration format cover such a
requirement?

Well, if you can name the standbys, there's no reason there couldn't
be a parameter that takes a string that looks pretty much like the
above.  There are, of course, some situations that could be handled
more elegantly by quorum commit ("any 3 of 5 available standbys") but
the above is more general and not unreasonably longwinded for
reasonable numbers of standbys.

Is there any benefit to be had from having standby roles instead of individual names?  For instance, you could integrate this into quorum commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and 1 "tokyo" standby from a group of multiple per data center, or even just utilize role sizes of 1 if you wanted individual standbys to be "named" in this fashion.  This role could be provided on connect of the standby, and is more-or-less tangential to the specific registration issue.

Big +1 FWIW.

--
Mike Rylander

#6Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#3)

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

If this is the only feature which standby registration is needed for,
has anyone written the code for it yet? Is anyone planning to?

If not, it seems like standby registration is not *required* for 9.1. I
still tend to think it would be nice to have from a DBA perspective, but
we should separate required from "nice to have".

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#7Robert Haas
robertmhaas@gmail.com
In reply to: David Christensen (#4)

On Mon, Oct 4, 2010 at 3:25 PM, David Christensen <david@endpoint.com> wrote:

On Oct 4, 2010, at 2:02 PM, Robert Haas wrote:

On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus@bluegap.ch> wrote:

On 10/04/2010 05:20 PM, Robert Haas wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Can the proposed standby registration configuration format cover such a
requirement?

Well, if you can name the standbys, there's no reason there couldn't
be a parameter that takes a string that looks pretty much like the
above.  There are, of course, some situations that could be handled
more elegantly by quorum commit ("any 3 of 5 available standbys") but
the above is more general and not unreasonably longwinded for
reasonable numbers of standbys.

Is there any benefit to be had from having standby roles instead of individual names?  For instance, you could integrate this into quorum commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and 1 "tokyo" standby from a group of multiple per data center, or even just utilize role sizes of 1 if you wanted individual standbys to be "named" in this fashion.  This role could be provided on connect of the standby, and is more-or-less tangential to the specific registration issue.

It's possible to construct a commit rule that is sufficiently complex
that this can't handle it, but it has to be pretty hairy. The
simplest example I can think of is A || ((B || C) && (D || E)). And
you could even handle that if you allow standbys to belong to multiple
roles; in fact, I think you can handle arbitrary Boolean formulas that
way by converting to conjunctive normal form. The use cases for such
complex formulas are fairly thin, though, so I'm not sure that's a
very compelling argument one way or the other. I think in the end
this is not much different from standby registration; you still have
registrations, they just represent groups of machines instead of
single machines.
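To make the conjunctive-normal-form point concrete: A || ((B || C) && (D || E)) distributes to (A || B || C) && (A || D || E), so two roles with A belonging to both, and one ack required from each, express the same rule. The sketch below checks this over all 32 ack sets; the role names are made up for illustration.

```python
from itertools import combinations

STANDBYS = ("A", "B", "C", "D", "E")

def original(acked):
    """A || ((B || C) && (D || E))"""
    return "A" in acked or (
        ("B" in acked or "C" in acked) and ("D" in acked or "E" in acked)
    )

# One role per CNF clause; A is a member of both (hypothetical role names).
ROLES = {
    "role_1": {"A", "B", "C"},
    "role_2": {"A", "D", "E"},
}

def via_roles(acked):
    """Commit once at least one member of every role has acked."""
    return all(members & acked for members in ROLES.values())

def equivalent():
    """Compare both formulations on every subset of the five standbys."""
    for size in range(len(STANDBYS) + 1):
        for combo in combinations(STANDBYS, size):
            acked = set(combo)
            if original(acked) != via_roles(acked):
                return False
    return True

print(equivalent())  # True
```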

I think from a reporting point of view it's a little nicer to have
individual registrations rather than group registrations. For
example, you might ask the master which slaves are connected and where
they are in the WAL stream, or something like that, and with
individual standby names that's a bit easier to puzzle out. Of
course, you could have individual standby names (that are only for
identification) and use groups for everything else. That's maybe a
bit more complicated (each slave needs to send the master a
name-for-identification and a group) but it's certainly workable. We
might also in the future have cases where you want to group standbys
in one way for the commit-rule and another way for some other setting,
but I can't think of exactly what other setting you'd be likely to
want to set in a fashion orthogonal from commit rule, and even if we
did think of one, allowing standbys to be members of multiple groups
would solve that problem, too. That feels a bit more complex to me,
but it's not that likely to happen in practice, so it would probably
be OK. So I guess I think individual registrations are a bit cleaner
and likely to lead to slightly fewer knobs over the long term, but
group registrations seem like they could be made to work, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#8Markus Wanner
markus@bluegap.ch
In reply to: Robert Haas (#7)

On 10/04/2010 11:32 PM, Robert Haas wrote:

I think in the end
this is not much different from standby registration; you still have
registrations, they just represent groups of machines instead of
single machines.

Such groups are often easy to represent in CIDR notation, which would
reduce the need for registering every single standby.
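As a sketch of that idea, Python's standard `ipaddress` module can map a connecting standby's address to a group defined by a CIDR block. The group names and the private address ranges below are assumptions chosen purely for illustration.

```python
import ipaddress

# Hypothetical groups; the networks are private ranges picked for the example.
GROUPS = {
    "berlin": ipaddress.ip_network("10.1.0.0/16"),
    "hongkong": ipaddress.ip_network("10.2.0.0/16"),
}

def group_of(address):
    """Return the first group whose network contains address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in GROUPS.items():
        if ip in network:
            return name
    return None

def one_ack_per_group(acked_addresses):
    """True once every group has at least one acknowledging standby."""
    seen = {group_of(a) for a in acked_addresses}
    return all(name in seen for name in GROUPS)

print(group_of("10.1.4.7"))                           # berlin
print(one_ack_per_group(["10.1.4.7", "10.2.0.9"]))    # True
print(one_ack_per_group(["10.1.4.7", "10.1.4.8"]))    # False
```

A new standby that comes up inside an existing block would be picked up by its address alone, with no per-standby entry to add.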

Anyway, I'm really with Josh on this. It's a configuration debate that
doesn't have much influence on the real implementation, as long as we
keep the 'what nodes and how long does the master wait' decision
flexible enough.

Regards

Markus Wanner

#9Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Josh Berkus (#6)
Re: standby registration

Josh Berkus <josh@agliodbs.com> writes:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

So I've been trying to come up with something manually and failed. I
blame the fever — without it maybe I wouldn't have tried…

Now, if you want this level of precision in the setup, all we seem to be
missing from the quorum facility as currently proposed would be to have
a quorum list instead (or a max, but that's not helping the "easy" side).

Given those weights: A3 B2 C4 D4 you can ask for a quorum of 6 and
you're covered for your case, except for C&&D, where you reach the
quorum but don't have what you asked for. Have the quorum input accept
[6,7] and it's easy to set up. Do we want that?
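The arithmetic can be checked by listing every ack set whose total weight disagrees with (A || B) && (C || D). This is only a sketch: it reads [6,7] literally as "the total must be 6 or 7", which is an assumption, since how the range would be evaluated as acks arrive is not spelled out here.

```python
from itertools import combinations

WEIGHTS = {"A": 3, "B": 2, "C": 4, "D": 4}

def rule(acked):
    """(A || B) && (C || D)"""
    return ("A" in acked or "B" in acked) and ("C" in acked or "D" in acked)

def mismatches(accepts):
    """Ack sets where accepts(total_weight) disagrees with the rule."""
    out = []
    for size in range(len(WEIGHTS) + 1):
        for combo in combinations(sorted(WEIGHTS), size):
            total = sum(WEIGHTS[n] for n in combo)
            if accepts(total) != rule(set(combo)):
                out.append(("".join(combo), total))
    return out

# Plain quorum of 6: only C+D reaches the quorum without meeting the rule.
print(mismatches(lambda total: total >= 6))
# [('CD', 8)]

# Literal range [6,7]: C+D is excluded, but so is every set of three or more.
print(mismatches(lambda total: 6 <= total <= 7))
# [('ABC', 9), ('ABD', 9), ('ACD', 11), ('BCD', 10), ('ABCD', 13)]
```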

If not, it seems like standby registration is not *required* for 9.1. I
still tend to think it would be nice to have from a DBA perspective, but
we should separate required from "nice to have".

+1.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
#10Simon Riggs
simon@2ndQuadrant.com
In reply to: David Christensen (#4)

On Mon, 2010-10-04 at 14:25 -0500, David Christensen wrote:

Is there any benefit to be had from having standby roles instead of
individual names? For instance, you could integrate this into quorum
commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and
1 "tokyo" standby from a group of multiple per data center, or even
just utilize role sizes of 1 if you wanted individual standbys to be
"named" in this fashion. This role could be provided on connect of
the standby, and is more-or-less tangential to the specific
registration issue.

There is substantial benefit in that config.

If we want to do relaying and path minimization, as is possible with
Slony, we would want to do

M -> S1 -> S2 where M is in London, S1 and S2 are in Berlin.

so that the master sends data only once to Berlin.

If we send to a group, we can also allow things to continue working if
S1 goes down, since S2 might then know it could connect to M directly.

That's complex and not something for the first release, IMHO.
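Purely as an illustration of the fallback described above, and not of any planned design: assume each site is an ordered group, the first live member connects to the master, and the others connect to that member. The function name and the "first live member relays" policy are hypothetical.

```python
def pick_upstream(standby, group, alive, master="M"):
    """Choose where `standby` should stream from.

    group: ordered list of standbys at one site.
    alive: set of standbys currently up.
    """
    for peer in group:
        if peer in alive:
            # The first live member talks to the master; the rest relay from it.
            return master if peer == standby else peer
    return master

berlin = ["S1", "S2"]
print(pick_upstream("S2", berlin, alive={"S1", "S2"}))  # S1: M -> S1 -> S2
print(pick_upstream("S2", berlin, alive={"S2"}))        # M: S1 is down, connect directly
```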

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#6)

On Mon, 2010-10-04 at 12:45 -0700, Josh Berkus wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested, but in a way that is both simpler to express
and requires no changes to configuration after failover. ISTM better to
have a single parameter than 5 separate configuration files, with
behaviour that the community would not easily be able to validate.

If this is the only feature which standby registration is needed for,
has anyone written the code for it yet? Is anyone planning to?

(Not me)

If not, it seems like standby registration is not *required* for 9.1. I
still tend to think it would be nice to have from a DBA perspective, but
we should separate required from "nice to have".

Agreed.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#12Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#11)

On Tue, Oct 5, 2010 at 8:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Mon, 2010-10-04 at 12:45 -0700, Josh Berkus wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested, but in a way that is both simpler to express
and requires no changes to configuration after failover. ISTM better to
have a single parameter than 5 separate configuration files, with
behaviour that the community would not easily be able to validate.

That's just not the same thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#12)

On Tue, 2010-10-05 at 08:57 -0400, Robert Haas wrote:

On Tue, Oct 5, 2010 at 8:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Mon, 2010-10-04 at 12:45 -0700, Josh Berkus wrote:

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested, but in a way that is both simpler to express
and requires no changes to configuration after failover. ISTM better to
have a single parameter than 5 separate configuration files, with
behaviour that the community would not easily be able to validate.

That's just not the same thing.

In what important ways does it differ? In both cases, no reply will be
received until both sites have confirmed receipt.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#14Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#13)

Simon Riggs <simon@2ndQuadrant.com> wrote:

Robert Haas wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

Josh Berkus wrote:

Quorum commit, even with configurable vote weights, can't
handle a requirement that a particular commit be replicated
to (A || B) && (C || D).

Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested,

That's just not the same thing.

In what important ways does it differ?

When you have one server functioning at each site you'll block until
you get a third machine back, rather than replicating to both sites
and remaining functional.
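A small sketch of the difference, assuming A and B are the Berlin servers and C and D the Hong Kong ones: any three acks do span both sites, but with one survivor per site the rule is met while a count of 3 never is. The function names are illustrative.

```python
from itertools import combinations

BERLIN = {"A", "B"}
HONG_KONG = {"C", "D"}

def rule(acked):
    """(A || B) && (C || D): at least one ack from each site."""
    return bool(acked & BERLIN) and bool(acked & HONG_KONG)

def quorum(acked, needed=3):
    """Plain count-based quorum, as in quorum_commit = 3."""
    return len(acked) >= needed

# Any three of the four standbys necessarily include both sites.
three_acks_cover_both_sites = all(
    rule(set(combo)) for combo in combinations(sorted(BERLIN | HONG_KONG), 3)
)

# One surviving standby per site: the rule is met, the quorum is not.
survivors = {"A", "C"}
print(three_acks_cover_both_sites, rule(survivors), quorum(survivors))
# True True False
```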

-Kevin

#15Markus Wanner
markus@bluegap.ch
In reply to: Kevin Grittner (#14)

On 10/05/2010 04:07 PM, Kevin Grittner wrote:

When you have one server functioning at each site you'll block until
you get a third machine back, rather than replicating to both sites
and remaining functional.

That's not a very likely failure scenario, but yes.

What if the admin wants to add a standby in Berlin, but still wants one
ack from each location? None of the current proposals make that simple
enough to not require adjustment in configuration.

Maybe defining something like: at least one from Berlin and at least one
from Tokyo (where Berlin and Tokyo could be defined by CIDR notation).
IMO that's closer to the admin's reality than a plain quorum but still
not as verbose as a full standby registration.

But maybe we should really defer that discussion...

Regards

Markus Wanner

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#14)

On Tue, 2010-10-05 at 09:07 -0500, Kevin Grittner wrote:

Simon Riggs <simon@2ndQuadrant.com> wrote:

Robert Haas wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

Josh Berkus wrote:

Quorum commit, even with configurable vote weights, can't
handle a requirement that a particular commit be replicated
to (A || B) && (C || D).

Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested,

That's just not the same thing.

In what important ways does it differ?

When you have one server functioning at each site you'll block until
you get a third machine back, rather than replicating to both sites
and remaining functional.

And that is so important a consideration that you would like to move
from one parameter in one file to a whole set of parameters, set
differently in 5 separate files? Is it a common use case that people
have more than 3 separate servers for one application, which is where
the difference shows itself?

Another check: does specifying replication by server in such detail mean
we can't specify robustness at the transaction level? If we gave up that
feature, it would be a great loss for performance tuning.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#17Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#16)

On Tue, Oct 5, 2010 at 10:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2010-10-05 at 09:07 -0500, Kevin Grittner wrote:

Simon Riggs <simon@2ndQuadrant.com> wrote:

Robert Haas wrote:

Simon Riggs <simon@2ndquadrant.com> wrote:

Josh Berkus wrote:

Quorum commit, even with configurable vote weights, can't
handle a requirement that a particular commit be replicated
to (A || B) && (C || D).

Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested,

That's just not the same thing.

In what important ways does it differ?

When you have one server functioning at each site you'll block until
you get a third machine back, rather than replicating to both sites
and remaining functional.

And that is so important a consideration that you would like to move
from one parameter in one file to a whole set of parameters, set
differently in 5 separate files?

I don't accept that this is the trade-off being proposed. You seem
convinced that having the config all in one place on the master is
going to make things much more complicated, but I can't see why.

Is it a common use case that people
have more than 3 separate servers for one application, which is where
the difference shows itself.

Much of the engineering we are doing centers around use cases that are
considerably more complex than what most people will do in real life.

Another check: does specifying replication by server in such detail mean
we can't specify robustness at the transaction level? If we gave up that
feature, it would be a great loss for performance tuning.

No, I don't think it means that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#18Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#17)

On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:

When you have one server functioning at each site you'll block until
you get a third machine back, rather than replicating to both sites
and remaining functional.

And that is so important a consideration that you would like to move
from one parameter in one file to a whole set of parameters, set
differently in 5 separate files?

I don't accept that this is the trade-off being proposed. You seem
convinced that having the config all in one place on the master is
going to make things much more complicated, but I can't see why.

But it is not "all in one place" because the file needs to be different
on 5 separate nodes, which *does* make it more complicated than the
alternative: a single parameter, set the same everywhere.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#19Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#16)

Simon Riggs <simon@2ndQuadrant.com> wrote:

Is it a common use case that people have more than 3 separate
servers for one application, which is where the difference shows
itself.

I don't know how common it is, but we replicate circuit court data
to two machines each at two sites. That way a disaster which took
out one building would leave us with the ability to run from the
other building and still take a machine out of the production mix
for scheduled maintenance or to survive a single-server failure at
the other site. Of course, there's no way we would make that
replication synchronous, and we're replicating from dozens of source
machines -- so I don't know if you can even count our configuration.

Still, the fact that we're replicating to two machines each at two
sites and that is the same example which came to mind for Robert,
suggests that perhaps it isn't *that* bizarre.

-Kevin

#20Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#16)

Another check: does specifying replication by server in such detail mean
we can't specify robustness at the transaction level? If we gave up that
feature, it would be a great loss for performance tuning.

It's orthogonal. The kinds of configurations we're talking about simply
define what it will mean when you commit a transaction "with synch".
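One way to picture the separation, as a sketch only: the per-transaction choice is a flag saying whether to wait at all, and the standby configuration is a separate predicate saying what waiting means. Both function names and the one-per-site rule are assumptions for illustration.

```python
def may_report_commit(wait_for_standbys, acked, rule):
    """Decide whether a commit can be reported to the client.

    wait_for_standbys: per-transaction choice (the robustness level).
    rule: cluster-wide definition of what a synchronous commit means.
    """
    if not wait_for_standbys:
        return True        # asynchronous transaction: the rule is never consulted
    return rule(acked)     # synchronous transaction: the rule decides

def one_per_site(acked):
    """Hypothetical cluster-wide rule: (A || B) && (C || D)."""
    return bool(acked & {"A", "B"}) and bool(acked & {"C", "D"})

print(may_report_commit(False, set(), one_per_site))        # True
print(may_report_commit(True, {"A"}, one_per_site))         # False
print(may_report_commit(True, {"A", "D"}, one_per_site))    # True
```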

However, I think we're getting way the heck away from how far we really
want to go for 9.1. Can I point out to people that synch rep is going
to involve a fair bit of testing and debugging, and that maybe we don't
want to try to implement The World's Most Configurable Standby Spec as a
first step?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#21Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#19)
#22Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#18)
#23Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#22)
#24Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#23)
#25Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#17)
#26Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#25)
#27Greg Smith
gsmith@gregsmith.com
In reply to: Josh Berkus (#20)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#27)
#29Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Smith (#27)
#30Dave Page
dpage@pgadmin.org
In reply to: Heikki Linnakangas (#29)
#31Josh Berkus
josh@agliodbs.com
In reply to: Heikki Linnakangas (#29)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Dave Page (#30)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#31)
#34Aidan Van Dyk
aidan@highrise.ca
In reply to: Heikki Linnakangas (#29)
#35Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#33)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#35)
#37Bruce Momjian
bruce@momjian.us
In reply to: Heikki Linnakangas (#29)
#38Greg Smith
gsmith@gregsmith.com
In reply to: Josh Berkus (#35)
#39Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#38)
#40Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Bruce Momjian (#37)
#41Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Josh Berkus (#35)
#42Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#39)
#43Yeb Havinga
yebhavinga@gmail.com
In reply to: Robert Haas (#39)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Yeb Havinga (#43)
#45Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#39)
#46Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#45)