Issues with Quorum Commit

Started by Josh Berkus, over 15 years ago. 138 messages. hackers.
#1 Josh Berkus
josh@agliodbs.com

All,

There's been a lot of discussion on synch rep lately which involves
quorum commit. I need to raise some major design issues with quorum
commit which I don't think that people have really considered, and may
be sufficient to prevent it from being included in 9.1.

A. Permanent Synchronization Failure
---------------------------------
Quorum commit, like other forms of more-than-one-standby synch rep,
offers the possibility that one or more standbys could end up
irretrievably desynchronized with the master.

1. Quorum is 3 servers (out of 5) with mode "apply"
2. Standbys 2 and 4 receive and apply transaction # 20001.
3. Due to a network issue, no other standby applies #20001.
4. Accordingly, the master rolls back #20001 and cancels it, either due
to a timeout or a DBA cancel.
5. #2 and #4 are now hopelessly out of synch with the master.

B. Eventual Inconsistency
-------------------------
If we have a quorum commit, it's possible for any individual standby to
be indefinitely ahead of any standby which is not needed by the quorum.
This means that:

-- There is no clear criterion for when a standby which is not needed for
quorum should be considered no longer a synch standby, and
-- Applications cannot make assumptions that synch rep promises some
specific window of synchronicity, eliminating a lot of the value of
quorum commit.

C. Performance
--------------
Doing quorum commit requires significant extra accounting on the
master's part: it must keep track of how many standbys committed for
each pending transaction (and remember there may be many at the same
time).
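
To make the accounting concrete, here is a rough sketch (in Python,
purely illustrative; neither patch on the table works this way, and the
names are invented) of the bookkeeping the master would need: a map
from each pending commit's WAL position to the set of standbys that
have acknowledged it.

```python
class QuorumTracker:
    def __init__(self, quorum):
        self.quorum = quorum      # acks required per commit (X)
        self.pending = {}         # commit LSN -> set of standbys that acked it

    def register_commit(self, lsn):
        self.pending[lsn] = set()

    def ack(self, standby_id, flushed_lsn):
        # A standby reports it has applied WAL up to flushed_lsn;
        # return every pending commit that just reached quorum.
        released = []
        for lsn, ackers in list(self.pending.items()):
            if lsn <= flushed_lsn:
                ackers.add(standby_id)
                if len(ackers) >= self.quorum:
                    released.append(lsn)
                    del self.pending[lsn]
        return released

tracker = QuorumTracker(quorum=3)
tracker.register_commit(20001)
assert tracker.ack("standby2", 20001) == []        # 1 of 3 acks
assert tracker.ack("standby4", 20001) == []        # 2 of 3 acks
assert tracker.ack("standby1", 20001) == [20001]   # quorum reached
```

Note that every ack has to be checked against every pending commit,
which is the source of the extra accounting cost.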

Doing so could add significant response-time overhead to the simple
case where there is only one standby, as well as memory usage, and will
likely require a lot of troubleshooting of the mechanism on our part.

D. Adding/Replacing Quorum Members
----------------------------------
For Quorum commit to be really valuable, we need to be able to add new
quorum members and remove dead ones *without stopping the master*. Per
discussion about the startup issues with only one master, we have not
worked out how to do this for synch rep standbys. It's reasonable to
assume that this will be more complex for a quorum group than with a
single synch standby.

Consider the case, for example, where due to a network outage we have
dropped below quorum. What is the strategy for getting the system
running again by adding standbys?

All of the problems above are resolvable. Some of the CAP databases
have probably resolved them, as well as some older telecom databases.
However, all of them will require significant work, and even more
significant debugging, from the project.

I would like to see Quorum Commit, in part because I think it would help
push PostgreSQL further into cloud frameworks. However, I'm worried
that if we make quorum commit a requirement of synch rep, we will not
have synch rep in 9.1.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#2 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Josh Berkus (#1)
Re: Issues with Quorum Commit

On 05.10.2010 22:11, Josh Berkus wrote:

There's been a lot of discussion on synch rep lately which involves
quorum commit. I need to raise some major design issues with quorum
commit which I don't think that people have really considered, and may
be sufficient to prevent it from being included in 9.1.

Thanks for bringing these up.

A. Permanent Synchronization Failure
---------------------------------
Quorum commit, like other forms of more-than-one-standby synch rep,
offers the possibility that one or more standbys could end up
irretrievably desynchronized with the master.

1. Quorum is 3 servers (out of 5) with mode "apply"
2. Standbys 2 and 4 receive and apply transaction # 20001.
3. Due to a network issue, no other standby applies #20001.
4. Accordingly, the master rolls back #20001 and cancels, either due to
timeout or DBA cancel.

The master can not roll back or cancel the transaction. That's
completely infeasible, the WAL record has been written to local disk
already. The best it can do is halt and wait for enough standbys to
appear to fulfill the quorum. The client will hang waiting for the
COMMIT to finish, and the transaction will appear as in-progress to
other transactions.

There's a subtle point here that I don't think has been discussed yet: If
the master is forcibly restarted at that point, with pg_ctl restart -m
immediate, strictly speaking the master should start up in the same
state, with the unlucky transaction still appearing as in-progress,
until the standby acknowledges.

5. #2 and #4 are now hopelessly out of synch with the master.

B. Eventual Inconsistency
-------------------------
If we have a quorum commit, it's possible for any individual standby to
be indefinitely ahead of any standby which is not needed by the quorum.
This means that:

-- There is no clear criterion for when a standby which is not needed for
quorum should be considered no longer a synch standby, and
-- Applications cannot make assumptions that synch rep promises some
specific window of synchronicity, eliminating a lot of the value of
quorum commit.

Yep.

C. Performance
--------------
Doing quorum commit requires significant extra accounting on the
master's part: it must keep track of how many standbys committed for
each pending transaction (and remember there may be many at the same
time).

Doing so could involve significant response-time overhead added to the
simple case where there is only one standby, as well as memory usage,
and likely a lot of troubleshooting of the mechanism from us.

My gut feeling is that overhead will pale to insignificance compared to
the network and other overheads of actually getting the WAL to the
standby and processing the acknowledgments.

D. Adding/Replacing Quorum Members
----------------------------------
For Quorum commit to be really valuable, we need to be able to add new
quorum members and remove dead ones *without stopping the master*. Per
discussion about the startup issues with only one master, we have not
worked out how to do this for synch rep standbys. It's reasonable to
assume that this will be more complex for a quorum group than with a
single synch standby.

Consider the case, for example, where due to a network outage we have
dropped below quorum. What is the strategy for getting the system
running again by adding standbys?

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3 Josh Berkus
josh@agliodbs.com
In reply to: Heikki Linnakangas (#2)
Re: Issues with Quorum Commit

Heikki,

The master can not roll back or cancel the transaction. That's
completely infeasible, the WAL record has been written to local disk
already. The best it can do is halt and wait for enough standbys to
appear to fulfill the quorum. The client will hang waiting for the
COMMIT to finish, and the transaction will appear as in-progress to
other transactions.

Ohhh. Good point. So there's no real point in a timeout setting for
quorum commit; it's always "wait forever".

So, this is a critical issue with "wait forever" even with one server.

There's a subtle point here that I don't think has been discussed yet: If
the master is forcibly restarted at that point, with pg_ctl restart -m
immediate, strictly speaking the master should start up in the same
state, with the unlucky transaction still appearing as in-progress,
until the standby acknowledges.

Yeah. That makes the ability to issue a command which says "drop all
synch rep and commit whatever's pending" to be critical.

However, this makes for, in some ways, a worse situation: if you fail to
achieve quorum on any commit, then you need to rebuild your entire
quorum pool from scratch.

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

Yeah? How do you modify the config file and get the master to consider
the new server to be part of the quorum pool *without restarting the
master*?

Again, I'm just saying that merely doing single-server synch rep, *and*
making HS/SR easier to admin in general, is going to be a big task for
9.1. Quorum Commit needs to be considered a separate feature, and one
which is dispensable for 9.1.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#4 Jeff Davis
pgsql@j-davis.com
In reply to: Josh Berkus (#1)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 12:11 -0700, Josh Berkus wrote:

B. Eventual Inconsistency
-------------------------
If we have a quorum commit, it's possible for any individual standby to
be indefinitely ahead of any standby which is not needed by the quorum.
This means that:

-- There is no clear criterion for when a standby which is not needed for
quorum should be considered no longer a synch standby, and
-- Applications cannot make assumptions that synch rep promises some
specific window of synchronicity, eliminating a lot of the value of
quorum commit.

Point B seems particularly dangerous.

When you lose one of the systems and the lagging server becomes required
for quorum, then all of a sudden you could be facing a huge delay to
commit the next transaction (because it needs to catch up on a lot of
WAL replay). This can happen even without a network problem at all, and
seems very likely to result in the lagging system being considered
"down" due to a timeout. Not good, because the reason it is required for
quorum is because another standby just went down.

In other words, a lagging standby combined with a timeout mechanism is
essentially useless, because it will never catch up in time to be a part
of the quorum.
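
As a back-of-the-envelope illustration (numbers invented), the catch-up
time is the backlog divided by the spare replay rate, and it diverges as
soon as the standby replays no faster than the master generates WAL:

```python
def catch_up_seconds(backlog_mb, replay_mb_s, generate_mb_s):
    # Time for a lagging standby to drain its unapplied-WAL backlog,
    # or None if it can never catch up.
    spare = replay_mb_s - generate_mb_s
    if spare <= 0:
        return None               # falls further behind forever
    return backlog_mb / spare

# 10 GB of unapplied WAL, replaying at 60 MB/s while the master
# generates 20 MB/s of new WAL: over four minutes before the first
# ack, which almost any commit timeout will treat as "down".
assert catch_up_seconds(10_000, 60, 20) == 250.0
assert catch_up_seconds(10_000, 20, 20) is None
```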

Regards,
Jeff Davis

#5 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#2)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 22:32 +0300, Heikki Linnakangas wrote:

On 05.10.2010 22:11, Josh Berkus wrote:

There's been a lot of discussion on synch rep lately which involves
quorum commit. I need to raise some major design issues with quorum
commit which I don't think that people have really considered, and may
be sufficient to prevent it from being included in 9.1.

Thanks for bringing these up.

Yes, I'm very happy to discuss these.

The points appear to be directed at "quorum commit", which is a name
I've used. But most of the points apply more to Fujii's patch than my
own. I can only presume that Josh wants to prevent us from adopting a
design that allows sync against multiple standbys.

A. Permanent Synchronization Failure
---------------------------------
Quorum commit, like other forms of more-than-one-standby synch rep,
offers the possibility that one or more standbys could end up
irretrievably desynchronized with the master.

1. Quorum is 3 servers (out of 5) with mode "apply"
2. Standbys 2 and 4 receive and apply transaction # 20001.
3. Due to a network issue, no other standby applies #20001.
4. Accordingly, the master rolls back #20001 and cancels, either due to
timeout or DBA cancel.

The master can not roll back or cancel the transaction. That's
completely infeasible, the WAL record has been written to local disk
already. The best it can do is halt and wait for enough standbys to
appear to fulfill the quorum. The client will hang waiting for the
COMMIT to finish, and the transaction will appear as in-progress to
other transactions.

Yes, that point has long been understood. Neither patch does this, and
in fact the issue is a completely general one.

There's a subtle point here that I don't think has been discussed yet: If
the master is forcibly restarted at that point, with pg_ctl restart -m
immediate, strictly speaking the master should start up in the same
state, with the unlucky transaction still appearing as in-progress,
until the standby acknowledges.

That is a very important point, but again, nothing to do with quorum
commit. For strict correctness, we should do that. Are you suggesting we
should do that here?

5. #2 and #4 are now hopelessly out of synch with the master.

B. Eventual Inconsistency
-------------------------
If we have a quorum commit, it's possible for any individual standby to
be indefinitely ahead of any standby which is not needed by the quorum.
This means that:

-- There is no clear criterion for when a standby which is not needed for
quorum should be considered no longer a synch standby, and
-- Applications cannot make assumptions that synch rep promises some
specific window of synchronicity, eliminating a lot of the value of
quorum commit.

Yep.

Could the person that wrote that actually explain what a "specific
window of synchronicity" is? I'm not sure whether to agree, or disagree.

C. Performance
--------------
Doing quorum commit requires significant extra accounting on the
master's part: it must keep track of how many standbys committed for
each pending transaction (and remember there may be many at the same
time).

Doing so could involve significant response-time overhead added to the
simple case where there is only one standby, as well as memory usage,
and likely a lot of troubleshooting of the mechanism from us.

My gut feeling is that overhead will pale to insignificance compared to
the network and other overheads of actually getting the WAL to the
standby and processing the acknowledgments.

You're ignoring Josh's points. Those exact points have been made by me
in support of the design of my patch and against Fujii's. The mechanism
to do this will be more complex and more likely to break. And it will be
slower and that is a concern for me.

D. Adding/Replacing Quorum Members
----------------------------------
For Quorum commit to be really valuable, we need to be able to add new
quorum members and remove dead ones *without stopping the master*. Per
discussion about the startup issues with only one master, we have not
worked out how to do this for synch rep standbys. It's reasonable to
assume that this will be more complex for a quorum group than with a
single synch standby.

Consider the case, for example, where due to a network outage we have
dropped below quorum. What is the strategy for getting the system
running again by adding standbys?

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

Agreed, not sure of the issue there.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#6 Simon Riggs
simon@2ndQuadrant.com
In reply to: Jeff Davis (#4)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 13:45 -0700, Jeff Davis wrote:

On Tue, 2010-10-05 at 12:11 -0700, Josh Berkus wrote:

B. Eventual Inconsistency
-------------------------
If we have a quorum commit, it's possible for any individual standby to
be indefinitely ahead of any standby which is not needed by the quorum.
This means that:

-- There is no clear criterion for when a standby which is not needed for
quorum should be considered no longer a synch standby, and
-- Applications cannot make assumptions that synch rep promises some
specific window of synchronicity, eliminating a lot of the value of
quorum commit.

Point B seems particularly dangerous.

When you lose one of the systems and the lagging server becomes required
for quorum, then all of a sudden you could be facing a huge delay to
commit the next transaction (because it needs to catch up on a lot of
WAL replay). This can happen even without a network problem at all, and
seems very likely to result in the lagging system being considered
"down" due to a timeout. Not good, because the reason it is required for
quorum is because another standby just went down.

In other words, a lagging standby combined with a timeout mechanism is
essentially useless, because it will never catch up in time to be a part
of the quorum.

Thanks for explaining what was meant.

This issue is a serious problem with the apply-to-*all*-servers case
that Heikki has been describing as a useful use case. We register a
standby, it goes down and we decide to wait for it. Then when it does
come back up it takes ages to catch up.

This is really the nail in the coffin for the "All" servers use case,
and a significant blow to the requirement for standby registration.

If we use N+1 redundancy as I have explained, then this situation does
not occur until you have less than N standbys available. But then it's
no surprise that RAID-5 won't work with 4 drives either.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#5)
Re: Issues with Quorum Commit

On Tue, Oct 5, 2010 at 5:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The points appear to be directed at "quorum commit", which is a name
I've used. But most of the points apply more to Fujii's patch than my
own. I can only presume that Josh wants to prevent us from adopting a
design that allows sync against multiple standbys.

This looks to me like a cheap shot that doesn't advance the
discussion. You are the first to complain when people don't take your
ideas as seriously as you feel they should.

A. Permanent Synchronization Failure
---------------------------------
Quorum commit, like other forms of more-than-one-standby synch rep,
offers the possibility that one or more standbys could end up
irretrievably desynchronized with the master.

1. Quorum is 3 servers (out of 5) with mode "apply"
2. Standbys 2 and 4 receive and apply transaction # 20001.
3. Due to a network issue, no other standby applies #20001.
4. Accordingly, the master rolls back #20001 and cancels, either due to
timeout or DBA cancel.

The master can not roll back or cancel the transaction. That's
completely infeasible, the WAL record has been written to local disk
already. The best it can do is halt and wait for enough standbys to
appear to fulfill the quorum. The client will hang waiting for the
COMMIT to finish, and the transaction will appear as in-progress to
other transactions.

Yes, that point has long been understood. Neither patch does this, and
in fact the issue is a completely general one.

Yep.

There's a subtle point here that I don't think has been discussed yet: If
the master is forcibly restarted at that point, with pg_ctl restart -m
immediate, strictly speaking the master should start up in the same
state, with the unlucky transaction still appearing as in-progress,
until the standby acknowledges.

That is a very important point, but again, nothing to do with quorum
commit. For strict correctness, we should do that. Are you suggesting we
should do that here?

I agree that this has nothing to do with quorum commit. It does have
to do with synchronous replication, but I'm skeptical that we want to
get into it for this release, if ever.

5. #2 and #4 are now hopelessly out of synch with the master.

B. Eventual Inconsistency
-------------------------
If we have a quorum commit, it's possible for any individual standby to
be indefinitely ahead of any standby which is not needed by the quorum.
This means that:

-- There is no clear criterion for when a standby which is not needed for
quorum should be considered no longer a synch standby, and
-- Applications cannot make assumptions that synch rep promises some
specific window of synchronicity, eliminating a lot of the value of
quorum commit.

Yep.

Could the person that wrote that actually explain what a "specific
window of synchronicity" is? I'm not sure whether to agree, or disagree.

Me either.

C. Performance
--------------
Doing quorum commit requires significant extra accounting on the
master's part: it must keep track of how many standbys committed for
each pending transaction (and remember there may be many at the same
time).

Doing so could involve significant response-time overhead added to the
simple case where there is only one standby, as well as memory usage,
and likely a lot of troubleshooting of the mechanism from us.

My gut feeling is that overhead will pale to insignificance compared to
the network and other overheads of actually getting the WAL to the
standby and processing the acknowledgments.

You're ignoring Josh's points. Those exact points have been made by me
in support of the design of my patch and against Fujii's. The mechanism
to do this will be more complex and more likely to break. And it will be
slower and that is a concern for me.

I don't think Heikki ignored Josh's points, and I do think Heikki's
analysis is correct.

D. Adding/Replacing Quorum Members
----------------------------------
For Quorum commit to be really valuable, we need to be able to add new
quorum members and remove dead ones *without stopping the master*.  Per
discussion about the startup issues with only one master, we have not
worked out how to do this for synch rep standbys.  It's reasonable to
assume that this will be more complex for a quorum group than with a
single synch standby.

Consider the case, for example, where due to a network outage we have
dropped below quorum.  What is the strategy for getting the system
running again by adding standbys?

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

Agreed, not sure of the issue there.

Also agreed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#3)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 13:43 -0700, Josh Berkus wrote:

Again, I'm just saying that merely doing single-server synch rep,
*and*
making HS/SR easier to admin in general, is going to be a big task for
9.1. Quorum Commit needs to be considered a separate feature, and one
which is dispensable for 9.1.

Agreed.

So no need at all for standby.conf. Phew!

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#9 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#7)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 17:21 -0400, Robert Haas wrote:

On Tue, Oct 5, 2010 at 5:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The points appear to be directed at "quorum commit", which is a name
I've used. But most of the points apply more to Fujii's patch than my
own. I can only presume that Josh wants to prevent us from adopting a
design that allows sync against multiple standbys.

This looks to me like a cheap shot that doesn't advance the
discussion. You are the first to complain when people don't take your
ideas as seriously as you feel they should.

Whatever are you talking about? This is a technical discussion.

I'm checking what Josh actually means by Quorum Commit, since
regrettably the points fall very badly against Fujii's patch. Josh has
echoed some points of mine and Jeff's point about dangerous behaviour
blows a hole a mile wide in the justification for standby.conf etc..

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#10 Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#5)
Re: Issues with Quorum Commit

Simon, Robert,

The points appear to be directed at "quorum commit", which is a name
I've used. But most of the points apply more to Fujii's patch than my
own.

Per previous discussion, I'm trying to get at what reasonable behavior
is, rather than targeting one patch or the other.

I can only presume that Josh wants to prevent us from adopting a
design that allows sync against multiple standbys.

Quorum commit == "X servers need to ack for commit", where X > 1.
Usually done as "X out of Y servers must ack", but it's not a given that
the master needs to know how many servers there are, just how many ack'ed.
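
A toy sketch of that definition (illustrative only, invented names):
the master simply counts distinct acks until it has X, with no
membership list at all.

```python
def wait_for_quorum(ack_stream, x):
    # Consume standby ids from an ack stream; return once X distinct
    # standbys have acked. Note the master never needs to know Y.
    acked = set()
    for standby_id in ack_stream:
        acked.add(standby_id)
        if len(acked) >= x:
            return sorted(acked)
    # Stream dried up below quorum: this is the "wait forever" case.
    raise TimeoutError("quorum never reached")

acks = iter(["s3", "s3", "s1", "s2", "s5"])     # duplicate ack ignored
assert wait_for_quorum(acks, x=3) == ["s1", "s2", "s3"]
```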

And I'm not against it; I'm just pointing out that it gives us some
issues which we don't have with a single standby, and thus quorum commit
ought to be treated as a separate feature in 9.1 development.

The master can not roll back or cancel the transaction. That's
completely infeasible, the WAL record has been written to local disk
already. The best it can do is halt and wait for enough standbys to
appear to fulfill the quorum. The client will hang waiting for the
COMMIT to finish, and the transaction will appear as in-progress to
other transactions.

Yes, that point has long been understood. Neither patch does this, and
in fact the issue is a completely general one.

So, in that case, if it's been 10 minutes, and we're still not getting
ack from standbys, what's the exit strategy for the hapless DBA?
Practically speaking? Without restarting the master?

Last I checked, our goal with synch standby was to increase availability,
not decrease it. This is, however, not an issue with quorum commit, but
an issue with sync rep in general.

Could the person that wrote that actually explain what a "specific
window of synchronicity" is? I'm not sure whether to agree, or disagree.

A specific amount of time within which all nodes will be consistent
regarding that specific transaction.

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

Agreed, not sure of the issue there.

See previous post. The critical phrase is *without restarting the
master*. AFAICT, no patch has addressed the need to change the master's
synch configuration without restarting it. It's possible that I'm not
following something, in which case I'd love to have it pointed out.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#11 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#10)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 15:14 -0700, Josh Berkus wrote:

I can only presume that Josh wants to prevent us from adopting a
design that allows sync against multiple standbys.

Quorum commit == "X servers need to ack for commit", where X > 1.
Usually done as "X out of Y servers must ack", but it's not a given that
the master needs to know how many servers there are, just how many ack'ed.

And I'm not against it; I'm just pointing out that it gives us some
issues which we don't have with a single standby, and thus quorum commit
ought to be treated as a separate feature in 9.1 development.

OK, so I did understand you correctly.

Heikki had argued that a use case existed where Y out of Y (i.e. all)
nodes must acknowledge before we commit. That was the use case that
required us to have standby registration. It was optional in all other
cases.

We should note that Oracle only allows X=1, i.e. first acknowledgement
releases waiter. My patch provides X=1 only and takes advantage of the
simpler in-memory data structures as a result.

The master can not roll back or cancel the transaction. That's
completely infeasible, the WAL record has been written to local disk
already. The best it can do is halt and wait for enough standbys to
appear to fulfill the quorum. The client will hang waiting for the
COMMIT to finish, and the transaction will appear as in-progress to
other transactions.

Yes, that point has long been understood. Neither patch does this, and
in fact the issue is a completely general one.

So, in that case, if it's been 10 minutes, and we're still not getting
ack from standbys, what's the exit strategy for the hapless DBA?
Practically speaking? Without restarting the master?

Last I checked, our goal with synch standby was to increase availability,
not decrease it. This is, however, not an issue with quorum commit, but
an issue with sync rep in general.

Completely agree. When we had that discussion some months/weeks back, we
spoke about having a timeout. My patch has implemented a timeout,
followed by a COMMIT. That allows increased availability, as you say.

You would also be able to specifically release all/some transactions
from wait state with a simple function pg_cancel_sync_wait() (or similar
name).

Could the person that wrote that actually explain what a "specific
window of synchronicity" is? I'm not sure whether to agree, or disagree.

A specific amount of time within which all nodes will be consistent
regarding that specific transaction.

Certainly no patch offers that. I'm not sure such a possibility exists.
Asking for higher X does make that situation worse.

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

Agreed, not sure of the issue there.

See previous post. The critical phrase is *without restarting the
master*. AFAICT, no patch has addressed the need to change the master's
synch configuration without restarting it. It's possible that I'm not
following something, in which case I'd love to have it pointed out.

My patch does not require a restart of the master to add/remove sync rep
nodes. They just come and go as needed.

I don't think Fujii's patch would have a great problem with that either,
but I can't speak for that with precision.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#12 Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#11)
Re: Issues with Quorum Commit

Heikki had argued that a use case existed where Y out of Y (i.e. all)
nodes must acknowledge before we commit. That was the use case that
required us to have standby registration. It was optional in all other
cases.

Yeah, Y of Y is just a special case of X of Y. And, IMHO, rather
pointless if we can't guarantee consistency between the standbys, which
we can't.

We should note that Oracle only allows X=1, i.e. first acknowledgement
releases waiter. My patch provides X=1 only and takes advantage of the
simpler in-memory data structures as a result.

I agree that we ought to start with X=1 for 9.1 and leave more
complicated architectures until we have that committed and tested.

You would also be able to specifically release all/some transactions
from wait state with a simple function pg_cancel_sync_wait() (or similar
name).

That would be fine for the use cases I'll be implementing.

My patch does not require a restart of the master to add/remove sync rep
nodes. They just come and go as needed.

I don't think Fujii's patch would have a great problem with that either,
but I can't speak for that with precision.

Ok. That really was not made clear in prior arguments.

FYI, for the production uses of synch rep I'd specifically be
implementing, what the users would want is:

1) One master, one synch standby, 1-2 asynch standbys
2) Synch rep tries to synch for # seconds.
3) If it fails, it switches the synch standby to asynch and screams
bloody murder somewhere nagios can pick it up.
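
That behaviour could be sketched roughly like this (pure illustration;
no such knob exists in either patch, and all the names are invented):

```python
import threading

def wait_commit_ack(ack_event, timeout_s, alert):
    # Wait for the sync standby's ack; on timeout, degrade to async
    # and raise a monitoring alert instead of blocking the commit.
    if ack_event.wait(timeout_s):
        return "sync-committed"
    alert("sync standby timed out; degrading standby to async")
    return "async-committed"

ev, alerts = threading.Event(), []
assert wait_commit_ack(ev, 0.01, alerts.append) == "async-committed"
ev.set()                                   # standby acks from now on
assert wait_commit_ack(ev, 0.01, alerts.append) == "sync-committed"
assert len(alerts) == 1                    # exactly one alert raised
```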

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#13 Jeff Davis
pgsql@j-davis.com
In reply to: Simon Riggs (#6)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 22:19 +0100, Simon Riggs wrote:

In other words, a lagging standby combined with a timeout mechanism is
essentially useless, because it will never catch up in time to be a part
of the quorum.

Thanks for explaining what was meant.

This issue is a serious problem with the apply to *all* servers that
Heikki has been describing as being a useful use case. We register a
standby, it goes down and we decide to wait for it. Then when it does
come back up it takes ages to catch up.

This is really the nail in the coffin for the "All" servers use case,
and a significant blow to the requirement for standby registration.

I'm not sure I entirely understand. I was concerned about the case of a
standby server being allowed to lag behind the rest by a large number of
WAL records. That can't happen in the "wait for all servers to apply"
case, because the system would become unavailable rather than allow a
significant difference in the amount of WAL applied.

I'm not saying that an unavailable system is good, but I don't see how
my particular complaint applies to the "wait for all servers to apply"
case.

The case I was worried about is:
* 1 master and 2 standby
* The rule is "wait for at least one standby to apply the WAL"

In your notation, I believe that's M -> { S1, S2 }

In that case, if S1 is just a little faster than S2, then S2 might
build up a significant queue of unapplied WAL. Then, when S1 goes down,
there's no way for the slower one to acknowledge a new transaction
without playing through all of the unapplied WAL.

Intuitively, the administrator would think that he was getting both HA
and redundancy, but in reality the availability is no better than if
there were only two servers (M -> S1), except that it might be faster to
replay the WAL than to set up a new standby (but that's not guaranteed).
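The lag buildup can be made concrete with a toy simulation (rates and names are invented for illustration): with a quorum of one, the master's commit rate is gated only by the faster standby, so the slower one silently accumulates a backlog.

```python
def simulate(ticks=1000, s2_period=10):
    """Toy model of M -> { S1, S2 } with "wait for at least one to apply".
    S1 applies one WAL record per tick; S2 applies one record every
    s2_period ticks, so commits are gated only by S1 while S1 is up."""
    committed = 0       # records the master has committed (S1 keeps pace)
    s2_applied = 0
    for tick in range(ticks):
        committed += 1              # S1 ACKs immediately, quorum satisfied
        if tick % s2_period == 0:
            s2_applied += 1         # S2 falls further behind every tick
    backlog = committed - s2_applied
    # S1 fails here: no new commit can be ACKed until S2 replays its
    # entire backlog, so the "redundant" system stalls for that long.
    stall_ticks = backlog * s2_period
    return backlog, stall_ticks

# With the defaults, S2 ends up ~900 records behind after 1000 commits,
# and the system stalls for ~9000 ticks once S1 goes down.
```

The point of the sketch is that nothing bounds the backlog while S1 is healthy, which is Jeff's pitfall exactly.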

I think you would call that a misconfiguration, and I would agree. I was
just trying to point out a pitfall that I didn't see until I read Josh's
email.

If we use N+1 redundancy as I have explained, then this situation does
not occur until you have less than N standbys available. But then it's
no surprise that RAID-5 won't work with 4 drives either.

Now I'm more confused. I assume that was a typo (because a RAID-5 does
work with 4 drives), but I think it obscured your point.

Regards,
Jeff Davis

#14Simon Riggs
simon@2ndQuadrant.com
In reply to: Jeff Davis (#13)
Re: Issues with Quorum Commit

On Tue, 2010-10-05 at 18:52 -0700, Jeff Davis wrote:

I'm not saying that an unavailable system is good, but I don't see how
my particular complaint applies to the "wait for all servers to apply"
case.

The case I was worried about is:
* 1 master and 2 standby
* The rule is "wait for at least one standby to apply the WAL"

In your notation, I believe that's M -> { S1, S2 }

In that case, if S1 is just a little faster than S2, then S2 might
build up a significant queue of unapplied WAL. Then, when S1 goes down,
there's no way for the slower one to acknowledge a new transaction
without playing through all of the unapplied WAL.

That situation would require two things
* First, you have set up async replication and you're not monitoring it
properly. Shame on you.
* Second, you would have to request "apply" mode sync rep. If you had
requested "recv" or "fsync" mode, then the standby does *not* have to
have applied the WAL before acknowledgement.

Since the first problem is a generic problem with async replication, and
can already happen in 8.2+, it's not exactly an argument against a new
feature.

Intuitively, the administrator would think that he was getting both HA
and redundancy, but in reality the availability is no better than if
there were only two servers (M -> S1), except that it might be faster to
replay the WAL than to set up a new standby (but that's not guaranteed).

Not guaranteed, but very likely that the standby would not be that far
behind. If it gets too far behind it will likely blow out the disk space
on the standby and fail.

I think you would call that a misconfiguration, and I would agree.

Yes, regrettably there are various ways to misconfigure this. The above
is really a degeneration of the 2 standby case into the 1 standby case:
if you ask for 2 standbys and one of them is ineffective, then the
system acts like you have only one.

I was
just trying to point out a pitfall that I didn't see until I read Josh's
email.

You mention that it cannot occur if we choose to lock up the master and
cause transactions to wait. That may be true in many cases. It does
still occur when we have transactions that generate a large amount of
WAL, loads, ALTER TABLEs etc.. In those cases, S2 could well fall far
behind S1 during those long transactions and if S1 goes down at that
point there would be a backlog to apply. But again, this only applies
to "apply" mode sync rep.

So it can occur in both cases, though it now looks to me like a less
important issue in either case. I don't think it rates the term
"dangerous" any longer.

Thanks for your careful thought and analysis on this.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#15Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Josh Berkus (#10)
Re: Issues with Quorum Commit

On 06.10.2010 01:14, Josh Berkus wrote:

Last I checked, our goal with synch standby was to increase availability,
not decrease it.

No. Synchronous replication does not help with availability. It allows
you to achieve zero data loss, i.e. if the master dies, you are
guaranteed that any transaction that was acknowledged as committed is
still committed.

The other use case is keeping a hot standby server (or servers)
up-to-date, so that you can run queries against it and you are
guaranteed to get the same results you would if you ran the query in the
master.

Those are the two reasonable use cases I've seen. Anything else that has
been discussed is some sort of a combination of those two, or something
that doesn't make much sense when you scratch the surface and start
looking at the failure modes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#16Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Josh Berkus (#10)
Re: Issues with Quorum Commit

On 06.10.2010 01:14, Josh Berkus wrote:

You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.

Agreed, not sure of the issue there.

See previous post. The critical phrase is *without restarting the
master*. AFAICT, no patch has addressed the need to change the master's
synch configuration without restarting it. It's possible that I'm not
following something, in which case I'd love to have it pointed out.

Fair enough. I agree it's important that the configuration can be
changed on the fly. It's orthogonal to the other things discussed, so
let's just assume for now that we'll have that. If not in the first
version, it can be added afterwards. "pg_ctl reload" is probably how it
will be done.

There is some interesting behavioral questions there on what happens
when the configuration is changed. Like if you first define that 3 out
of 5 servers must acknowledge, and you have an in-progress commit that
has received 2 acks already. If you then change the config to "2 out of
4" servers must acknowledge, is the in-progress commit now satisfied?
From the admin point of view, the server that was removed from the
system might've been one that had acknowledged already, and logically in
the new configuration the transaction has only received 1 acknowledgment
from those servers that are still part of the system. Explicitly naming
the standbys in the config file would solve that particular corner case,
but it would no doubt introduce other similar ones.
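That corner case can be pinned down with a toy model of the two plausible readings (function and standby names here are illustrative, not from any patch):

```python
def satisfied_by_count(acks, quorum):
    """Count-based reading: any acks received so far count toward quorum."""
    return len(acks) >= quorum

def satisfied_by_membership(acks, quorum, members):
    """Membership-aware reading: only acks from standbys that are still
    part of the configuration count toward quorum."""
    return len(acks & members) >= quorum

# In-progress commit: 2 acks received under the old "3 of 5" rule.
acks = {"s1", "s5"}
# Config reloaded: s5 removed, rule is now "2 of 4".
new_members = {"s1", "s2", "s3", "s4"}

count_view = satisfied_by_count(acks, 2)                    # True
member_view = satisfied_by_membership(acks, 2, new_members) # False
```

The two readings disagree precisely when a standby that already acknowledged is removed by the reload, which is why naming standbys explicitly resolves this case but invites others.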

But it's an orthogonal issue, we'll figure it out when we get there.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#17Markus Wanner
markus@bluegap.ch
In reply to: Simon Riggs (#14)
Re: Issues with Quorum Commit

On 10/06/2010 04:31 AM, Simon Riggs wrote:

That situation would require two things
* First, you have set up async replication and you're not monitoring it
properly. Shame on you.

The way I read it, Jeff is complaining about the timeout you propose
that effectively turns sync into async replication in case of a failure.

With a master that waits forever, the standby that's newly required for
quorum certainly still needs its time to catch up. But it wouldn't live
in danger of being "optimized away" for availability in case it cannot
catch up within the given timeout. It's a tradeoff between availability
and durability.

So it can occur in both cases, though it now looks to me like a less
important issue in either case. I don't think it rates the term
"dangerous" any longer.

The proposed timeout certainly still sounds dangerous to me. I'd rather
recommend setting it to an incredibly huge value to minimize its dangers
and get sync replication when that is what has been asked for. Use async
replication for increased availability.

Or do you envision any use case that requires a quorum of X standbys
for normal operation but is just fine with anywhere from zero to (X-1) standbys
in case of failures? IMO that's when sync replication is most needed and
when it absolutely should hold to its promises - even if it means to
stop the system.

There's no point in continuing operation if you cannot guarantee the
minimum requirements for durability. If you happen to want such a thing,
you had better rethink your minimum requirement (as performance for
normal operations might benefit from a lower minimum as well).

Regards

Markus Wanner

#18Fujii Masao
masao.fujii@gmail.com
In reply to: Jeff Davis (#13)
Re: Issues with Quorum Commit

On Wed, Oct 6, 2010 at 10:52 AM, Jeff Davis <pgsql@j-davis.com> wrote:

I'm not sure I entirely understand. I was concerned about the case of a
standby server being allowed to lag behind the rest by a large number of
WAL records. That can't happen in the "wait for all servers to apply"
case, because the system would become unavailable rather than allow a
significant difference in the amount of WAL applied.

I'm not saying that an unavailable system is good, but I don't see how
my particular complaint applies to the "wait for all servers to apply"
case.

The case I was worried about is:
 * 1 master and 2 standby
 * The rule is "wait for at least one standby to apply the WAL"

In your notation, I believe that's M -> { S1, S2 }

In that case, if S1 is just a little faster than S2, then S2 might
build up a significant queue of unapplied WAL. Then, when S1 goes down,
there's no way for the slower one to acknowledge a new transaction
without playing through all of the unapplied WAL.

Intuitively, the administrator would think that he was getting both HA
and redundancy, but in reality the availability is no better than if
there were only two servers (M -> S1), except that it might be faster to
replay the WAL than to set up a new standby (but that's not guaranteed).

Agreed. This is similar to my previous complaint.
http://archives.postgresql.org/pgsql-hackers/2010-09/msg00946.php

This problem would happen even if we fix the quorum to 1 as Josh proposes.
To avoid this, the master must wait for ACK from all the connected
synchronous standbys.

I think this is especially likely to happen with the 'apply' replication
level, because conflicts between recovery and read-only queries can
easily cause a synchronous standby to lag.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#19Markus Wanner
markus@bluegap.ch
In reply to: Heikki Linnakangas (#15)
Re: Issues with Quorum Commit

On 10/06/2010 08:31 AM, Heikki Linnakangas wrote:

On 06.10.2010 01:14, Josh Berkus wrote:

Last I checked, our goal with synch standby was to increase availability,
not decrease it.

No. Synchronous replication does not help with availability. It allows
you to achieve zero data loss, i.e. if the master dies, you are
guaranteed that any transaction that was acknowledged as committed is
still committed.

Strictly speaking, it even reduces availability. Which is why nobody
actually wants *only* synchronous replication. Instead they use quorum
commit or semi-synchronous (shudder) replication, which only requires
*some* nodes to be in sync, but effectively replicates asynchronously to
the others.

From that point of view, the requirement of having one synch and two
async standbys is pretty much the same as having three synch standbys
with a quorum commit of 1. (Except for the additional availability of
the latter variant, because in case of a failure of the one sync
standby, any of the others can take over without admin intervention.)

Regards

Markus Wanner

#20Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#15)
Re: Issues with Quorum Commit

On Wed, Oct 6, 2010 at 3:31 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

No. Synchronous replication does not help with availability. It allows you
to achieve zero data loss, i.e. if the master dies, you are guaranteed that
any transaction that was acknowledged as committed is still committed.

Hmm... but we can increase availability without any data loss by using
synchronous replication. Many people have already been using synchronous
replication software such as DRBD for that purpose.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
