2-phase commit

Started by Andrew Sullivan over 22 years ago, 128 messages, hackers
#1Andrew Sullivan
andrew@libertyrms.info

Hi,

As the 7.4 beta rolls on, I thought now would be a good time to start
talking about the future.

I have a potential need in the future for distributed transactions
(XA). To get that from Postgres, I'd need two-phase commit, I think.
There is someone working on such a project
(<http://snaga.org/pgsql/>), but last time it was discussed here, it
received a rather lukewarm reception (see, e.g., the thread starting
at
<http://archives.postgresql.org/pgsql-hackers/2003-06/msg00752.php>).

While at OSCON, I had a discussion with Joe Conway, Bruce Momjian,
and Greg Sabino Mullane about 2PC. Various people expressed various
opinions on the topic, but I think we agreed on the following. The
relevant folks can correct me if I'm wrong:

Two-phase commit has theoretical problems, but it is implemented in
several "enterprise" RDBMS. 2PC is something needed by certain kinds
of clients (especially those with transaction managers), so if
PostgreSQL doesn't have it, PostgreSQL just won't get supported in
that arena. Someone is already working on 2PC, but may feel unwanted
due to the reactions last heard on the topic, and may not continue
working unless he gets some support. What is a necessary condition
for such support is to get some idea of what compromises 2PC might
impose, and thereafter to try to determine which such compromises, if
any, are acceptable ones.

I think the idea here is that, while in most cases a "pretty-good"
implementation of a desirable feature might get included in the
source on the grounds that it can always be improved upon later,
something like 2PC has the potential to do great harm to an otherwise
reliable transaction manager. So the arguments about what to do need
to be aired in advance.

I (perhaps foolishly) volunteered to undertake to collect the
arguments in various directions, on the grounds that I can contribute
no code, but have skin made of asbestos. I thought I'd try to
collect some information about what people think the problems and
potentially acceptable compromises are, to see if there is some way
to understand what can and cannot be contemplated for 2PC. I'll
include in any such outline the remarks found in the -hackers thread
referenced above. Any objections?

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110
#2Chris Browne
cbbrowne@acm.org
In reply to: Andrew Sullivan (#1)
Re: 2-phase commit

In an attempt to throw the authorities off his trail, andrew@libertyrms.info (Andrew Sullivan) transmitted:

As the 7.4 beta rolls on, I thought now would be a good time to start
talking about the future.

I have a potential need in the future for distributed transactions
(XA). To get that from Postgres, I'd need two-phase commit, I think.
There is someone working on such a project
(<http://snaga.org/pgsql/>), but last time it was discussed here, it
received a rather lukewarm reception (see, e.g., the thread starting
at
<http://archives.postgresql.org/pgsql-hackers/2003-06/msg00752.php>).

Interesting/positive news on this front; the XA specification
documents are now all available in PDF form "freely", from the Open
Group, where they used to be fairly pricey.

<http://www.opengroup.org/publications/catalog/tp.htm>

Another notable XA documentation source is here...
<http://www.middleware.net/tuxedo/resources/XA_Documentation.html>

Two interesting implications of XA support would be that there could
be some "congruence of interests" that would arise regarding two
vendors:

- XA is essentially based on the API of BEA Tuxedo. I'm told they
include a simple database system with Tuxedo, but nothing particularly
wonderful. (Who thinks of BEA as a DBMS vendor???) They might have
interest in bundling something better...

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...
--
(format nil "~S@~S" "aa454" "freenet.carleton.ca")
http://www3.sympatico.ca/cbbrowne/tpmonitor.html
"In order to make an apple pie from scratch, you must first create the
universe." -- Carl Sagan, Cosmos

#3Joe Conway
mail@joeconway.com
In reply to: Chris Browne (#2)
Re: [HACKERS] 2-phase commit

(moving to advocacy)

Christopher Browne wrote:

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

I wonder how we could get them to consider it...

Joe

#4Jeroen T. Vermeulen
In reply to: Chris Browne (#2)
Re: 2-phase commit

On Tue, Aug 26, 2003 at 08:04:13PM -0400, Christopher Browne wrote:

Interesting/positive news on this front; the XA specification
documents are now all available in PDF form "freely", from the Open
Group, where they used to be fairly pricey.

A step in the right direction, but AFAIC it's too little, too late.
The impression I get, at least, is that it's as good as dead now: Java
may use it, but it hides the details anyway so it might as well not be
there--the Java way is to standardize the API but nothing that goes "on
the wire".

Lots of proprietary middleware uses XA, but from what I hear there are
enough subtle differences to make mixing-and-matching of products risky
at best--the proprietary way is to bundle products that will work at
least marginally together, and relegate standards to a bullshit point
in the PowerPoint presentations. "Based on industry standard" means
about the same as "based on a true story."

Then there's the fact that the necessary followup standards never got
anywhere, and the fact that XA doesn't cope with threading really well.

Don't get me wrong, XA support may well be a good thing. But at this
stage, personally I'd go for a good 2PC implementation first and worry
about supporting XA later.

Jeroen

#5Justin Clift
justin@postgresql.org
In reply to: Joe Conway (#3)
Re: [HACKERS] 2-phase commit

Joe Conway wrote:

(moving to advocacy)

Christopher Browne wrote:

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

I wonder how we could get them to consider it...

Not a bad suggestion. Just went to their site and submitted a quick
brief of benefits/etc via their "Partner Proposal" page:

http://checkers.peoplesoft.com/allconn/ppp.nsf/PPP?OpenForm&Seq=2#_RefreshKW_type

I'm hoping they are read by People With A Clue, and that they in turn
will pass it on to the right group internally.

Worth a shot I guess.

:-)

Regards and best wishes,

Justin Clift

#6Chris Browne
cbbrowne@acm.org
In reply to: Andrew Sullivan (#1)
Re: [HACKERS] 2-phase commit

After a long battle with technology, mail@joeconway.com (Joe Conway), an earthling, wrote:

(moving to advocacy)

Christopher Browne wrote:

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...

That's an interesting observation, because I've long thought
PeopleSoft ought to support Postgres too. From what I recall, their
database schema is *very* database neutral (at least as of PSFT
version 7.x) and fairly simple (we ran it on MSSQL 6.5). It would
probably be pretty easily ported to run on Postgres.

I wonder how we could get them to consider it...

XA support so that it would "play well" with Tuxedo would be the best
thing I can think of. Arguing that they _should_ consider PostgreSQL
when it doesn't support their "scalability extender" wouldn't seem
likely to me to sell well.

That's _exactly_ why I mentioned both products; congruence of
interests...
--
select 'cbbrowne' || '@' || 'acm.org';
http://cbbrowne.com/info/internet.html
Consciousness - that annoying time between naps.

#7Chris Browne
cbbrowne@acm.org
In reply to: Andrew Sullivan (#1)
Re: [HACKERS] 2-phase commit

After a long battle with technology, justin@postgresql.org (Justin Clift), an earthling, wrote:

Worth a shot I guess.

:-)

I'd think that they would take the idea more seriously if PostgreSQL
supported XA and thereby was compatible with Tuxedo. But it probably
doesn't hurt for them to hear the idea multiple times...
--
let name="aa454" and tld="freenet.carleton.ca" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/finances.html
Whatever you do don't mail me at pink-and-wobbly@asdkjlwelkj.com,
because then I'll know you're just an address-harvester, and blacklist
your IP until the end of time

#8Bruno Wolff III
bruno@wolff.to
In reply to: Joe Conway (#3)
Re: [HACKERS] 2-phase commit

On Wed, Aug 27, 2003 at 22:46:58 -0700,
Joe Conway <mail@joeconway.com> wrote:

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

In my opinion it is too database agnostic. They pretty much just use the
DB as a file. From what I have seen of the system it is one big hack.

Their trusted client security model is ridiculous. Fortunately in
version 8 you don't have to let people run 2 tier except for developer
types. (Anyone with 2 tier access owns the system.) I really don't
even trust 3 tier access, because I believe that a fair amount of
security is enforced by the client rather than the app server.

It was annoying that the set of characters usable for passwords in 7.6
(a limit that presumably still applies to the connect ID in 8) was restricted
because they didn't want to quote the password string so that you could
have special characters in it.

They aren't big on using referential integrity to keep the data clean.

#9Joe Conway
mail@joeconway.com
In reply to: Bruno Wolff III (#8)
Re: [HACKERS] 2-phase commit

Bruno Wolff III wrote:

On Wed, Aug 27, 2003 at 22:46:58 -0700,
Joe Conway <mail@joeconway.com> wrote:

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

In my opinion it is too database agnostic. They pretty much just use the
DB as a file. From what I have seen of the system it is one big hack.

Yeah, I didn't say I *liked* their schema, just that I thought it would
be easy for them to support Postgres ;-)

Like it or not, they are one of the larger ERP/CRM players (after the
merger with J.D. Edwards, they will be *ahead* of Oracle, only second to
SAP), and having them offer PostgreSQL support would be significant. If
the XA/Tuxedo thing is an issue, they could position it for mid-tier
customers who don't need the transaction manager anyway.

Joe

#10Chris Browne
cbbrowne@acm.org
In reply to: Andrew Sullivan (#1)
Re: [HACKERS] 2-phase commit

Oops! bruno@wolff.to (Bruno Wolff III) was seen spray-painting on a wall:

On Wed, Aug 27, 2003 at 22:46:58 -0700,
Joe Conway <mail@joeconway.com> wrote:

That's an interesting observation, because I've long thought
PeopleSoft ought to support Postgres too. From what I recall, their
database schema is *very* database neutral (at least as of PSFT
version 7.x) and fairly simple (we ran it on MSSQL 6.5). It would
probably be pretty easily ported to run on Postgres.

In my opinion it is too database agnostic. They pretty much just use
the DB as a file. From what I have seen of the system it is one big
hack.

Ah, so it's like the way SAP R/3's HR module works. (I expect I'm the
only one around that is more than passing familiar with "cluster
tables"; quite supremely nonrelational stuff, and quite
bletcherous...)

To a great extent this comes from the nature of the application. HR
is all about collecting together "documents," and these applications
replace "paper" with "pseudopaper."

They aren't big on using referential integrity to keep the data
clean.

Ditto for SAP R/3; "cleanliness" is, there, imposed by only using
their applications to do updates, which includes writing your software
to invoke their functions.
--
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/linuxxian.html
ASSEMBLER is a language. Any language that can take a half-dozen
keystrokes and compile it down to one byte of code is all right in my
books. Though for the REAL programmer, assembler is a waste of
time. Why use a compiler when you can code directly into memory
through a front panel.

#11Bruce Momjian
bruce@momjian.us
In reply to: Andrew Sullivan (#1)
Re: 2-phase commit

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month. I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5. Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#12Mike Mascari
mascarm@mascari.com
In reply to: Bruce Momjian (#11)
Re: 2-phase commit

Bruce Momjian wrote:

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month. I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5. Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

FWIW, Oracle 8's manual for the recovery of a distributed tx where the
coordinator never comes back on line is:

https://www.ifi.uni-klu.ac.at/Public/Documentation/oracle/product/8.0.3/doc/server803/A54643_01/ch_intro.htm#7783

"If a database must be recovered to a point in the past, Oracle's
recovery facilities allow database administrators at other sites to
return their databases to the earlier point in time also. This ensures
that the global database remains consistent."

So it seems, for Oracle 8 at least, PITR is the method of recovery for
cohorts after unrecoverable coordinator failure.

Ugly and yet probably a prerequisite.

Mike Mascari
mascarm@mascari.com

#13Bruce Momjian
bruce@momjian.us
In reply to: Mike Mascari (#12)
Re: 2-phase commit

Mike Mascari wrote:

Bruce Momjian wrote:

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month. I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5. Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

FWIW, Oracle 8's manual for the recovery of a distributed tx where the
coordinator never comes back on line is:

https://www.ifi.uni-klu.ac.at/Public/Documentation/oracle/product/8.0.3/doc/server803/A54643_01/ch_intro.htm#7783

"If a database must be recovered to a point in the past, Oracle's
recovery facilities allow database administrators at other sites to
return their databases to the earlier point in time also. This ensures
that the global database remains consistent."

So it seems, for Oracle 8 at least, PITR is the method of recovery for
cohorts after unrecoverable coordinator failure.

Yep, I assume PITR would be the solution for most failure cases --- very
ugly of course.

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#11)
Re: 2-phase commit

Bruce Momjian <pgman@candle.pha.pa.us> writes:

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

No. The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties. In
multi-master, if you fail you know it before you have told the client
his data is committed.

regards, tom lane

#15Jeroen T. Vermeulen
In reply to: Bruce Momjian (#13)
Re: 2-phase commit

On Tue, Sep 09, 2003 at 08:38:41PM -0400, Bruce Momjian wrote:

Yep, I assume PITR would be the solution for most failure cases --- very
ugly of course.

Anything can be broken in some way, if bad luck is willing to work hard
enough. In at least one, ah, competing company I know of, employees are
allowed by the legal people to say "assured" but not "guaranteed" for
precisely this reason.

First thing is an acceptable failure mode, then you try to narrow its
chances of occurring. And if worst comes to worst, one example of an
acceptable failure mode is "when in danger or doubt, run in circles,
scream and shout."

Jeroen

#16Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Jeroen T. Vermeulen (#15)
Re: 2-phase commit

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

No. The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties. In
multi-master, if you fail you know it before you have told the client
his data is committed.

Hmm ? The appl cannot take the first phase commit as its commit info. It
needs to wait for the second phase commit. The second phase is only finished
when all coservers have reported back. 2PC is synchronous.

The problem with 2PC arises when, after the second-phase commit has been sent
to all servers and before all have reported back, one of them becomes
unreachable or goes down (did it receive and perform the second commit or not?).
Such a transaction must stay open until the coserver is reachable again or an
administrator has committed or aborted it.

It is multi master replication that usually has an asynchronous mode for
performance, and there the trouble starts.

Andreas
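
A minimal Python sketch of the ordering described above, added as an
illustration only. The names Participant, Outcome, two_phase_commit and
drop_after_prepare are hypothetical and do not come from any patch discussed
in this thread. It assumes the application is told "committed" only once every
co-server has acknowledged the second phase; anything less is reported as in
doubt.

```python
from enum import Enum


class Outcome(Enum):
    COMMITTED = "committed"
    ABORTED = "aborted"
    IN_DOUBT = "in doubt"


class Participant:
    """Hypothetical co-server; `reachable` simulates the network link."""

    def __init__(self, name, can_commit=True, drop_after_prepare=False):
        self.name = name
        self.can_commit = can_commit
        self.drop_after_prepare = drop_after_prepare
        self.reachable = True
        self.state = "active"

    def prepare(self):
        # Phase 1: vote. A "yes" vote is a promise to commit later.
        if not self.reachable:
            return None
        self.state = "prepared" if self.can_commit else "aborted"
        if self.drop_after_prepare:
            self.reachable = False  # simulate losing the link after voting
        return self.can_commit

    def commit(self):
        # Phase 2: True is the acknowledgement, None means no answer.
        if not self.reachable:
            return None
        self.state = "committed"
        return True

    def abort(self):
        if self.reachable and self.state != "committed":
            self.state = "aborted"


def two_phase_commit(participants):
    """Return the outcome the application is allowed to see."""
    votes = [p.prepare() for p in participants]
    if not all(votes):  # a "no" vote or no answer: nobody has committed yet
        for p in participants:
            p.abort()
        return Outcome.ABORTED
    acks = [p.commit() for p in participants]
    # Only a complete set of phase-2 acknowledgements counts as a commit.
    return Outcome.COMMITTED if all(acks) else Outcome.IN_DOUBT
```

With one participant created as Participant("f", drop_after_prepare=True),
two_phase_commit returns Outcome.IN_DOUBT: the other server has committed, "f"
is still prepared, and the transaction stays open until "f" is reachable again
or an administrator resolves it.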

#17Bruce Momjian
bruce@momjian.us
In reply to: Zeugswetter Andreas SB SD (#16)
Re: 2-phase commit

Zeugswetter Andreas SB SD wrote:

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

No. The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties. In
multi-master, if you fail you know it before you have told the client
his data is committed.

Hmm ? The appl cannot take the first phase commit as its commit info. It
needs to wait for the second phase commit. The second phase is only finished
when all coservers have reported back. 2PC is synchronous.

The problem with 2PC arises when, after the second-phase commit has been sent
to all servers and before all have reported back, one of them becomes
unreachable or goes down (did it receive and perform the second commit or not?).
Such a transaction must stay open until the coserver is reachable again or an
administrator has committed or aborted it.

It is multi master replication that usually has an asynchronous mode for
performance, and there the trouble starts.

Let me diagram this so we can see the issues. Normal operation is:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        commit done--->
                                <--OK
        completed

One possible failure is:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        commit done--->
                                dies here
        stuck waiting

Another possible failure is:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        dies here
                                stuck waiting

Are these the issues? Can't we just add GUC timeouts to cause the
commit to fail, and the slave to stop waiting? I suppose a problem is:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        sleep
                                stuck waiting, times out
        commit done

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.
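
The last diagram can be replayed as a small Python sketch. This is an
illustration only: the function name run and its parameters master_sleep and
slave_timeout are hypothetical, the clock is a logical one with arbitrary
units, and it assumes the slave aborts on its own when its timer expires.

```python
def run(master_sleep, slave_timeout):
    """Replay the last diagram on a logical clock (units are arbitrary).

    master_sleep:  how long the master stalls after receiving the slave's OK.
    slave_timeout: how long the slave waits for "commit done" before giving
                   up (a hypothetical GUC-style setting); None = wait forever.
    """
    # t=0: "commit ready" was sent, the slave answered OK and is now prepared.
    slave_state = "prepared"

    # The slave's timer fires first if the master stalls for too long.
    if slave_timeout is not None and master_sleep > slave_timeout:
        slave_state = "aborted"  # unilateral abort breaks the earlier promise

    # The master wakes up; it holds an OK from the slave, so it commits.
    master_state = "committed"
    if slave_state == "prepared":
        slave_state = "committed"  # "commit done" arrived while still waiting

    return master_state, slave_state
```

Under these assumptions, run(10, 5) leaves the master committed and the slave
aborted, which is the inconsistency the diagram points at. run(10, None) stays
consistent, at the price of a slave that may wait indefinitely.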

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#17)
Re: 2-phase commit

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

You're not considering the possibility of a transient communication
failure. The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        commit done->XX

where "->XX" means the message gets lost due to network failure. Now
what? The slave cannot abort; he promised he could commit, and he does
not know whether the master has committed or not. The master does not
know the slave's state either; maybe he got the second message, and
maybe he didn't. Both sides are forced to keep information about the
open transaction indefinitely. Timing out on either side could yield
the wrong result.

regards, tom lane
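
A minimal Python sketch of the slave's side of this example, for illustration
only. The class Slave and its method names are hypothetical, and the list
called log merely stands in for durable storage. It assumes the slave records
each state change before replying, and that once it has answered OK a timeout
is not allowed to turn into an abort.

```python
class Slave:
    """Hypothetical participant that logs each state change before replying."""

    def __init__(self):
        self.log = []  # stands in for durable storage
        self.state = "active"

    def _record(self, state):
        self.log.append(state)
        self.state = state

    def on_commit_ready(self):
        self._record("prepared")
        return "OK"

    def on_commit_done(self):
        if self.state == "prepared":
            self._record("committed")
        return "OK"

    def on_timeout(self):
        # Before voting, giving up is safe. After voting OK it is not: the
        # master may already have committed, so the slave can only keep waiting.
        if self.state == "active":
            self._record("aborted")
        return self.state

    def recover(self):
        # After a restart the last logged state is restored, not discarded.
        self.state = self.log[-1] if self.log else "active"
        return self.state
```

If "commit done" is lost after on_commit_ready(), both on_timeout() and
recover() leave the slave in "prepared". That is the information about the
open transaction which has to be kept until the master's decision finally
arrives.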

#19Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#18)
Re: 2-phase commit

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

You're not considering the possibility of a transient communication
failure. The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        commit done->XX

where "->XX" means the message gets lost due to network failure. Now
what? The slave cannot abort; he promised he could commit, and he does
not know whether the master has committed or not. The master does not
know the slave's state either; maybe he got the second message, and
maybe he didn't. Both sides are forced to keep information about the
open transaction indefinitely. Timing out on either side could yield
the wrong result.

Can't the master re-send the request after a timeout?

#20The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#18)
Re: 2-phase commit

On Fri, 26 Sep 2003, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

You're not considering the possibility of a transient communication
failure. The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

        Master                  Slave
        ------                  -----
        commit ready-->
                                <--OK
        commit done->XX

where "->XX" means the message gets lost due to network failure. Now

'k, but isn't a lot of that a "retry" issue? we're talking TCP here, not
UDP, which I *thought* was designed for transient network problems ... ?
I would think that any implementation would have a timeout/retry GUC
variable associated with it ... 'if no answer in x seconds, retry up to y
times' ...

if we are talking two computers sitting next to each other on a switch,
you'd expect those to be low ... but if you were talking about two
separate geographical locations (and yes, I realize you are adding lag to
the mix with waiting for responses), you'd expect those #s to rise ...
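
The "if no answer in x seconds, retry up to y times" idea might look roughly
like the Python sketch below. It is an illustration only: send_with_retry,
FlakyLink and Receiver are hypothetical names, the lossy link is simulated
rather than a real socket, and the timeout value is passed along but never
actually waited on. It also assumes the receiving side treats a repeated
"commit done" as harmless.

```python
def send_with_retry(send, retries, timeout):
    """Call send(timeout) until it returns a reply or retries run out.

    `send` is a hypothetical callable returning the reply, or None when no
    answer arrived within `timeout` seconds. `retries` and `timeout` stand in
    for the "retry up to y times" and "x seconds" settings.
    """
    for attempt in range(1, retries + 1):
        reply = send(timeout)
        if reply is not None:
            return reply, attempt
    return None, retries


class FlakyLink:
    """Drops the first `drops` messages, then delivers to the handler."""

    def __init__(self, drops, handler):
        self.drops = drops
        self.handler = handler

    def __call__(self, timeout):
        if self.drops > 0:
            self.drops -= 1
            return None  # message lost; the caller sees a timeout
        return self.handler()


class Receiver:
    """Slave side: a duplicate "commit done" is acknowledged, not re-applied."""

    def __init__(self):
        self.committed = False
        self.commits_applied = 0

    def commit_done(self):
        if not self.committed:
            self.committed = True
            self.commits_applied += 1
        return "OK"
```

Retrying narrows the window but does not close it. When the retries are used
up, send_with_retry returns (None, retries) and the sender still cannot tell
whether the other side is down or merely unreachable.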

cbbrowne@acm.org
In reply to: Dann Corbit (#85)
#88Dann Corbit
DCorbit@connx.com
In reply to: Chris Browne (#87)
#89Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Bruce Momjian (#40)
#90Bruce Momjian
bruce@momjian.us
In reply to: Andrew Sullivan (#74)
#91Andrew Sullivan
andrew@libertyrms.info
In reply to: Bruce Momjian (#90)
#92Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Sullivan (#91)
#93Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#92)
#94Andrew Sullivan
andrew@libertyrms.info
In reply to: Peter Eisentraut (#92)
#95Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#93)
#96Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Peter Eisentraut (#95)
#97Mike Mascari
mascarm@mascari.com
In reply to: Bruce Momjian (#93)
#98Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#95)
#99Rod Taylor
rbt@rbt.ca
In reply to: Peter Eisentraut (#95)
#100Andrew Sullivan
andrew@libertyrms.info
In reply to: Mike Mascari (#97)
#101Robert Treat
xzilla@users.sourceforge.net
In reply to: Andrew Sullivan (#100)
#102Andrew Sullivan
andrew@libertyrms.info
In reply to: Robert Treat (#101)
#103Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Andrew Sullivan (#102)
#104Bruce Momjian
bruce@momjian.us
In reply to: Tatsuo Ishii (#103)
#105The Hermit Hacker
scrappy@hub.org
In reply to: Tatsuo Ishii (#103)
#106Chris Browne
cbbrowne@acm.org
In reply to: Andrew Sullivan (#100)
#107Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Andrew Sullivan (#100)
#108Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Bruce Momjian (#93)
#109Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Bruce Momjian (#104)
#110Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Heikki Linnakangas (#109)
#111Andrew Sullivan
andrew@libertyrms.info
In reply to: Tatsuo Ishii (#103)
#112Andrew Sullivan
andrew@libertyrms.info
In reply to: Chris Browne (#106)
#113Satoshi Nagayasu
pgsql@snaga.org
In reply to: Andrew Sullivan (#111)
#114Dann Corbit
DCorbit@connx.com
In reply to: Satoshi Nagayasu (#113)
#115Chris Browne
cbbrowne@acm.org
In reply to: Dann Corbit (#114)
#116Dann Corbit
DCorbit@connx.com
In reply to: Chris Browne (#115)
#117Dann Corbit
DCorbit@connx.com
In reply to: Dann Corbit (#116)
#118Jeroen T. Vermeulen
In reply to: Dann Corbit (#116)
#119Dann Corbit
DCorbit@connx.com
In reply to: Jeroen T. Vermeulen (#118)
#120Rod Taylor
rbt@rbt.ca
In reply to: Dann Corbit (#119)
#121Jordan Henderson
jordan_henders@yahoo.com
In reply to: Rod Taylor (#120)
#122Jan Wieck
JanWieck@Yahoo.com
In reply to: Bruce Momjian (#104)
#123Peter Galbavy
peter.galbavy@knowtion.net
In reply to: Bruce Momjian (#104)
#124Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#109)
#125Bruce Momjian
bruce@momjian.us
In reply to: Satoshi Nagayasu (#113)
#126Satoshi Nagayasu
pgsql@snaga.org
In reply to: Bruce Momjian (#125)
#127Bruce Momjian
bruce@momjian.us
In reply to: Satoshi Nagayasu (#126)
#128Rob Butler
robert.butler5@verizon.net
In reply to: Tatsuo Ishii (#103)