2-phase commit

Started by Andrew Sullivan | almost 23 years ago | 128 messages | hackers

andrew@libertyrms.info

almost 23 years ago

Hi,

As the 7.4 beta rolls on, I thought now would be a good time to start
talking about the future.

I have a potential need in the future for distributed transactions
(XA). To get that from Postgres, I'd need two-phase commit, I think.
There is someone working on such a project
(<http://snaga.org/pgsql/>), but last time it was discussed here, it
received a rather lukewarm reception (see, e.g., the thread starting
at
<http://archives.postgresql.org/pgsql-hackers/2003-06/msg00752.php>).

While at OSCON, I had a discussion with Joe Conway, Bruce Momjian,
and Greg Sabino Mullane about 2PC. Various people expressed various
opinions on the topic, but I think we agreed on the following. The
relevant folks can correct me if I'm wrong:

Two-phase commit has theoretical problems, but it is implemented in
several "enterprise" RDBMS. 2PC is something needed by certain kinds
of clients (especially those with transaction managers), so if
PostgreSQL doesn't have it, PostgreSQL just won't get supported in
that arena. Someone is already working on 2PC, but may feel unwanted
due to the reactions last heard on the topic, and may not continue
working unless he gets some support. What is a necessary condition
for such support is to get some idea of what compromises 2PC might
impose, and thereafter to try to determine which such compromises, if
any, are acceptable ones.

I think the idea here is that, while in most cases a "pretty-good"
implementation of a desirable feature might get included in the
source on the grounds that it can always be improved upon later,
something like 2PC has the potential to do great harm to an otherwise
reliable transaction manager. So the arguments about what to do need
to be aired in advance.

I (perhaps foolishly) volunteered to undertake to collect the
arguments in various directions, on the grounds that I can contribute
no code, but have skin made of asbestos. I thought I'd try to
collect some information about what people think the problems and
potentially acceptable compromises are, to see if there is some way
to understand what can and cannot be contemplated for 2PC. I'll
include in any such outline the remarks found in the -hackers thread
referenced above. Any objections?

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

Chris Browne

cbbrowne@acm.org

almost 23 years ago

In reply to: Andrew Sullivan (#1)

Re: 2-phase commit

In an attempt to throw the authorities off his trail, andrew@libertyrms.info (Andrew Sullivan) transmitted:

As the 7.4 beta rolls on, I thought now would be a good time to start
talking about the future.

I have a potential need in the future for distributed transactions
(XA). To get that from Postgres, I'd need two-phase commit, I think.
There is someone working on such a project
(<http://snaga.org/pgsql/>), but last time it was discussed here, it
received a rather lukewarm reception (see, e.g., the thread starting
at
<http://archives.postgresql.org/pgsql-hackers/2003-06/msg00752.php>).

Interesting/positive news on this front; the XA specification
documents are now all available in PDF form "freely", from the Open
Group, where they used to be fairly pricey.

<http://www.opengroup.org/publications/catalog/tp.htm>

Another notable XA documentation source is here...
<http://www.middleware.net/tuxedo/resources/XA_Documentation.html>

Two interesting implications of XA support would be that there could
be some "congruence of interests" that would arise regarding two
vendors:

- XA is essentially based on the API of BEA Tuxedo. I'm told they
include a simple database system with Tuxedo, but nothing particularly
wonderful. (Who thinks of BEA as a DBMS vendor???) They might have
interest in bundling something better...

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...
--
(format nil "~S@~S" "aa454" "freenet.carleton.ca")
http://www3.sympatico.ca/cbbrowne/tpmonitor.html
"In order to make an apple pie from scratch, you must first create the
universe." -- Carl Sagan, Cosmos

Joe Conway

mail@joeconway.com

almost 23 years ago

In reply to: Chris Browne (#2)

Re: [HACKERS] 2-phase commit

(moving to advocacy)

Christopher Browne wrote:

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

I wonder how we could get them to consider it...

Joe

Jeroen T. Vermeulen

jtv@xs4all.nl

almost 23 years ago

In reply to: Chris Browne (#2)

Re: 2-phase commit

On Tue, Aug 26, 2003 at 08:04:13PM -0400, Christopher Browne wrote:

Interesting/positive news on this front; the XA specification
documents are now all available in PDF form "freely", from the Open
Group, where they used to be fairly pricey.

A step in the right direction, but AFAIC it's too little, too late.
The impression I get, at least, is that it's as good as dead now: Java
may use it, but it hides the details anyway so it might as well not be
there--the Java way is to standardize the API but nothing that goes "on
the wire".

Lots of proprietary middleware uses XA, but from what I hear there are
enough subtle differences to make mixing-and-matching of products risky
at best--the proprietary way is to bundle products that will work at
least marginally together, and relegate standards to a bullshit point
in the PowerPoint presentations. "Based on industry standard" means
about the same as "based on a true story."

Then there's the fact that the necessary followup standards never got
anywhere, and the fact that XA doesn't cope with threading really well.

Don't get me wrong, XA support may well be a good thing. But at this
stage, personally I'd go for a good 2PC implementation first and worry
about supporting XA later.

Jeroen

Justin Clift

justin@postgresql.org

almost 23 years ago

In reply to: Joe Conway (#3)

Re: [HACKERS] 2-phase commit

Joe Conway wrote:

(moving to advocacy)

Christopher Browne wrote:

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

I wonder how we could get them to consider it...

Not a bad suggestion. Just went to their site and submitted a quick
brief of benefits/etc via their "Partner Proposal" page:

http://checkers.peoplesoft.com/allconn/ppp.nsf/PPP?OpenForm&Seq=2#_RefreshKW_type

I'm hoping they are read by People With A Clue, and that they in turn
will pass it on to the right group internally.

Worth a shot I guess.

:-)

Regards and best wishes,

Justin Clift


Chris Browne

cbbrowne@acm.org

almost 23 years ago

In reply to: Andrew Sullivan (#1)

Re: [HACKERS] 2-phase commit

After a long battle with technology, mail@joeconway.com (Joe Conway), an earthling, wrote:

(moving to advocacy)

Christopher Browne wrote:

- The main Tuxedo reseller that I am aware of is PeopleSoft, who use
it for their "high traffic" clients. Anyone that has seen news lately
knows that they and Oracle aren't exactly "best pals" these days;
having another DB option could be helpful to them...

That's an interesting observation, because I've long thought
PeopleSoft ought to support Postgres too. From what I recall, their
database schema is *very* database neutral (at least as of PSFT
version 7.x) and fairly simple (we ran it on MSSQL 6.5). It would
probably be pretty easily ported to run on Postgres.

I wonder how we could get them to consider it...

XA support so that it would "play well" with Tuxedo would be the best
thing I can think of. Arguing that they _should_ consider PostgreSQL
when it doesn't support their "scalability extender" wouldn't seem
likely to me to sell well.

That's _exactly_ why I mentioned both products; congruence of
interests...
--
select 'cbbrowne' || '@' || 'acm.org';
http://cbbrowne.com/info/internet.html
Consciousness - that annoying time between naps.

Chris Browne

cbbrowne@acm.org

almost 23 years ago

In reply to: Andrew Sullivan (#1)

Re: [HACKERS] 2-phase commit

After a long battle with technology, justin@postgresql.org (Justin Clift), an earthling, wrote:

Worth a shot I guess.

:-)

I'd think that they would take the idea more seriously if PostgreSQL
supported XA and thereby was compatible with Tuxedo. But it probably
doesn't hurt for them to hear the idea multiple times...
--
let name="aa454" and tld="freenet.carleton.ca" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/finances.html
Whatever you do don't mail me at pink-and-wobbly@asdkjlwelkj.com,
because then I'll know you're just an address-harvester, and blacklist
your IP until the end of time

Bruno Wolff III

bruno@wolff.to

almost 23 years ago

In reply to: Joe Conway (#3)

Re: [HACKERS] 2-phase commit

On Wed, Aug 27, 2003 at 22:46:58 -0700,
Joe Conway <mail@joeconway.com> wrote:

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

In my opinion it is too database agnostic. They pretty much just use the
DB as a file. From what I have seen of the system it is one big hack.

Their trusted client security model is ridiculous. Fortunately in
version 8 you don't have to let people run 2 tier except for developer
types. (Anyone with 2 tier access owns the system.) I really don't
even trust 3 tier access, because I believe that a fair amount of
security is enforced by the client rather than the app server.

It was annoying that the set of characters usable for passwords in 7.6
(a restriction that presumably still applies to the connect ID in 8)
was limited because they didn't want to quote the password string so
that you could have special characters in it.

They aren't big on using referential integrity to keep the data clean.

Joe Conway

mail@joeconway.com

almost 23 years ago

In reply to: Bruno Wolff III (#8)

Re: [HACKERS] 2-phase commit

Bruno Wolff III wrote:

On Wed, Aug 27, 2003 at 22:46:58 -0700,
Joe Conway <mail@joeconway.com> wrote:

That's an interesting observation, because I've long thought PeopleSoft
ought to support Postgres too. From what I recall, their database schema
is *very* database neutral (at least as of PSFT version 7.x) and fairly
simple (we ran it on MSSQL 6.5). It would probably be pretty easily
ported to run on Postgres.

In my opinion it is too database agnostic. They pretty much just use the
DB as a file. From what I have seen of the system it is one big hack.

Yeah, I didn't say I *liked* their schema, just that I thought it would
be easy for them to support Postgres ;-)

Like it or not, they are one of the larger ERP/CRM players (after the
merger with J.D. Edwards, they will be *ahead* of Oracle, second only to
SAP), and having them offer PostgreSQL support would be significant. If
the XA/Tuxedo thing is an issue, they could position it for mid-tier
customers who don't need the transaction manager anyway.

Joe

#10

Chris Browne

cbbrowne@acm.org

almost 23 years ago

In reply to: Andrew Sullivan (#1)

Re: [HACKERS] 2-phase commit

Oops! bruno@wolff.to (Bruno Wolff III) was seen spray-painting on a wall:

On Wed, Aug 27, 2003 at 22:46:58 -0700,
Joe Conway <mail@joeconway.com> wrote:

That's an interesting observation, because I've long thought
PeopleSoft ought to support Postgres too. From what I recall, their
database schema is *very* database neutral (at least as of PSFT
version 7.x) and fairly simple (we ran it on MSSQL 6.5). It would
probably be pretty easily ported to run on Postgres.

In my opinion it is too database agnostic. They pretty much just use
the DB as a file. From what I have seen of the system it is one big
hack.

Ah, so it's like the way SAP R/3's HR module works. (I expect I'm the
only one around that is more than passing familiar with "cluster
tables"; quite supremely nonrelational stuff, and quite
bletcherous...)

To a great extent this comes from the nature of the application. HR
is all about collecting together "documents," and these applications
replace "paper" with "pseudopaper."

They aren't big on using referential integrity to keep the data
clean.

Ditto for SAP R/3; "cleanliness" is, there, imposed by only using
their applications to do updates, which includes writing your software
to invoke their functions.
--
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/linuxxian.html
ASSEMBLER is a language. Any language that can take a half-dozen
keystrokes and compile it down to one byte of code is all right in my
books. Though for the REAL programmer, assembler is a waste of
time. Why use a compiler when you can code directly into memory
through a front panel.

#11

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Andrew Sullivan (#1)

Re: 2-phase commit

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month. I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5. Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

---------------------------------------------------------------------------

Andrew Sullivan wrote:

Hi,

As the 7.4 beta rolls on, I thought now would be a good time to start
talking about the future.

I have a potential need in the future for distributed transactions
(XA). To get that from Postgres, I'd need two-phase commit, I think.
There is someone working on such a project
(<http://snaga.org/pgsql/>), but last time it was discussed here, it
received a rather lukewarm reception (see, e.g., the thread starting
at
<http://archives.postgresql.org/pgsql-hackers/2003-06/msg00752.php>).

While at OSCON, I had a discussion with Joe Conway, Bruce Momjian,
and Greg Sabino Mullane about 2PC. Various people expressed various
opinions on the topic, but I think we agreed on the following. The
relevant folks can correct me if I'm wrong:

Two-phase commit has theoretical problems, but it is implemented in
several "enterprise" RDBMS. 2PC is something needed by certain kinds
of clients (especially those with transaction managers), so if
PostgreSQL doesn't have it, PostgreSQL just won't get supported in
that arena. Someone is already working on 2PC, but may feel unwanted
due to the reactions last heard on the topic, and may not continue
working unless he gets some support. What is a necessary condition
for such support is to get some idea of what compromises 2PC might
impose, and thereafter to try to determine which such compromises, if
any, are acceptable ones.

I think the idea here is that, while in most cases a "pretty-good"
implementation of a desirable feature might get included in the
source on the grounds that it can always be improved upon later,
something like 2PC has the potential to do great harm to an otherwise
reliable transaction manager. So the arguments about what to do need
to be aired in advance.

I (perhaps foolishly) volunteered to undertake to collect the
arguments in various directions, on the grounds that I can contribute
no code, but have skin made of asbestos. I thought I'd try to
collect some information about what people think the problems and
potentially acceptable compromises are, to see if there is some way
to understand what can and cannot be contemplated for 2PC. I'll
include in any such outline the remarks found in the -hackers thread
referenced above. Any objections?

A
-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
+1 416 646 3304 x110

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#12

Mike Mascari

mascarm@mascari.com

over 22 years ago

In reply to: Bruce Momjian (#11)

Re: 2-phase commit

Bruce Momjian wrote:

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month. I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5. Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

FWIW, Oracle 8's manual for the recovery of a distributed tx where the
coordinator never comes back on line is:

https://www.ifi.uni-klu.ac.at/Public/Documentation/oracle/product/8.0.3/doc/server803/A54643_01/ch_intro.htm#7783

"If a database must be recovered to a point in the past, Oracle's
recovery facilities allow database administrators at other sites to
return their databases to the earlier point in time also. This ensures
that the global database remains consistent."

So it seems, for Oracle 8 at least, PITR is the method of recovery for
cohorts after unrecoverable coordinator failure.

Ugly and yet probably a prerequisite.

Mike Mascari
mascarm@mascari.com

#13

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Mike Mascari (#12)

Re: 2-phase commit

Mike Mascari wrote:

Bruce Momjian wrote:

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month. I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5. Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

FWIW, Oracle 8's manual for the recovery of a distributed tx where the
coordinator never comes back on line is:

https://www.ifi.uni-klu.ac.at/Public/Documentation/oracle/product/8.0.3/doc/server803/A54643_01/ch_intro.htm#7783

"If a database must be recovered to a point in the past, Oracle's
recovery facilities allow database administrators at other sites to
return their databases to the earlier point in time also. This ensures
that the global database remains consistent."

So it seems, for Oracle 8 at least, PITR is the method of recovery for
cohorts after unrecoverable coordinator failure.

Yep, I assume PITR would be the solution for most failure cases --- very
ugly of course.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#14

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Bruce Momjian (#11)

Re: 2-phase commit

Bruce Momjian <pgman@candle.pha.pa.us> writes:

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

No. The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties. In
multi-master, if you fail you know it before you have told the client
his data is committed.

regards, tom lane

#15

Jeroen T. Vermeulen

jtv@xs4all.nl

over 22 years ago

In reply to: Bruce Momjian (#13)

Re: 2-phase commit

On Tue, Sep 09, 2003 at 08:38:41PM -0400, Bruce Momjian wrote:

Yep, I assume PITR would be the solution for most failure cases --- very
ugly of course.

Anything can be broken in some way, if bad luck is willing to work hard
enough. In at least one, ah, competing company I know of, employees are
allowed by the legal people to say "assured" but not "guaranteed" for
precisely this reason.

First thing is an acceptable failure mode, then you try to narrow its
chances of occurring. And if worst comes to worst, one example of an
acceptable failure mode is "when in danger or doubt, run in circles,
scream and shout."

Jeroen

#16

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Jeroen T. Vermeulen (#15)

Re: 2-phase commit

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

No. The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties. In
multi-master, if you fail you know it before you have told the client
his data is committed.

Hmm ? The appl cannot take the first phase commit as its commit info. It
needs to wait for the second phase commit. The second phase is only finished
when all coservers have reported back. 2PC is synchronous.

The problems with 2PC are when after second phase commit was sent to all
servers and before all report back one of them becomes unreachable/down ...
(did it receive and do the 2nd commit or not) Such a transaction must stay
open until the coserver is reachable again or an administrator committed/aborted it.

It is multi master replication that usually has an asynchronous mode for
performance, and there the trouble starts.

Andreas
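For readers who want the sequence spelled out, the following is a minimal sketch in Python of the synchronous behaviour described above: the application only hears "committed" after every server has acknowledged the second phase. It is not PostgreSQL code. The `Participant` class, its `can_commit` flag and the `two_phase_commit` function are hypothetical names invented for illustration, and the network is assumed to be perfectly reliable here.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PREPARED = "prepared"    # voted yes in phase one; may no longer abort on its own
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical in-memory stand-in for one server in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = State.ACTIVE

    def prepare(self):
        # Phase one: promise to commit if asked, or refuse and abort now.
        self.state = State.PREPARED if self.can_commit else State.ABORTED
        return self.state is State.PREPARED

    def commit(self):
        # Phase two: make the change permanent and acknowledge.
        self.state = State.COMMITTED
        return True

    def abort(self):
        self.state = State.ABORTED
        return True


def two_phase_commit(participants):
    """Return True only after every participant has acknowledged phase two."""
    # Build a list so that every participant is asked, even after a "no".
    votes = [p.prepare() for p in participants]
    if not all(votes):
        for p in participants:
            if p.state is not State.ABORTED:
                p.abort()
        return False
    acks = [p.commit() for p in participants]
    return all(acks)
```

The interesting cases in this thread are exactly the ones the sketch leaves out: a `commit()` call whose acknowledgement never arrives.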


#17

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Zeugswetter Andreas SB SD (#16)

Re: 2-phase commit

Zeugswetter Andreas SB SD wrote:

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable. However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

No. The real problem with 2PC in my mind is that its failure modes
occur *after* you have promised commit to one or more parties. In
multi-master, if you fail you know it before you have told the client
his data is committed.

Hmm ? The appl cannot take the first phase commit as its commit info. It
needs to wait for the second phase commit. The second phase is only finished
when all coservers have reported back. 2PC is synchronous.

The problems with 2PC are when after second phase commit was sent to all
servers and before all report back one of them becomes unreachable/down ...
(did it receive and do the 2nd commit or not) Such a transaction must stay
open until the coserver is reachable again or an administrator committed/aborted it.

It is multi master replication that usually has an asynchronous mode for
performance, and there the trouble starts.

Let me diagram this so we can see the issues. Normal operation is:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
commit done--->
                        <--OK
completed

One possible failure is:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
commit done--->
                        dies here
stuck waiting

Another possible failure is:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
dies here
                        stuck waiting

Are these the issues? Can't we just add GUC timeouts to cause the
commit to fail, and the slave to stop waiting? I suppose a problem is:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
sleep
                        stuck waiting, times out
commit done

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#18

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Bruce Momjian (#17)

Re: 2-phase commit

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

You're not considering the possibility of a transient communication
failure. The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
commit done->XX

where "->XX" means the message gets lost due to network failure. Now
what? The slave cannot abort; he promised he could commit, and he does
not know whether the master has committed or not. The master does not
know the slave's state either; maybe he got the second message, and
maybe he didn't. Both sides are forced to keep information about the
open transaction indefinitely. Timing out on either side could yield
the wrong result.

regards, tom lane
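The lost-message example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for the backend: `Coordinator`, `PreparedParticipant`, `resolve` and `give_up` are hypothetical names, and the network failure is modelled by a simple `reachable` flag. The point is that a slave that has answered OK can only keep the transaction open while the master is unreachable, and that timing out into an abort can contradict what the master decided.

```python
class Coordinator:
    """Hypothetical master: records its decision, may be unreachable."""

    def __init__(self):
        self.decisions = {}      # transaction id -> "commit" or "abort"
        self.reachable = True

    def decide(self, xid, decision):
        self.decisions[xid] = decision

    def ask(self, xid):
        if not self.reachable:
            raise ConnectionError("network failure")
        return self.decisions.get(xid)   # None if not decided yet


class PreparedParticipant:
    """Hypothetical slave that has already answered OK for some transactions."""

    def __init__(self):
        self.in_doubt = set()    # prepared, outcome unknown
        self.outcome = {}        # transaction id -> final outcome

    def prepare(self, xid):
        self.in_doubt.add(xid)

    def resolve(self, xid, coordinator):
        """Try to learn the outcome; keep the transaction open on failure."""
        try:
            decision = coordinator.ask(xid)
        except ConnectionError:
            return None          # still in doubt: neither commit nor abort is safe
        if decision is None:
            return None          # master reachable but has not decided yet
        self.in_doubt.discard(xid)
        self.outcome[xid] = decision
        return decision

    def give_up(self, xid):
        """The tempting timeout: abort unilaterally. Shown only to expose the risk."""
        self.in_doubt.discard(xid)
        self.outcome[xid] = "abort"
```

Calling `resolve` while `reachable` is false leaves the transaction in `in_doubt`. Calling `give_up` on a transaction the master has already committed leaves the two sides with different outcomes, which is the wrong result described above.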

#19

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Tom Lane (#18)

Re: 2-phase commit

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

You're not considering the possibility of a transient communication
failure. The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
commit done->XX

where "->XX" means the message gets lost due to network failure. Now
what? The slave cannot abort; he promised he could commit, and he does
not know whether the master has committed or not. The master does not
know the slave's state either; maybe he got the second message, and
maybe he didn't. Both sides are forced to keep information about the
open transaction indefinitely. Timing out on either side could yield
the wrong result.

Can't the master re-send the request after a timeout?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#20

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Tom Lane (#18)

Re: 2-phase commit

On Fri, 26 Sep 2003, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we allow slaves to check if the backend is still alive, perhaps by
asking the postmaster, similar to what we do with the cancel signal ---
that way, the slave would never time out and always wait if the master
was alive.

You're not considering the possibility of a transient communication
failure. The fact that you cannot currently contact the other guy
is not proof that he's not still alive.

Example:

Master                  Slave
------                  -----
commit ready-->
                        <--OK
commit done->XX

where "->XX" means the message gets lost due to network failure. Now

'k, but isn't a lot of that a "retry" issue? we're talking TCP here, not
UDP, which I *thought* was designed for transient network problems ... ?
I would think that any implementation would have a timeout/retry GUC
variable associated with it ... 'if no answer in x seconds, retry up to y
times' ...

if we are talking two computers sitting next to each other on a switch,
you'd expect those to be low ... but if you were talking about two
separate geographical locations (and yes, I realize you are adding lag to
the mix with waiting for responses), you'd expect those #s to rise ...
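A rough Python sketch of the 'retry up to y times' idea follows, with everything in it assumed for illustration: `FlakyLink`, `Receiver` and `send_commit_with_retry` are made-up names, no such GUC exists in the thread, and the 'x seconds' wait is reduced to a `TimeoutError` so the example runs instantly. It also assumes the receiving side treats a repeated "commit done" for the same transaction as harmless, since a retry after a lost acknowledgement delivers the message twice.

```python
class FlakyLink:
    """Hypothetical link that loses some requests and some acknowledgements."""

    def __init__(self, lost_requests, lost_acks):
        self.lost_requests = lost_requests
        self.lost_acks = lost_acks

    def send(self, receiver, xid):
        if self.lost_requests > 0:
            self.lost_requests -= 1
            raise TimeoutError("request lost, no answer within the timeout")
        reply = receiver.commit(xid)
        if self.lost_acks > 0:
            self.lost_acks -= 1
            raise TimeoutError("request delivered, acknowledgement lost")
        return reply


class Receiver:
    """Hypothetical slave side; commit is idempotent per transaction id."""

    def __init__(self):
        self.committed = set()
        self.calls = 0

    def commit(self, xid):
        # A repeated "commit done" for the same xid changes nothing.
        self.calls += 1
        self.committed.add(xid)
        return "OK"


def send_commit_with_retry(link, receiver, xid, max_retries):
    """Send "commit done"; on timeout, retry up to max_retries more times.

    Returns True once acknowledged. False means every attempt timed out,
    which tells the sender nothing about whether the receiver committed.
    """
    for _ in range(max_retries + 1):
        try:
            link.send(receiver, xid)
            return True
        except TimeoutError:
            continue
    return False
```

Retrying narrows the window but does not close it: when the function returns False, the sender still cannot tell a dead peer from a partitioned one.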

#21

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Tom Lane (#18)

#22

Patrick Welche

prlw1@newn.cam.ac.uk

over 22 years ago

In reply to: The Hermit Hacker (#21)

#23

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Patrick Welche (#22)

#24

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Bruce Momjian (#19)

#25

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Tom Lane (#24)

#26

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Tom Lane (#24)

#27

Chris Browne

cbbrowne@acm.org

over 22 years ago

In reply to: Patrick Welche (#22)

#28

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Tom Lane (#18)

#29

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Tom Lane (#24)

#30

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Chris Browne (#27)

#31

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Bruce Momjian (#23)

#32

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Andrew Sullivan (#28)

#33

Mike Mascari

mascarm@mascari.com

over 22 years ago

In reply to: The Hermit Hacker (#26)

#34

Gavin Sherry

swm@linuxworld.com.au

over 22 years ago

In reply to: Chris Browne (#27)

#35

Christopher Kings-Lynne

chriskl@familyhealth.com.au

over 22 years ago

In reply to: Tom Lane (#24)

#36

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Christopher Kings-Lynne (#35)

#37

Jeff Davis

pgsql@j-davis.com

over 22 years ago

In reply to: Christopher Kings-Lynne (#35)

#38

Richard Huxton

dev@archonet.com

over 22 years ago

In reply to: Tom Lane (#36)

#39

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Tom Lane (#36)

#40

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Richard Huxton (#38)

#41

Shridhar Daithankar

shridhar_daithankar@persistent.co.in

over 22 years ago

In reply to: Bruce Momjian (#40)

#42

Richard Huxton

dev@archonet.com

over 22 years ago

In reply to: Bruce Momjian (#40)

#43

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Bruce Momjian (#40)

#44

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: The Hermit Hacker (#43)

#45

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Tom Lane (#24)

#46

Kevin Brown

kevin@sysexperts.com

over 22 years ago

In reply to: Bruce Momjian (#44)

#47

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Kevin Brown (#46)

#48

Kevin Brown

kevin@sysexperts.com

over 22 years ago

In reply to: Bruce Momjian (#47)

#49

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Bruce Momjian (#47)

#50

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Hiroshi Inoue (#45)

#51

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Hiroshi Inoue (#50)

#52

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Hiroshi Inoue (#50)

#53

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Hiroshi Inoue (#45)

#54

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Hiroshi Inoue (#45)

#55

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Hiroshi Inoue (#45)

#56

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Hiroshi Inoue (#45)

#57

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Hiroshi Inoue (#56)

#58

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Hiroshi Inoue (#54)

#59

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: The Hermit Hacker (#58)

#60

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Tom Lane (#52)

#61

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Bruce Momjian (#59)

#62

Hiroshi Inoue

Inoue@tpf.co.jp

over 22 years ago

In reply to: Zeugswetter Andreas SB SD (#57)

#63

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Hiroshi Inoue (#62)

#64

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Zeugswetter Andreas SB SD (#63)

#65

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: The Hermit Hacker (#61)

#66

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Hiroshi Inoue (#54)

#67

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Tom Lane (#66)

#68

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Zeugswetter Andreas SB SD (#67)

#69

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Rod Taylor (#32)

#70

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Jeff Davis (#37)

#71

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Bruce Momjian (#65)

#72

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Kevin Brown (#48)

#73

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Tom Lane (#71)

#74

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: The Hermit Hacker (#39)

#75

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: The Hermit Hacker (#61)

#76

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Andrew Sullivan (#74)

#77

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Bruce Momjian (#76)

#78

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Andrew Sullivan (#69)

#79

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Tom Lane (#14)

#80

Manfred Spraul

manfred@colorfullife.com

over 22 years ago

In reply to: Peter Eisentraut (#79)

#81

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Manfred Spraul (#80)

#82

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Manfred Spraul (#80)

#83

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Peter Eisentraut (#82)

#84

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Bruce Momjian (#76)

#85

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Andrew Sullivan (#84)

#86

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Andrew Sullivan (#75)

#87

Chris Browne

cbbrowne@acm.org

over 22 years ago

In reply to: Dann Corbit (#85)

#88

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Chris Browne (#87)

#89

Hans-Jürgen Schönig

postgres@cybertec.at

over 22 years ago

In reply to: Bruce Momjian (#40)

#90

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Andrew Sullivan (#74)

#91

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Bruce Momjian (#90)

#92

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Andrew Sullivan (#91)

#93

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Peter Eisentraut (#92)

#94

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Peter Eisentraut (#92)

#95

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Bruce Momjian (#93)

#96

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Peter Eisentraut (#95)

#97

Mike Mascari

mascarm@mascari.com

over 22 years ago

In reply to: Bruce Momjian (#93)

#98

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Peter Eisentraut (#95)

#99

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Peter Eisentraut (#95)

#100

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Mike Mascari (#97)

#101

Robert Treat

xzilla@users.sourceforge.net

over 22 years ago

In reply to: Andrew Sullivan (#100)

#102

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Robert Treat (#101)

#103

Tatsuo Ishii

t-ishii@sra.co.jp

over 22 years ago

In reply to: Andrew Sullivan (#102)

#104

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Tatsuo Ishii (#103)

#105

The Hermit Hacker

scrappy@hub.org

over 22 years ago

In reply to: Tatsuo Ishii (#103)

#106

Chris Browne

cbbrowne@acm.org

over 22 years ago

In reply to: Andrew Sullivan (#100)

#107

Hans-Jürgen Schönig

postgres@cybertec.at

over 22 years ago

In reply to: Andrew Sullivan (#100)

#108

Hans-Jürgen Schönig

postgres@cybertec.at

over 22 years ago

In reply to: Bruce Momjian (#93)

#109

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 22 years ago

In reply to: Bruce Momjian (#104)

#110

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 22 years ago

In reply to: Heikki Linnakangas (#109)

#111

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Tatsuo Ishii (#103)

#112

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Chris Browne (#106)

#113

Satoshi Nagayasu

pgsql@snaga.org

over 22 years ago

In reply to: Andrew Sullivan (#111)

#114

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Satoshi Nagayasu (#113)

#115

Chris Browne

cbbrowne@acm.org

over 22 years ago

In reply to: Dann Corbit (#114)

#116

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Chris Browne (#115)

#117

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Dann Corbit (#116)

#118

Jeroen T. Vermeulen

jtv@xs4all.nl

over 22 years ago

In reply to: Dann Corbit (#116)

#119

Dann Corbit

DCorbit@connx.com

over 22 years ago

In reply to: Jeroen T. Vermeulen (#118)

#120

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Dann Corbit (#119)

#121

Jordan Henderson

jordan_henders@yahoo.com

over 22 years ago

In reply to: Rod Taylor (#120)

#122

Jan Wieck

JanWieck@Yahoo.com

over 22 years ago

In reply to: Bruce Momjian (#104)

#123

Peter Galbavy

peter.galbavy@knowtion.net

over 22 years ago

In reply to: Bruce Momjian (#104)

#124

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 22 years ago

In reply to: Heikki Linnakangas (#109)

#125

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Satoshi Nagayasu (#113)

#126

Satoshi Nagayasu

pgsql@snaga.org

over 22 years ago

In reply to: Bruce Momjian (#125)

#127

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Satoshi Nagayasu (#126)

#128

Rob Butler

robert.butler5@verizon.net

over 22 years ago

In reply to: Tatsuo Ishii (#103)