PG wire protocol question
Hi,
it has been a long time since I last read this list or wrote to it.
Now, I have a question. This blog post was written about 3 years ago:
https://aphyr.com/posts/282-jepsen-postgres
Basically, it talks about the client AND the server as a system
and if the network is cut between sending COMMIT and
receiving the answer for it, the client has no way to know
whether the transaction was actually committed.
The client connection may simply time out, and a reconnect would
give it a new connection, but it cannot pick up its old connection
where it left off. So it cannot really know whether the old transaction
was committed or not, at least not without running possibly expensive queries first.
Has anything changed on that front?
There is a 10.0 debate on -hackers. If this problem posed by
the above article is not fixed yet and needs a new wire protocol
to get it fixed, 10.0 would be justified.
Thanks in advance,
Zoltán Böszörményi
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Boszormenyi Zoltan wrote:
> it was a long time I have read this list or written to it.
> Now, I have a question. This blog post was written about 3 years ago:
> https://aphyr.com/posts/282-jepsen-postgres
>
> Basically, it talks about the client AND the server as a system
> and if the network is cut between sending COMMIT and
> receiving the answer for it, the client has no way to know
> whether the transaction was actually committed.
>
> The client connection may just timeout and a reconnect would
> give it a new connection but it cannot pick up its old connection
> where it left. So it cannot really know whether the old transaction
> was committed or not, possibly without doing expensive queries first.
>
> Has anything changed on that front?
That blog post seems ill-informed - that has nothing to do with
two-phase commit.
The problem - that the server may commit a transaction, but the client
never receives the server's response - is independent of whether
two-phase commit is used or not.
This is not a PostgreSQL problem; it is a generic problem of communication.
What would be the alternative?
That the server has to wait for the client to receive the commit response?
But what if the client receives the message and the server or the network
goes down before the server learns of the fact?
You see that this would lead to an infinite regress.
Yours,
Laurenz Albe
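The regress described above can be made concrete with a small sketch. It is a toy model, not anything from PostgreSQL or its wire protocol: `run_exchange` and `last_sender` are hypothetical helpers that simulate a chain of alternating messages (commit response, ack, ack of the ack, and so on). However long the chain is, whoever sends the final message observes exactly the same thing whether that message arrives or is lost.

```python
def run_exchange(n_messages, drop_from=None):
    """Simulate n alternating messages, starting with the server's commit
    response. Messages with index >= drop_from are lost.
    Returns the message indices each party actually received."""
    received = {"server": [], "client": []}
    for i in range(n_messages):
        receiver = "client" if i % 2 == 0 else "server"
        if drop_from is not None and i >= drop_from:
            break  # network is down; the sender gets no delivery notice
        received[receiver].append(i)
    return received


def last_sender(n_messages):
    """Party that sends the final message of an n-message exchange."""
    return "server" if (n_messages - 1) % 2 == 0 else "client"


# Adding more acknowledgement rounds only moves the uncertainty around:
# the final sender's view is identical with and without the loss.
for n in range(1, 6):
    who = last_sender(n)
    all_delivered = run_exchange(n)
    last_lost = run_exchange(n, drop_from=n - 1)
    assert all_delivered[who] == last_lost[who]
```

The loop checks chains of one to five messages; the same argument applies to any finite length, which is why extra acknowledgements cannot close the gap.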
On Tue, May 17, 2016 at 9:29 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> That blog post seems ill-informed - that has nothing to do with
> two-phase commit.
>
> The problem - that the server may commit a transaction, but the client
> never receives the server's response - is independent of whether
> two-phase commit is used or not.
The author addresses this in a comment within the linked page:
«The database may be consistent, but the system isn’t. A concurrent
request to the db will get the answer “yes, the transaction has
committed”, but the same request of the remote client gets “no, the
transaction has not yet committed.” The system may eventually become
consistent, if the partition is healed and the acknowledgement reaches
the client. But it isn’t consistent until that point.
And the client can’t just wait indefinitely for acknowledgement–the
commit request may not have reached the server, in which case the
client would deadlock forever. Not to mention practical concerns (a
customer and clerk aren’t going to wait very long for a credit card
transaction to complete). Introducing timeouts then causes the
temporary inconsistency to become permanent.»
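The cases in the quoted comment can be laid out in a toy model. This is only a sketch under stated assumptions: `ToyServer`, `Fault` and `client_commit` are hypothetical names, no real network or PostgreSQL connection is involved, and a lost message is modelled as the client seeing a timeout.

```python
import enum


class Fault(enum.Enum):
    NONE = "none"
    REQUEST_LOST = "commit request lost"
    REPLY_LOST = "commit reply lost"


class ToyServer:
    """Stand-in for a database backend; it only tracks one commit."""

    def __init__(self):
        self.committed = False

    def handle_commit(self):
        self.committed = True
        return "COMMIT"


def client_commit(server, fault):
    """Return what the client observes: the reply, or 'timeout'."""
    if fault is Fault.REQUEST_LOST:
        return "timeout"  # the server never saw the request
    reply = server.handle_commit()
    if fault is Fault.REPLY_LOST:
        return "timeout"  # the server committed, the reply vanished
    return reply


# Map each fault to (what the client saw, what the server actually did).
outcomes = {}
for fault in Fault:
    server = ToyServer()
    outcomes[fault] = (client_commit(server, fault), server.committed)
```

In this model both fault cases leave the client with the same observation, a timeout, while the server state differs. That is the ambiguity a client-side timeout cannot resolve on its own.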
On Sat, 14 May 2016 21:58:48 +0200, Boszormenyi Zoltan <zboszor@pr.hu>
wrote:
> Hi,
> it was a long time I have read this list or written to it.
> Now, I have a question. This blog post was written about 3 years ago:
> https://aphyr.com/posts/282-jepsen-postgres
>
> Basically, it talks about the client AND the server as a system
> and if the network is cut between sending COMMIT and
> receiving the answer for it, the client has no way to know
> whether the transaction was actually committed.
>
> The client connection may just timeout and a reconnect would
> give it a new connection but it cannot pick up its old connection
> where it left. So it cannot really know whether the old transaction
> was committed or not, possibly without doing expensive queries first.
>
> Has anything changed on that front?
> There is a 10.0 debate on -hackers. If this problem posed by
> the above article is not fixed yet and needs a new wire protocol
> to get it fixed, 10.0 would be justified.
It isn't going to be fixed ... it is a basic *unsolvable* problem in
communication theory that affects coordination in any distributed
system. For a simple explanation, see
https://en.wikipedia.org/wiki/Two_Generals'_Problem
> Thanks in advance,
> Zoltán Böszörményi
George
On 2016-05-17 at 15:29, Albe Laurenz wrote:
> Boszormenyi Zoltan wrote:
>> it was a long time I have read this list or written to it.
>> Now, I have a question. This blog post was written about 3 years ago:
>> https://aphyr.com/posts/282-jepsen-postgres
>>
>> Basically, it talks about the client AND the server as a system
>> and if the network is cut between sending COMMIT and
>> receiving the answer for it, the client has no way to know
>> whether the transaction was actually committed.
>>
>> The client connection may just timeout and a reconnect would
>> give it a new connection but it cannot pick up its old connection
>> where it left. So it cannot really know whether the old transaction
>> was committed or not, possibly without doing expensive queries first.
>>
>> Has anything changed on that front?
> That blog post seems ill-informed - that has nothing to do with
> two-phase commit.
In the blog post 2pc was mentioned related to the communication,
not as a transaction control inside the database. I wouldn't call
it misinformed. After all, terminology can mean different things
in different contexts.
> The problem - that the server may commit a transaction, but the client
> never receives the server's response - is independent of whether
> two-phase commit is used or not.
>
> This is not a problem of PostgreSQL, it is a generic problem of communication.
Indeed.
> What would be the alternative?
> That the server has to wait for the client to receive the commit response?
Not quite. That would mean constantly sending an ack that the other
side received the last ack, which would be silly.
If the network connection is cut, the client should be able to
reconnect to the old backend, query the last state, and continue
where it left off, maybe confirming via some key or UUID that it was
indeed the client that connected previously.
> But what if the client received the message and the server or the network
> go down before the server learns of the fact?
> You see that this would lead to an infinite regress.
>
> Yours,
> Laurenz Albe
On Wed, May 18, 2016 at 5:05 AM, Boszormenyi Zoltan <zboszor@pr.hu> wrote:
> On 2016-05-17 at 15:29, Albe Laurenz wrote:
>> Boszormenyi Zoltan wrote:
>>> it was a long time I have read this list or written to it.
>>> Now, I have a question. This blog post was written about 3 years ago:
>>> https://aphyr.com/posts/282-jepsen-postgres
>>>
>>> Basically, it talks about the client AND the server as a system
>>> and if the network is cut between sending COMMIT and
>>> receiving the answer for it, the client has no way to know
>>> whether the transaction was actually committed.
>>>
>>> The client connection may just timeout and a reconnect would
>>> give it a new connection but it cannot pick up its old connection
>>> where it left. So it cannot really know whether the old transaction
>>> was committed or not, possibly without doing expensive queries first.
>>>
>>> Has anything changed on that front?
>> That blog post seems ill-informed - that has nothing to do with
>> two-phase commit.
>
> Not quite. That would mean constantly sending an ack that the other
> received the last ack, which would be silly.
>
> If the network connection is cut, the client should be able to
> reconnect to the old backend and query the last state and continue
> where it left, maybe confirming via some key or UUID that it was
> indeed the client that connected previously.
I agree. It's the server's job to make sure that it is itself consistent.
If the client suspects that it may have lost the ack for whatever reason,
it needs to verify against the database that the transaction
succeeded. This is an application problem, not a protocol problem.
merlin