A 2 phase commit weirdness
Hackers,
I'm seeing the following weirdness with the 2PC patch:
alvherre=# begin;
BEGIN
alvherre=# create table a (a int);
CREATE TABLE
alvherre=# insert into a values (1);
INSERT 0 1
alvherre=# prepare transaction 'foo';
PREPARE TRANSACTION
alvherre=# select * from a;
At this point, the backend freezes. However, if I connect in another
session and issue the same "select * from a" query, it correctly returns
"no such relation". Now, because the backend cannot see the table
(because it was created by a transaction that is not yet committed), the
first backend shouldn't freeze but return the same "no such relation".
My guess is that the backend that created the prepared transaction has
its relcache populated with the new table's entry. But given that we
prepared the transaction, we should forget about the table, and only
remember it when we receive the shared inval message that will get sent
when the prepared transaction is committed.
I'm wondering what should happen at prepare time so that "my own cache"
is correct. Do I need to send the inval messages to me? Is this even
possible? Maybe I need to read the messages from the prepare file and
send it to me. Any ideas?
--
Alvaro Herrera (<alvherre[a]surnet.cl>)
Licensee shall have no right to use the Licensed Software
for productive or commercial use. (Licencia de StarOffice 6.0 beta)
Alvaro Herrera <alvherre@surnet.cl> writes:
I'm wondering what should happen at prepare time so that "my own cache"
is correct.
Good point. As far as the local caches are concerned, we probably have
to make it look like the transaction rolled back. I think Heikki
already had code in there to send the right inval messages when the
prepared transaction ultimately commits ... but we'll have to check that
that sequence does the right things ...
Do I need to send the inval messages to me? Is this even
possible?
inval.c is less than readable, isn't it :-( But yes, and yes.
regards, tom lane
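[Editorial sketch, not part of the original messages.] The behaviour discussed above can be illustrated with a toy model in Python. Everything in it is hypothetical: `Catalog`, `Backend`, `prepare` and `commit_prepared` are illustration-only names, not PostgreSQL functions, and the model does not reproduce the freeze itself. It assumes only what the messages state: the preparing backend holds a local cache entry for a table that is not yet committed, and the proposed fix is to make the local cache look as if the transaction had rolled back, with the invalidation replayed when the prepared transaction commits.

```python
class Catalog:
    """Stands in for the system catalogs: holds committed relations only."""
    def __init__(self):
        self.committed = set()


class Backend:
    """Toy backend with a private relation cache (hypothetical model)."""
    def __init__(self, catalog):
        self.catalog = catalog
        self.relcache = {}   # local cache: relation name -> entry
        self.pending = []    # relations created by the open transaction

    def create_table(self, name):
        # The creating transaction caches its own, still uncommitted, table.
        self.relcache[name] = "entry built inside the transaction"
        self.pending.append(name)

    def prepare(self, process_own_invals):
        # Assumption: at PREPARE the backend may or may not apply its own
        # invalidation messages to its local cache.
        prepared = list(self.pending)
        if process_own_invals:
            for name in prepared:
                self.relcache.pop(name, None)
        self.pending.clear()
        return prepared

    def lookup(self, name):
        if name in self.relcache:
            return "cache hit"
        if name in self.catalog.committed:
            self.relcache[name] = "entry rebuilt from catalog"
            return "loaded from catalog"
        return "no such relation"


def commit_prepared(catalog, prepared, backends):
    # On COMMIT of the prepared transaction the relations become visible
    # and every backend drops any cached entry for them.
    catalog.committed.update(prepared)
    for backend in backends:
        for name in prepared:
            backend.relcache.pop(name, None)
```

With `process_own_invals=False` the preparing backend still answers from its stale cache entry while a second backend reports "no such relation". With `process_own_invals=True` both backends agree until `commit_prepared` runs.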
On Thu, 26 May 2005, Tom Lane wrote:
Alvaro Herrera <alvherre@surnet.cl> writes:
I'm wondering what should happen at prepare time so that "my own cache"
is correct.
Good point. As far as the local caches are concerned, we probably have
to make it look like the transaction rolled back. I think Heikki
already had code in there to send the right inval messages when the
prepared transaction ultimately commits ... but we'll have to check that
that sequence does the right things ...
Looking at the sequence, at least the relcache init file stuff looks if
not broken at least a bit heavy-handed...
BTW: Is there a race condition in the relcache init file invalidation,
even without 2PC?
AtEOXact_Inval does basically this:
1. Unlink init file
2. Send inval messages
3. Unlink the init file again
Now consider this scenario:
backend A: Do updates that cause an init file invalidation
backend A: Commit begins
backend A: unlink init file
backend B starts and recreates init file
backend A: send inval message
backend C starts and reads the now stale init file
The window is admittedly very small, but it just caught my eye. Or am I
missing some lock etc?
- Heikki
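[Editorial sketch, not part of the original messages.] The interleaving above can be replayed step by step with a toy model in Python. All names are hypothetical, and the model assumes, as the scenario implies, that backend B rebuilds the init file from the pre-commit catalog state and that A's changes become visible at the "send inval messages" step. It does not model delivery of inval messages to running backends, so it shows only what each new backend starts from, not whether that stale start is later corrected.

```python
def start_backend(cluster):
    """Return the catalog state a newly started backend loads.

    If the init file is missing (None), the backend rebuilds it from the
    catalog state it can currently see; otherwise it reads the file as-is.
    """
    if cluster["init_file"] is None:
        cluster["init_file"] = cluster["catalog"]
    return cluster["init_file"]


def replay_interleaving():
    # Initial state: init file matches the committed catalog.
    cluster = {"catalog": "old", "init_file": "old"}

    # backend A: commit begins, step 1: unlink init file
    cluster["init_file"] = None

    # backend B starts and recreates the init file (pre-commit state)
    b_view = start_backend(cluster)

    # backend A: step 2: send inval messages
    # (assumed to be the point where A's changes become visible)
    cluster["catalog"] = "new"

    # backend C starts and reads the now stale init file
    c_view = start_backend(cluster)

    # backend A: step 3: unlink the init file again
    cluster["init_file"] = None

    # a hypothetical backend D starting later rebuilds a fresh file
    d_view = start_backend(cluster)

    return b_view, c_view, d_view
```

In this model C is the only backend that starts from a file older than the catalog, and only if it starts between steps 2 and 3.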
Heikki Linnakangas <hlinnaka@iki.fi> writes:
Looking at the sequence, at least the relcache init file stuff looks if
not broken at least a bit heavy-handed...
I was planning to change that ;-) ... using separate 2PC action records
for the relcache init file actions would make it much better.
Now consider this scenario:
backend A: Do updates that cause an init file invalidation
backend A: Commit begins
backend A: unlink init file
backend B starts and recreates init file
backend A: send inval message
backend C starts and reads the now stale init file
No problem, because C will receive A's inval messages after that.
regards, tom lane
On Fri, May 27, 2005 at 11:12:06AM -0400, Tom Lane wrote:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
Looking at the sequence, at least the relcache init file stuff looks if
not broken at least a bit heavy-handed...
I was planning to change that ;-) ... using separate 2PC action records
for the relcache init file actions would make it much better.
Hum, do you mean separate for 2PC only, or make'em completely separate
invalidation messages?
I fixed the problem I had -- it was very easy to make the messages get
processed locally. However, strangeness can still happen. Consider:
create table foo ();
begin;
drop table foo;
prepare transaction 'foo';
Now any backend that tries to access table foo will block, because the
'foo' prepared transaction has acquired a lock on it. However the table
is still visible in the catalogs, as it should be. A blocked backend can
easily be awakened by another backend doing
commit transaction 'foo';
But at awakening, the user will get this:
ERROR: relation 66002 deleted while still in use
This is ugly -- I don't think there is a way to get out of it.
Unrelated question: is it intended that the prepared transactions are
visible cross-database through pg_prepared_xacts?
--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"No single strategy is always right (Unless the boss says so)"
(Larry Wall)
Alvaro Herrera <alvherre@surnet.cl> writes:
But at awakening, the user will get this:
ERROR: relation 66002 deleted while still in use
This is ugly -- I don't think there is a way to get out of it.
There had better be a way, since (I suppose) the ERROR is preventing the
commit from succeeding ...
Unrelated question: is it intended that the prepared transactions are
visible cross-database through pg_prepared_xacts?
That is a good question. Can a backend running in a different database
execute the COMMIT (or ROLLBACK)? Offhand I'd bet that will not work,
which suggests we'd better make the view per-database. [ thinks a bit
more... ] We might be able to make it work, but there seems like a lot
of potential for bugs/fragility there. Might be best to take the narrow
definition to start with.
regards, tom lane
On Tue, May 31, 2005 at 02:09:56AM -0400, Tom Lane wrote:
Alvaro Herrera <alvherre@surnet.cl> writes:
But at awakening, the user will get this:
ERROR: relation 66002 deleted while still in use
This is ugly -- I don't think there is a way to get out of it.
There had better be a way, since (I suppose) the ERROR is preventing the
commit from succeeding ...
No, the ERROR is in a completely unrelated transaction. The scenario
again is this:
CREATE TABLE foo ();
BEGIN;
DROP TABLE foo;
PREPARE TRANSACTION 'foo';
SELECT * FROM foo;
-- hangs
COMMIT TRANSACTION 'foo';
ERROR, relation deleted while still in use
So it's a rather contorted situation to begin with.
Unrelated question: is it intended that the prepared transactions are
visible cross-database through pg_prepared_xacts?
That is a good question. Can a backend running in a different database
execute the COMMIT (or ROLLBACK)? Offhand I'd bet that will not work,
which suggests we'd better make the view per-database. [ thinks a bit
more... ] We might be able to make it work, but there seems like a lot
of potential for bugs/fragility there. Might be best to take the narrow
definition to start with.
Ok.
--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"El sentido de las cosas no viene de las cosas, sino de
las inteligencias que las aplican a sus problemas diarios
en busca del progreso." (Ernesto Hernández-Novich)
Alvaro Herrera <alvherre@surnet.cl> writes:
No, the ERROR is in a completely unrelated transaction. The scenario
again is this:
CREATE TABLE foo ();
BEGIN;
DROP TABLE foo;
PREPARE TRANSACTION 'foo';
SELECT * FROM foo;
-- hangs
COMMIT TRANSACTION 'foo';
ERROR, relation deleted while still in use
Oh. Well, you get that now without any use of PREPARE; it's not clear
what else we could do, except possibly make the message a bit more
user-friendly.
regards, tom lane
On Tue, May 31, 2005 at 10:44:58AM -0400, Tom Lane wrote:
Alvaro Herrera <alvherre@surnet.cl> writes:
No, the ERROR is in a completely unrelated transaction. The scenario
again is this:
CREATE TABLE foo ();
BEGIN;
DROP TABLE foo;
PREPARE TRANSACTION 'foo';
SELECT * FROM foo;
-- hangs
COMMIT TRANSACTION 'foo';
ERROR, relation deleted while still in use
Oh. Well, you get that now without any use of PREPARE; it's not clear
what else we could do, except possibly make the message a bit more
user-friendly.
Ah, you are right, sorry :-) I was imagining I had to cope with that
but evidently not.
--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Granting software the freedom to evolve guarantees only different results,
not better ones." (Zygo Blaxell)