dblink: rollback transaction
Is there a way to rollback a dblink transaction?
Say, I delete some data from the remote database, but I don't want this
to be visible until the data is inserted in the current database. And
if the insertion of data in the current database throws an error, I want
to roll back the dblink transaction, which should restore the data in its
original remote location.
Thanks.
Oleg
*************************************
This e-mail may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network.
*************************************
Oleg Lebedev said:
Is there a way to rollback a dblink transaction?
Say, I delete some data from the remote database, but I don't want this
to be visible until the data is inserted in the current database. And
if the insertion of data in the current database throws an error, I want
to roll back the dblink transaction, which should restore the data in its
original remote location.
What about doing the insert first, and then issuing the delete via the
dblink? If the insert fails and the transaction rolls back then the
deletion in the remote database will never have been done.
John Sidney-Woollett
John,
The example I provided was for illustrative purposes only :) The
problem that I am trying to solve is more complex. Basically, I am
trying to propagate data from remote tables and install it in the
local tables. I do this operation in a loop as follows:
FOR j IN all_tables LOOP
1. Bring remote data from remote_tables[j] (using dblink)
2. Insert received data into local_tables[j]
3. Delete data from remote_tables[j] (using dblink)
END LOOP
Suppose I successfully ran the first loop iteration, but the second
iteration caused step 2 to throw a "duplicate key" exception. This
will cause the effects of both loop iterations to roll back. However,
only the local operations (step 2), not the remote operations (step 3),
are rolled back. This causes the data brought in and installed into the
first table to be deleted locally (i.e. step 2 of the first iteration is
rolled back), but not restored remotely (i.e. step 3 of the first
iteration is NOT rolled back). Therefore, I lose the data for the first
table completely, both in the local and remote locations.
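To make the failure mode concrete, the loop above might be sketched in PL/pgSQL roughly like this (all table names are invented for illustration; this is the pattern, not my actual code):

```sql
-- Hypothetical sketch of the copy-then-purge loop (names invented).
-- Each dblink_exec() against the remote side runs and commits in its
-- own remote transaction, so a local ROLLBACK cannot undo it.
create or replace function copy_and_purge() returns void as '
begin
    -- iteration 1, steps 1+2: fetch remote rows, insert locally
    insert into local_t1
        select * from dblink(''dbname=remote'',
                             ''select f1, f2 from remote_t1'')
            as t(f1 int, f2 text);
    -- iteration 1, step 3: purge remote copy (commits remotely at once)
    perform dblink_exec(''dbname=remote'', ''delete from remote_t1'');

    -- iteration 2, step 2: if this insert hits a duplicate key, the
    -- local transaction aborts -- but remote_t1 is already empty
    insert into local_t2
        select * from dblink(''dbname=remote'',
                             ''select f1, f2 from remote_t2'')
            as t(f1 int, f2 text);
    perform dblink_exec(''dbname=remote'', ''delete from remote_t2'');
end;
' language plpgsql;
```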
Is there any way to roll back a remote dblink Xaction? Does anyone have
a better solution for my problem?
Thanks.
Oleg
-----Original Message-----
From: John Sidney-Woollett [mailto:johnsw@wardbrook.com]
Sent: Thursday, February 05, 2004 12:49 AM
To: Oleg Lebedev
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] dblink: rollback transaction
Oleg Lebedev said:
Is there a way to rollback a dblink transaction?
Say, I delete some data from the remote database, but I don't want
this to be visible until the data is inserted in the current
database. And if the insertion of data in the current database throws
an error, I want to roll back the dblink transaction, which should
restore the data in its original remote location.
What about doing the insert first, and then issuing the delete via the
dblink? If the insert fails and the transaction rolls back then the
deletion in the remote database will never have been done.
John Sidney-Woollett
Oleg Lebedev said:
The example I provided was for illustrative purposes only :)
I figured!
Is there any way to roll back a remote dblink Xaction? Does anyone have
a better solution for my problem?
Don't know enough to say or guess, sorry!
I think PG badly needs nested transaction support... ;)
John Sidney-Woollett
John Sidney-Woollett wrote:
I think PG badly needs nested transaction support... ;)
I think that is a main take-away here. You should not try to depend on
dblink as a robust replication solution. Perhaps if postgres had
two-phase commit and nested transactions, but not at the moment.
That said, depending on how you are implementing the loop in your
pseudo-code, you might be able to get closer by using persistent dblink
connections, and starting a transaction on the remote side before
starting the local transaction and running your plpgsql function (or
whatever it is you're running). If the local transaction fails, send an
ABORT to the remote side before closing the connection. However I can't
offhand think of a way to do that in an automated fashion.
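At the psql level, the manual version of that flow looks roughly like this (connection name is arbitrary; detecting the local failure automatically is the missing piece):

```sql
-- Sketch: hold the remote transaction open across the local work,
-- and ABORT it by hand if anything local fails.
select dblink_connect('remoteconn', 'dbname=remote');
select dblink_exec('remoteconn', 'BEGIN');

begin;                                        -- local transaction
-- ... local inserts, plus remote deletes via dblink_exec ...
commit;                                       -- local work succeeded
select dblink_exec('remoteconn', 'COMMIT');   -- so commit remotely too

-- on local failure, instead:
--   rollback;
--   select dblink_exec('remoteconn', 'ABORT');

select dblink_disconnect('remoteconn');
```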
Joe
I think that is a main take-away here. You should not try to depend on
dblink as a robust replication solution. Perhaps if postgres had
two-phase commit and nested transactions, but not at the moment.
Agreed. I wonder if I should simulate local Xactions by using local
dblink calls?
What do you think, Joe?
That said, depending on how you are implementing the loop in your
pseudo-code, you might be able to get closer by using persistent dblink
connections, and starting a transaction on the remote side before
starting the local transaction and running your plpgsql function (or
whatever it is you're running). If the local transaction fails, send an
ABORT to the remote side before closing the connection. However I can't
offhand think of a way to do that in an automated fashion.
So, is it actually possible to use BEGIN; .. COMMIT; statement with
dblink?
Even if I start the remote Xaction before the local one starts, there is
no way for me to catch an exception thrown by the local Xaction. I don't
think Pl/PgSQL supports exceptions. So, if the local Xaction throws an
exception then the whole process terminates.
Ideas?
Thanks.
Oleg
Oleg Lebedev wrote:
Agreed. I wonder if I should simulate local Xactions by using local
dblink calls?
What do you think, Joe?
It is an interesting thought. Within a single plpgsql function, open
one local and one remote persistent, named dblink connection. Start a
transaction in each. Go into your loop. Here's the problem -- I don't
know how you can programmatically detect an error. Try playing with
dblink_exec for this. If you can detect an error condition, you can then
ABORT both transactions.
So, is it actually possible to use BEGIN; .. COMMIT; statement with
dblink?
Sure. Use a named persistent connection. Then issue a BEGIN just like
any other remote SQL statement (might be best to use dblink_exec with
this also).
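For instance, with a named connection already opened via dblink_connect('remoteconn', ...), the remote transaction is driven like any other remote statement:

```sql
select dblink_exec('remoteconn', 'BEGIN');
select dblink_exec('remoteconn', 'delete from foo where f1 = 1');
select dblink_exec('remoteconn', 'COMMIT');  -- or ABORT to roll back
```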
Even if I start the remote Xaction before the local one starts, there is
no way for me to catch an exception thrown by the local Xaction. I don't
think Pl/PgSQL supports exceptions. So, if the local Xaction throws an
exception then the whole process terminates.
Ideas?
[runs off to try a few things...]
I played with this a bit, and found that with some minor changes to
dblink_exec(), I can get the behavior we want, I think.
===============================================================
Here's the SQL:
===============================================================
\c remote
drop table foo;
create table foo(f1 int primary key, f2 text);
insert into foo values (1,'a');
insert into foo values (2,'b');
insert into foo values (3,'b');
\c local
drop table foo;
create table foo(f1 int primary key, f2 text);
--note this is missing on remote side
create unique index uindx1 on foo(f2);
create or replace function test() returns text as '
declare
res text;
tup record;
sql text;
begin
-- leaving out result checking for clarity
select into res dblink_connect(''localconn'',''dbname=local'');
select into res dblink_connect(''remoteconn'',''dbname=remote'');
select into res dblink_exec(''localconn'',''BEGIN'');
select into res dblink_exec(''remoteconn'',''BEGIN'');
for tup in select * from dblink(''remoteconn'',''select * from foo'')
as t(f1 int, f2 text) loop
sql := ''insert into foo values ('' || tup.f1::text || '','''''' ||
tup.f2 || '''''')'';
select into res dblink_exec(''localconn'',sql);
if res = ''ERROR'' then
select into res dblink_exec(''localconn'',''ABORT'');
select into res dblink_exec(''remoteconn'',''ABORT'');
select into res dblink_disconnect(''localconn'');
select into res dblink_disconnect(''remoteconn'');
return ''ERROR'';
else
sql := ''delete from foo where f1 = '' || tup.f1::text;
select into res dblink_exec(''remoteconn'',sql);
end if;
end loop;
select into res dblink_exec(''localconn'',''COMMIT'');
select into res dblink_exec(''remoteconn'',''COMMIT'');
select into res dblink_disconnect(''localconn'');
select into res dblink_disconnect(''remoteconn'');
return ''OK'';
end;
' language plpgsql;
===============================================================
Here's the test:
===============================================================
local=# select test();
NOTICE: sql error
DETAIL: ERROR: duplicate key violates unique constraint "uindx1"
CONTEXT: PL/pgSQL function "test" line 15 at select into variables
test
-------
ERROR
(1 row)
local=# select * from foo;
f1 | f2
----+----
(0 rows)
local=# select * from dblink('dbname=remote','select * from foo') as
t(f1 int, f2 text);
f1 | f2
----+----
1 | a
2 | b
3 | b
(3 rows)
local=# drop index uindx1;
DROP INDEX
local=# select test();
test
------
OK
(1 row)
local=# select * from foo;
f1 | f2
----+----
1 | a
2 | b
3 | b
(3 rows)
local=# select * from dblink('dbname=remote','select * from foo') as
t(f1 int, f2 text);
f1 | f2
----+----
(0 rows)
===============================================================
Patch attached. Thoughts?
Joe
Attachment: dblink-trans-test.patch (text/plain; +27, -17)
Joe,
Your fix is awesome! That's exactly what I need.
What version of postgres do I need to have installed to try this patch?
I am on 7.3 now.
Thanks.
Oleg
-----Original Message-----
From: Joe Conway [mailto:mail@joeconway.com]
Sent: Thursday, February 05, 2004 11:50 AM
To: Oleg Lebedev
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] dblink: rollback transaction
Oleg Lebedev wrote:
Your fix is awesome! That's exactly what I need.
What version of postgres do I need to have installed to try this patch?
I am on 7.3 now.
The patch is against 7.5devel, but it ought to apply against 7.4. I
can't remember if 7.3 supported named persistent connections
...[looks]... doesn't look like it, so you'll need 7.4 at least. I
haven't thought through the possible negative side-effects of this
change yet -- let me know how it goes for you if you try it.
Thanks,
Joe
Oleg Lebedev wrote:
Your fix is awesome! That's exactly what I need.
What version of postgres do I need to have installed to try this patch?
I am on 7.3 now.
BTW, in the last example I neglected to check for errors on the remote
side. For the mail archives, this one is more complete (but it still
probably needs more thought/error checking):
\c remote
drop table foo;
create table foo(f1 int primary key, f2 text);
insert into foo values (1,'a');
insert into foo values (2,'b');
insert into foo values (3,'b');
create table bar(f1 int primary key, f2 int references foo(f1));
insert into bar values (1,3);
\c local
drop table foo;
create table foo(f1 int primary key, f2 text);
--note this is missing on remote side
create unique index uindx1 on foo(f2);
create or replace function test() returns text as '
declare
res text;
tup record;
sql text;
begin
-- leaving out result checking for clarity
select into res dblink_connect(''localconn'',''dbname=local'');
select into res dblink_connect(''remoteconn'',''dbname=remote'');
select into res dblink_exec(''localconn'',''BEGIN'');
select into res dblink_exec(''remoteconn'',''BEGIN'');
for tup in select * from dblink(''remoteconn'',''select * from foo'')
as t(f1 int, f2 text) loop
sql := ''insert into foo values ('' || tup.f1::text || '','''''' ||
tup.f2 || '''''')'';
select into res dblink_exec(''localconn'',sql);
if res = ''ERROR'' then
select into res dblink_exec(''localconn'',''ABORT'');
select into res dblink_exec(''remoteconn'',''ABORT'');
select into res dblink_disconnect(''localconn'');
select into res dblink_disconnect(''remoteconn'');
return ''ERROR'';
else
sql := ''delete from foo where f1 = '' || tup.f1::text;
select into res dblink_exec(''remoteconn'',sql);
if res = ''ERROR'' then
select into res dblink_exec(''localconn'',''ABORT'');
select into res dblink_exec(''remoteconn'',''ABORT'');
select into res dblink_disconnect(''localconn'');
select into res dblink_disconnect(''remoteconn'');
return ''ERROR'';
end if;
end if;
end loop;
select into res dblink_exec(''localconn'',''COMMIT'');
select into res dblink_exec(''remoteconn'',''COMMIT'');
select into res dblink_disconnect(''localconn'');
select into res dblink_disconnect(''remoteconn'');
return ''OK'';
end;
' language plpgsql;
local=# select test();
NOTICE: sql error
DETAIL: ERROR: duplicate key violates unique constraint "uindx1"
CONTEXT: PL/pgSQL function "test" line 15 at select into variables
test
-------
ERROR
(1 row)
local=# select * from foo;
f1 | f2
----+----
(0 rows)
local=# select * from dblink('dbname=remote','select * from foo') as
t(f1 int, f2 text);
f1 | f2
----+----
1 | a
2 | b
3 | b
(3 rows)
local=# drop index uindx1;
DROP INDEX
local=# select test();
NOTICE: sql error
DETAIL: ERROR: update or delete on "foo" violates foreign key
constraint "$1" on "bar"
DETAIL: Key (f1)=(3) is still referenced from table "bar".
CONTEXT: PL/pgSQL function "test" line 24 at select into variables
test
-------
ERROR
(1 row)
local=# select * from foo;
f1 | f2
----+----
(0 rows)
local=# select * from dblink('dbname=remote','select * from foo') as
t(f1 int, f2 text);
f1 | f2
----+----
1 | a
2 | b
3 | b
(3 rows)
local=# \c remote
You are now connected to database "remote".
remote=# delete from bar;
DELETE 1
remote=# \c local
You are now connected to database "local".
local=# select test();
test
------
OK
(1 row)
local=# select * from foo;
f1 | f2
----+----
1 | a
2 | b
3 | b
(3 rows)
local=# select * from dblink('dbname=remote','select * from foo') as
t(f1 int, f2 text);
f1 | f2
----+----
(0 rows)
Requires the previously attached patch and Postgres >= 7.4
HTH,
Joe
Oleg Lebedev wrote:
Your fix is awesome! That's exactly what I need.
What version of postgres do I need to have installed to try this patch?
I am on 7.3 now.
I plan to apply the attached to cvs tip in 24 hours or so. I don't think
it qualifies as a bug fix, and it does represent a change in user facing
behavior, so I do not intend to apply for 7.3.6 or 7.4.2.
The patch changes dblink_exec() such that an ERROR on the remote side of
the connection will generate a NOTICE on the local side, with the
dblink_exec() return value set to 'ERROR'. This allows the remote side
ERROR to be trapped and handled locally.
One question that I'd like some feedback on is the following: should the
same change be applied to other functions that might throw an ERROR
based on the remote side of the connection? For example, currently if
dblink() is used in an attempt to access a non-existent remote table, an
ERROR is thrown locally in response, killing any currently open
transaction. Thoughts?
Thanks,
Joe
Attachment: dblink-trans-test.patch (text/plain; +27, -17)
Joe Conway <mail@joeconway.com> writes:
One question that I'd like some feedback on is the following: should the
same change be applied to other functions that might throw an ERROR
based on the remote side of the connection? For example, currently if
dblink() is used in an attempt to access a non-existent remote table, an
ERROR is thrown locally in response, killing any currently open
transaction. Thoughts?
It seems like a good idea to offer a consistent policy about this.
But I'm not sure that you should make a 180-degree change in error
handling without any backward-compatibility option. (In view of
recent discussions, the phrase "GUC variable" will not cross my
lips here.)
What seems like a good idea after a few moments' thought is to leave the
behavior of the various dblink_foo() functions the same as now (ie,
throw error on remote error) and add new API functions named something
like dblink_foo_noerror() that don't throw error but return a
recognizable failure code instead. My argument for this approach is
that there is no situation in which the programmer shouldn't have to
think when he writes a given call whether it will elog or return an
error indicator, because if he wants an error indicator then he is going
to have to check for it.
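Under that scheme, a caller might write something like the following (the function name and failure code are hypothetical, per the proposal above):

```sql
-- Hypothetical _noerror variant: returns a failure indicator instead
-- of raising, so the caller must check the result explicitly.
select dblink_exec_noerror('remoteconn', 'delete from foo');
-- plain dblink_exec() would instead raise a local ERROR on remote
-- failure, aborting any open local transaction.
```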
I'm not wedded to that in particular, but I do think you need to offer
a consistent approach with reasonable regard to backwards compatibility.
If you apply the patch as given you will surely be breaking existing
callers ... what's worse, the breakage is silent, and will only show up
under stress in the field.
regards, tom lane
Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
One question that I'd like some feedback on is the following: should the
same change be applied to other functions that might throw an ERROR
based on the remote side of the connection? For example, currently if
dblink() is used in an attempt to access a non-existent remote table, an
ERROR is thrown locally in response, killing any currently open
transaction. Thoughts?
What seems like a good idea after a few moments' thought is to leave the
behavior of the various dblink_foo() functions the same as now (ie,
throw error on remote error) and add new API functions named something
like dblink_foo_noerror() that don't throw error but return a
recognizable failure code instead. My argument for this approach is
that there is no situation in which the programmer shouldn't have to
think when he writes a given call whether it will elog or return an
error indicator, because if he wants an error indicator then he is going
to have to check for it.
I like the idea in general, but maybe instead there should be a new
overloaded version of the existing function names that accepts an
additional bool argument. Without the argument, behavior would be as it
is now; with it, you could specify the old or new behavior.
Joe
Joe Conway <mail@joeconway.com> writes:
I like the idea in general, but maybe instead there should be a new
overloaded version of the existing function names that accepts an
additional bool argument. Without the argument, behavior would be as it
is now; with it, you could specify the old or new behavior.
Um, maybe I'm confused about the context, but aren't we talking about C
function names here? No overloading is possible in C ...
regards, tom lane
Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
I like the idea in general, but maybe instead there should be a new
overloaded version of the existing function names that accepts an
additional bool argument. Without the argument, behavior would be as it
is now; with it, you could specify the old or new behavior.
Um, maybe I'm confused about the context, but aren't we talking about C
function names here? No overloading is possible in C ...
I was thinking in terms of overloaded SQL function names. For example,
in addition to dblink_exec(text) and dblink_exec(text,text) we create
dblink_exec(text,bool) and dblink_exec(text,text,bool).
Currently both SQL versions of dblink_exec are implemented by a single C
level function. But yes, we'd need another C level function to support
the new SQL functions because there would be no way to distinguish the
two two-argument versions otherwise. (Actually, now I'm wondering if we
could use a single C function for all four SQL versions -- between
PG_NARGS() and get_fn_expr_argtype() we should be able to figure out how
we were called, shouldn't we?)
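In CREATE FUNCTION terms, the additions might look roughly like this (a sketch following contrib-style conventions; not the actual patch):

```sql
-- Sketch: proposed overloads; the trailing bool would select
-- raise-on-remote-error (current behavior) versus returning a
-- failure indicator. Hypothetical, not the committed patch.
create or replace function dblink_exec(text, bool)
returns text
as 'MODULE_PATHNAME', 'dblink_exec'
language c strict;

create or replace function dblink_exec(text, text, bool)
returns text
as 'MODULE_PATHNAME', 'dblink_exec'
language c strict;
```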
Joe
Joe Conway wrote:
Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
I like the idea in general, but maybe instead there should be a new
overloaded version of the existing function names that accepts an
additional bool argument. Without the argument, behavior would be as
it is now; with it, you could specify the old or new behavior.
Um, maybe I'm confused about the context, but aren't we talking about C
function names here? No overloading is possible in C ...
I was thinking in terms of overloaded SQL function names. For example,
in addition to dblink_exec(text) and dblink_exec(text,text) we create
dblink_exec(text,bool) and dblink_exec(text,text,bool).
Currently both SQL versions of dblink_exec are implemented by a single C
level function. But yes, we'd need another C level function to support
the new SQL functions because there would be no way to distinguish the 2
two-argument versions otherwise. (Actually, now I'm wondering if we
could use a single C function for all four SQL versions -- between
PG_NARGS() and get_fn_expr_argtype() we should be able to figure out how
we were called, shouldn't we?)
The attached implements the new overloaded SQL functions as discussed
above (i.e. start with existing argument combinations, add a new bool
argument to each). I ended up with a single C function (by making use of
number and type of the arguments) for each overloaded SQL function name.
I'll commit in a day or two if there are no objections.
Thanks,
Joe