BEFORE UPDATE trigger on postgres_fdw table not work
Hi,
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
Below are scenarios similar to postgres_fdw test to reproduce the issue.
postgres=# CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'postgres',port '5432');
postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
postgres=# create table loc1 (f1 serial, f2 text);
postgres=# create foreign table rem1 (f1 serial, f2 text)
postgres-# server loopback options(table_name 'loc1');
postgres=# CREATE FUNCTION trig_row_before_insupdate() RETURNS TRIGGER AS $$
postgres$# BEGIN
postgres$# NEW.f2 := NEW.f2 || ' triggered !';
postgres$# RETURN NEW;
postgres$# END
postgres$# $$ language plpgsql;
postgres=# CREATE TRIGGER trig_row_before_insupd BEFORE INSERT OR UPDATE ON rem1
postgres-# FOR EACH ROW EXECUTE PROCEDURE trig_row_before_insupdate();
-- insert trigger is OK
postgres=# INSERT INTO rem1 values(1, 'insert');
postgres=# SELECT * FROM rem1;
f1 | f2
----+--------------------
1 | insert triggered !
(1 row)
-- update trigger is OK if we update f2
postgres=# UPDATE rem1 set f2 = 'update';
postgres=# SELECT * FROM rem1;
f1 | f2
----+--------------------
1 | update triggered !
Without attached patch:
postgres=# UPDATE rem1 set f1 = 10;
postgres=# SELECT * FROM rem1;
f1 | f2
----+--------------------
10 | update triggered !
(1 row)
f2 should be updated by trigger, but not.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.
postgres=# EXPLAIN (verbose, costs off)
UPDATE rem1 set f1 = 10;
QUERY PLAN
---------------------------------------------------------------------
Update on public.rem1
Remote SQL: UPDATE public.loc1 SET f1 = $2 WHERE ctid = $1 <--- not set f2
-> Foreign Scan on public.rem1
Output: 10, f2, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)
With attached patch, f2 is updated by a trigger and "f2 = $3" is added to remote SQL
as follows.
postgres=# UPDATE rem1 set f1 = 10;
postgres=# select * from rem1;
f1 | f2
----+--------------------------------
10 | update triggered ! triggered !
(1 row)
postgres=# EXPLAIN (verbose, costs off)
postgres-# UPDATE rem1 set f1 = 10;
QUERY PLAN
-----------------------------------------------------------------------
Update on public.rem1
Remote SQL: UPDATE public.loc1 SET f1 = $2, f2 = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1
Output: 10, f2, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)
My patch adds all columns to a target list of remote update query
as in INSERT case if a before update trigger exists.
I tried to add only columns modified in trigger to the target list of
a remote update query, but I cannot find simple way to do that because
update query is built during planning phase at postgresPlanForeignModify
while it is difficult to decide which columns are modified by a trigger
until query execution.
Regards,
--
Shohei Mochizuki
TOSHIBA CORPORATION
Attachments:
fix_fdw_update_trigger.patchtext/plain; charset=UTF-8; name=fix_fdw_update_trigger.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e034b03..546d97b 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -6558,6 +6558,14 @@ SELECT * from loc1;
2 | skidoo triggered !
(2 rows)
+UPDATE rem1 set f1 = 10;
+SELECT * from loc1;
+ f1 | f2
+----+--------------------------------
+ 10 | skidoo triggered ! triggered !
+ 10 | skidoo triggered ! triggered !
+(2 rows)
+
DELETE FROM rem1;
-- Add a second trigger, to check that the changes are propagated correctly
-- from trigger to trigger
@@ -6670,7 +6678,7 @@ NOTICE: trig_row_after(23, skidoo) AFTER ROW INSERT ON rem1
NOTICE: NEW: (13,"test triggered !")
ctid
--------
- (0,27)
+ (0,29)
(1 row)
-- cleanup
@@ -6774,10 +6782,10 @@ BEFORE UPDATE ON rem1
FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
EXPLAIN (verbose, costs off)
UPDATE rem1 set f2 = ''; -- can't be pushed down
- QUERY PLAN
----------------------------------------------------------------------
+ QUERY PLAN
+-----------------------------------------------------------------------
Update on public.rem1
- Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+ Remote SQL: UPDATE public.loc1 SET f1 = $2, f2 = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1
Output: f1, ''::text, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
@@ -7404,12 +7412,12 @@ AFTER UPDATE OR DELETE ON bar2
FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
explain (verbose, costs off)
update bar set f2 = f2 + 100;
- QUERY PLAN
---------------------------------------------------------------------------------------
+ QUERY PLAN
+--------------------------------------------------------------------------------------------------------
Update on public.bar
Update on public.bar
Foreign Update on public.bar2
- Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2, f3
+ Remote SQL: UPDATE public.loct2 SET f1 = $2, f2 = $3, f3 = $4 WHERE ctid = $1 RETURNING f1, f2, f3
-> Seq Scan on public.bar
Output: bar.f1, (bar.f2 + 100), bar.ctid
-> Foreign Scan on public.bar2
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 2b6d885..6a1c7a4 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1685,13 +1685,19 @@ postgresPlanForeignModify(PlannerInfo *root,
rel = table_open(rte->relid, NoLock);
/*
- * In an INSERT, we transmit all columns that are defined in the foreign
- * table. In an UPDATE, we transmit only columns that were explicitly
- * targets of the UPDATE, so as to avoid unnecessary data transmission.
- * (We can't do that for INSERT since we would miss sending default values
- * for columns not listed in the source statement.)
- */
- if (operation == CMD_INSERT)
+ * In an INSERT and an UPDATE of a table with a before update trigger, we
+ * transmit all columns that are defined in the foreign table. In other
+ * UPDATE, we transmit only columns that were explicitly targets of the
+ * UPDATE, so as to avoid unnecessary data transmission. (We can't do that
+ * for INSERT since we would miss sending default values for columns not
+ * listed in the source statement and for an UPDATE of a table with a
+ * before update trigger since we would miss updating colums modified in a
+ * trigger but not listed in the souce statement. )
+ */
+ if (operation == CMD_INSERT ||
+ (operation == CMD_UPDATE &&
+ rel->trigdesc &&
+ rel->trigdesc->trig_update_before_row))
{
TupleDesc tupdesc = RelationGetDescr(rel);
int attnum;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 73852f1..3da0293 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1552,6 +1552,8 @@ UPDATE rem1 set f2 = '';
SELECT * from loc1;
UPDATE rem1 set f2 = 'skidoo' RETURNING f2;
SELECT * from loc1;
+UPDATE rem1 set f1 = 10;
+SELECT * from loc1;
DELETE FROM rem1;
Mochizuki-san,
On 2019/05/27 10:52, Shohei Mochizuki wrote:
Hi,
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.Without attached patch:
postgres=# UPDATE rem1 set f1 = 10;
postgres=# SELECT * FROM rem1;
f1 | f2
----+--------------------
10 | update triggered !
(1 row)f2 should be updated by trigger, but not.
Indeed. That seems like a bug to me.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.
Yeah. So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().
With attached patch, f2 is updated by a trigger and "f2 = $3" is added to
remote SQL
as follows.postgres=# UPDATE rem1 set f1 = 10;
postgres=# select * from rem1;
f1 | f2
----+--------------------------------
10 | update triggered ! triggered !
(1 row)postgres=# EXPLAIN (verbose, costs off)
postgres-# UPDATE rem1 set f1 = 10;
QUERY PLAN
-----------------------------------------------------------------------
Update on public.rem1
Remote SQL: UPDATE public.loc1 SET f1 = $2, f2 = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1
Output: 10, f2, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)My patch adds all columns to a target list of remote update query
as in INSERT case if a before update trigger exists.
Thanks for the patch. It seems to fix the problem as far as I can see.
I tried to add only columns modified in trigger to the target list of
a remote update query, but I cannot find simple way to do that because
update query is built during planning phase at postgresPlanForeignModify
while it is difficult to decide which columns are modified by a trigger
until query execution.
I think that the approach in your patch may be fine, but others may disagree.
We don't require row triggers' definition to declare which columns of the
input row it intends to modify. Without that information, the planner
can't determine the exact set of changed columns to transmit to the remote
server. So it's too early, for example, for PlanForeignModify() to
construct an optimal update query which transmits only the columns that
are changed, including those that may be modified by triggers. If the FDW
had delayed the construction of the exact update query to
ExecForeignUpdate(), we could build a more optimal update query, because
by then we will know *all* columns that have changed, including those that
are changed by BEFORE UPDATE row triggers if any. Maybe other FDWs beside
postgres_fdw do that already, so it's possible to rejigger postgres_fdw to
do that too. But considering that such rejiggering is only necessary for
efficiency, I'm not sure if others will agree to pursuing it, especially
if it requires too much code change. Also, in the worst case, we'll end
up generating new query for every row being changed, because the trigger
may change different columns for different rows based on some condition.
Thanks,
Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
On 2019/05/27 10:52, Shohei Mochizuki wrote:
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.
Yeah. So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().
... Also, in the worst case, we'll end
up generating new query for every row being changed, because the trigger
may change different columns for different rows based on some condition.
Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.
regards, tom lane
On 2019/05/27 22:02, Tom Lane wrote:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
On 2019/05/27 10:52, Shohei Mochizuki wrote:
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.Yeah. So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().... Also, in the worst case, we'll end
up generating new query for every row being changed, because the trigger
may change different columns for different rows based on some condition.Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.
The fetching side is fine, because rewriteTargetListUD() adds a
whole-row-var to the target list when the UPDATE / DELETE target is a
foreign table *and* there is a row trigger on the table. postgres_fdw
sees that and constructs the query to fetch all columns.
So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.
Thanks,
Amit
On 2019/05/28 12:54, Amit Langote wrote:
On 2019/05/27 22:02, Tom Lane wrote:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
On 2019/05/27 10:52, Shohei Mochizuki wrote:
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.Yeah. So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().... Also, in the worst case, we'll end
up generating new query for every row being changed, because the trigger
may change different columns for different rows based on some condition.Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.The fetching side is fine, because rewriteTargetListUD() adds a
whole-row-var to the target list when the UPDATE / DELETE target is a
foreign table *and* there is a row trigger on the table. postgres_fdw
sees that and constructs the query to fetch all columns.So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.
Amit-san, Tom,
Thanks for the comments.
I checked other scenario. If a foreign table has AFTER trigger, remote update
query must return all columns and these cases are added at deparseReturningList
and covered by following existing test cases.
EXPLAIN (verbose, costs off)
UPDATE rem1 set f2 = ''; -- can't be pushed down
QUERY PLAN
-------------------------------------------------------------------------------
Update on public.rem1
Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2
-> Foreign Scan on public.rem1
Output: f1, ''::text, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)
Regards,
--
Shohei Mochizuki
TOSHIBA CORPORATION
Mochizuki-san,
On 2019/05/28 13:10, Shohei Mochizuki wrote:
On 2019/05/28 12:54, Amit Langote wrote:
On 2019/05/27 22:02, Tom Lane wrote:
Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.The fetching side is fine, because rewriteTargetListUD() adds a
whole-row-var to the target list when the UPDATE / DELETE target is a
foreign table *and* there is a row trigger on the table. postgres_fdw
sees that and constructs the query to fetch all columns.So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.Amit-san, Tom,
Thanks for the comments.I checked other scenario. If a foreign table has AFTER trigger, remote update
query must return all columns and these cases are added at
deparseReturningList
and covered by following existing test cases.EXPLAIN (verbose, costs off)
UPDATE rem1 set f2 = ''; -- can't be pushed down
QUERY PLAN
-------------------------------------------------------------------------------Update on public.rem1
Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING
f1, f2
-> Foreign Scan on public.rem1
Output: f1, ''::text, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)
Ah, I had missed the AFTER triggers case, which seems to be working fine
as you've shown here.
Thanks,
Amit
Hi,
On Tue, May 28, 2019 at 12:54 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2019/05/27 22:02, Tom Lane wrote:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
On 2019/05/27 10:52, Shohei Mochizuki wrote:
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.Yeah. So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().
Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.
So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.
Will look into the patch after returning from PGCon, unless somebody wants to.
Best regards,
Etsuro Fujita
On Tue, May 28, 2019 at 3:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Tue, May 28, 2019 at 12:54 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2019/05/27 22:02, Tom Lane wrote:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
On 2019/05/27 10:52, Shohei Mochizuki wrote:
I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.Yeah. So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.
Yeah, I think so too, because in UPDATE, we fetch all columns from the
remote (even if the target table doesn't have relevant triggers).
Will look into the patch after returning from PGCon, unless somebody wants to.
I'll look into the patch more closely tomorrow. Sorry for the delay.
As I said in another email today, I felt a bit under the weather last
week.
Best regards,
Etsuro Fujita
Fujita-san,
Thanks for the comments.
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Tue, May 28, 2019 at 12:54 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2019/05/27 22:02, Tom Lane wrote:
Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.Yeah, I think so too, because in UPDATE, we fetch all columns from the
remote (even if the target table doesn't have relevant triggers).
Hmm, your parenthetical remark contradicts my observation. I can see
that not all columns are fetched if there are no triggers present.
create extension postgres_fdw ;
create server loopback foreign data wrapper postgres_fdw ;
create user mapping for current_user server loopback;
create table loc1 (a int, b int);
create foreign table rem1 (a int, b int generated always as (a+1)
stored) server loopback options (table_name 'loc1');
explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Output: 1, b, ctid
Remote SQL: SELECT b, ctid FROM public.loc1 FOR UPDATE
(5 rows)
whereas, all columns are fetched if a trigger is defined:
create or replace function trigfunc() returns trigger as $$ begin
raise notice '%', new; return new; end; $$ language plpgsql;
create trigger rem1_trig before insert or update on rem1 for each row
execute function trigfunc();
explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..147.23 rows=1241 width=46)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..147.23 rows=1241 width=46)
Output: 1, b, ctid, rem1.*
Remote SQL: SELECT a, b, ctid FROM public.loc1 FOR UPDATE
(5 rows)
Am I missing something?
Thanks,
Amit
I forgot to send this by "Reply ALL".
Show quoted text
On Tue, Jun 11, 2019 at 10:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Amit-san,
On Tue, Jun 11, 2019 at 10:30 AM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Tue, May 28, 2019 at 12:54 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2019/05/27 22:02, Tom Lane wrote:
Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.Yeah, I think so too, because in UPDATE, we fetch all columns from the
remote (even if the target table doesn't have relevant triggers).Hmm, your parenthetical remark contradicts my observation. I can see
that not all columns are fetched if there are no triggers present.create extension postgres_fdw ;
create server loopback foreign data wrapper postgres_fdw ;
create user mapping for current_user server loopback;
create table loc1 (a int, b int);
create foreign table rem1 (a int, b int generated always as (a+1)
stored) server loopback options (table_name 'loc1');explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Output: 1, b, ctid
Remote SQL: SELECT b, ctid FROM public.loc1 FOR UPDATE
(5 rows)Sorry, my explanation was not good; I should have said that in UPDATE,
we fetch columns not mentioned in the SQL query as well (even if the
target table doesn't have relevant triggers), so there would be no
hazard Tom mentioned above, IIUC.Best regards,
Etsuro Fujita
Import Notes
Reply to msg id not found: CAPmGK15TMX7HrQvrKmkJp_nMO8MVpUP8NMUC2OAMSFtUsRCNQ@mail.gmail.com
On Tue, Jun 11, 2019 at 10:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Tue, Jun 11, 2019 at 10:30 AM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Tue, May 28, 2019 at 12:54 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2019/05/27 22:02, Tom Lane wrote:
Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns? It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.Yeah, I think so too, because in UPDATE, we fetch all columns from the
remote (even if the target table doesn't have relevant triggers).Hmm, your parenthetical remark contradicts my observation. I can see
that not all columns are fetched if there are no triggers present.
[ ... ]
Sorry, my explanation was not good; I should have said that in UPDATE,
we fetch columns not mentioned in the SQL query as well (even if the
target table doesn't have relevant triggers), so there would be no
hazard Tom mentioned above, IIUC.
Sorry but I still don't understand. Sure, *some* columns of the table
not present in the UPDATE statement are fetched, but the column(s)
being assigned to are not fetched.
-- before creating a trigger
explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Output: 1, b, ctid
Remote SQL: SELECT b, ctid FROM public.loc1 FOR UPDATE
In this case, column 'a' is not present in the rows that are fetched
to be updated, because it's only assigned to and not referenced
anywhere (such as in WHERE clauses). Which is understandable, because
fetching it would be pointless.
If there is a trigger present though, the trigger may want to
reference 'a' in the OLD rows, so it's fetched along with any other
columns that are present in the table, because they may be referenced
too.
-- after creating a trigger
explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..147.23 rows=1241 width=46)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..147.23 rows=1241 width=46)
Output: 1, b, ctid, rem1.*
Remote SQL: SELECT a, b, ctid FROM public.loc1 FOR UPDATE
(5 rows)
Thanks,
Amit
Amit-san,
On Tue, Jun 11, 2019 at 1:31 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 11, 2019 at 10:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Sorry, my explanation was not good; I should have said that in UPDATE,
we fetch columns not mentioned in the SQL query as well (even if the
target table doesn't have relevant triggers), so there would be no
hazard Tom mentioned above, IIUC.Sorry but I still don't understand. Sure, *some* columns of the table
not present in the UPDATE statement are fetched, but the column(s)
being assigned to are not fetched.-- before creating a trigger
explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Output: 1, b, ctid
Remote SQL: SELECT b, ctid FROM public.loc1 FOR UPDATEIn this case, column 'a' is not present in the rows that are fetched
to be updated, because it's only assigned to and not referenced
anywhere (such as in WHERE clauses). Which is understandable, because
fetching it would be pointless.
Right, but what I'm saying here is what you call "some columns". For
UPDATE, the planner adds any unassigned columns to the targetlist (see
expand_targetlist()), so the reltarget for the target relation would
include such columns, leading to fetching them from the remote in
postgres_fdw even if the target table doesn't have relevant triggers.
Best regards,
Etsuro Fujita
Fujita-san,
On Tue, Jun 11, 2019 at 6:09 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Tue, Jun 11, 2019 at 1:31 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 11, 2019 at 10:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Sorry, my explanation was not good; I should have said that in UPDATE,
we fetch columns not mentioned in the SQL query as well (even if the
target table doesn't have relevant triggers), so there would be no
hazard Tom mentioned above, IIUC.Sorry but I still don't understand. Sure, *some* columns of the table
not present in the UPDATE statement are fetched, but the column(s)
being assigned to are not fetched.-- before creating a trigger
explain verbose update rem1 set a = 1;
QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
Update on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Remote SQL: UPDATE public.loc1 SET a = $2, b = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1 (cost=100.00..182.27 rows=2409 width=14)
Output: 1, b, ctid
Remote SQL: SELECT b, ctid FROM public.loc1 FOR UPDATEIn this case, column 'a' is not present in the rows that are fetched
to be updated, because it's only assigned to and not referenced
anywhere (such as in WHERE clauses). Which is understandable, because
fetching it would be pointless.Right, but what I'm saying here is what you call "some columns". For
UPDATE, the planner adds any unassigned columns to the targetlist (see
expand_targetlist()), so the reltarget for the target relation would
include such columns, leading to fetching them from the remote in
postgres_fdw even if the target table doesn't have relevant triggers.
Thanks for clarifying again. I now understand that you didn't mean
*all* columns.
It's just that I was interpreting your words in the context of Tom's
concern, so I thought you are implying that *all* columns are always
fetched, irrespective of whether triggers are present (Tom's concern)
or not. Reading Tom's email again, he didn't say *all* columns, but
maybe meant so, because that's what's needed for triggers to work.
Thanks,
Amit
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I'll look into the patch more closely tomorrow.
I did that, but couldn't find any issue about the patch. Here is an
updated version of the patch. Changes are:
* Reworded the comments a bit in postgresPlanFoereignModify the
original patch modified
* Added the commit message
Does that make sense? I think this is an oversight in commit
7cbe57c34, so I'll back-patch all the way back to 9.4, if there are no
objections.
Best regards,
Etsuro Fujita
Attachments:
fix_fdw_update_trigger-efujita.patchapplication/octet-stream; name=fix_fdw_update_trigger-efujita.patchDownload
From 0c4a1b955e627d935a531f84387cfec00ad2fa91 Mon Sep 17 00:00:00 2001
From: Etsuro Fujita <efujita@postgresql.org>
Date: Wed, 12 Jun 2019 15:04:34 +0900
Subject: [PATCH] postgres_fdw: Account for triggers in non-direct remote
UPDATE planning.
Previously, in postgresPlanForeignModify, we planned an UPDATE operation
on a foreign table so that we transmit only columns that were explicitly
targets of the UPDATE, so as to avoid unnecessary data transmission, but
if there were BEFORE ROW UPDATE triggers on the foreign table, those
trigger might change values for non-target columns; in which case we
would miss sending changed values for those columns. Prevent optimizing
away transmitting all columns in that case.
This is an oversight in commit 7cbe57c34 which added triggers on foreign
tables, so back-patch all the way back to 9.4 where that came in.
Author: Shohei Mochizuki
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/201905270152.x4R1q3qi014550@toshiba.co.jp
---
.../postgres_fdw/expected/postgres_fdw.out | 22 +++++++++++++------
contrib/postgres_fdw/postgres_fdw.c | 19 +++++++++++-----
contrib/postgres_fdw/sql/postgres_fdw.sql | 2 ++
3 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e034b030f1..546d97ba98 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -6558,6 +6558,14 @@ SELECT * from loc1;
2 | skidoo triggered !
(2 rows)
+UPDATE rem1 set f1 = 10;
+SELECT * from loc1;
+ f1 | f2
+----+--------------------------------
+ 10 | skidoo triggered ! triggered !
+ 10 | skidoo triggered ! triggered !
+(2 rows)
+
DELETE FROM rem1;
-- Add a second trigger, to check that the changes are propagated correctly
-- from trigger to trigger
@@ -6670,7 +6678,7 @@ NOTICE: trig_row_after(23, skidoo) AFTER ROW INSERT ON rem1
NOTICE: NEW: (13,"test triggered !")
ctid
--------
- (0,27)
+ (0,29)
(1 row)
-- cleanup
@@ -6774,10 +6782,10 @@ BEFORE UPDATE ON rem1
FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
EXPLAIN (verbose, costs off)
UPDATE rem1 set f2 = ''; -- can't be pushed down
- QUERY PLAN
----------------------------------------------------------------------
+ QUERY PLAN
+-----------------------------------------------------------------------
Update on public.rem1
- Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+ Remote SQL: UPDATE public.loc1 SET f1 = $2, f2 = $3 WHERE ctid = $1
-> Foreign Scan on public.rem1
Output: f1, ''::text, ctid, rem1.*
Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
@@ -7404,12 +7412,12 @@ AFTER UPDATE OR DELETE ON bar2
FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
explain (verbose, costs off)
update bar set f2 = f2 + 100;
- QUERY PLAN
---------------------------------------------------------------------------------------
+ QUERY PLAN
+--------------------------------------------------------------------------------------------------------
Update on public.bar
Update on public.bar
Foreign Update on public.bar2
- Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2, f3
+ Remote SQL: UPDATE public.loct2 SET f1 = $2, f2 = $3, f3 = $4 WHERE ctid = $1 RETURNING f1, f2, f3
-> Seq Scan on public.bar
Output: bar.f1, (bar.f2 + 100), bar.ctid
-> Foreign Scan on public.bar2
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 1b09aa5a01..0f89358179 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1686,12 +1686,19 @@ postgresPlanForeignModify(PlannerInfo *root,
/*
* In an INSERT, we transmit all columns that are defined in the foreign
- * table. In an UPDATE, we transmit only columns that were explicitly
- * targets of the UPDATE, so as to avoid unnecessary data transmission.
- * (We can't do that for INSERT since we would miss sending default values
- * for columns not listed in the source statement.)
- */
- if (operation == CMD_INSERT)
+ * table. In an UPDATE, if there are BEFORE ROW UPDATE triggers on the
+ * foreign table, we transmit all columns like INSERT; else we transmit
+ * only columns that were explicitly targets of the UPDATE, so as to avoid
+ * unnecessary data transmission. (We can't do that for INSERT since we
+ * would miss sending default values for columns not listed in the source
+ * statement, and for UPDATE if BEFORE ROW UPDATE triggers since those
+ * triggers might change values for non-target columns, in which case we
+ * would miss sending changed values for those columns.)
+ */
+ if (operation == CMD_INSERT ||
+ (operation == CMD_UPDATE &&
+ rel->trigdesc &&
+ rel->trigdesc->trig_update_before_row))
{
TupleDesc tupdesc = RelationGetDescr(rel);
int attnum;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 73852f1ae6..3da0293c52 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1552,6 +1552,8 @@ UPDATE rem1 set f2 = '';
SELECT * from loc1;
UPDATE rem1 set f2 = 'skidoo' RETURNING f2;
SELECT * from loc1;
+UPDATE rem1 set f1 = 10;
+SELECT * from loc1;
DELETE FROM rem1;
--
2.19.2
Fujita-san,
On Wed, Jun 12, 2019 at 3:14 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I did that, but couldn't find any issue about the patch. Here is an
updated version of the patch.
Thanks for the updating the patch.
Changes are:
* Reworded the comments a bit in postgresPlanFoereignModify the
original patch modified
+ * statement, and for UPDATE if BEFORE ROW UPDATE triggers since those
+ * triggers might change values for non-target columns, in which case we
First line seems to be missing a word or two. Maybe:
+ * statement, and for UPDATE if there are BEFORE ROW UPDATE triggers,
+ * since those triggers might change values for non-target columns, in
Thanks,
Amit
Amit-san,
On Wed, Jun 12, 2019 at 3:33 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Jun 12, 2019 at 3:14 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
* Reworded the comments a bit in postgresPlanFoereignModify the
original patch modified+ * statement, and for UPDATE if BEFORE ROW UPDATE triggers since those + * triggers might change values for non-target columns, in which case weFirst line seems to be missing a word or two. Maybe:
+ * statement, and for UPDATE if there are BEFORE ROW UPDATE triggers, + * since those triggers might change values for non-target columns, in
Actually, I omitted such words to shorten the comment, but I think
this improves the readability, so I'll update the comment that way.
Thanks for the review!
Best regards,
Etsuro Fujita
Fujita-san,
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:I'll look into the patch more closely tomorrow.
I did that, but couldn't find any issue about the patch. Here is an updated
version of the patch. Changes are:* Reworded the comments a bit in postgresPlanFoereignModify the original
patch modified
* Added the commit message
Thanks for the update.
I think your wording is more understandable than my original patch.
Regards,
--
Shohei Mochizuki
TOSHIBA CORPORATION
Mochizuki-san,
On Wed, Jun 12, 2019 at 6:08 PM <shohei.mochizuki@toshiba.co.jp> wrote:
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:I'll look into the patch more closely tomorrow.
I did that, but couldn't find any issue about the patch. Here is an updated
version of the patch. Changes are:* Reworded the comments a bit in postgresPlanFoereignModify the original
patch modified
* Added the commit message
I think your wording is more understandable than my original patch.
Great! I've pushed the patch after updating the comment as proposed
by Amit-san yesterday, and adding a regression test case checking
EXPLAIN because otherwise we wouldn't have any EXPLAIN results in
v9.4.
Thanks for the report and fix!
Best regards,
Etsuro Fujita
Fujita-san,
From: Etsuro Fujita [mailto:etsuro.fujita@gmail.com]
Mochizuki-san,
On Wed, Jun 12, 2019 at 6:08 PM <shohei.mochizuki@toshiba.co.jp> wrote:
On Mon, Jun 10, 2019 at 9:04 PM Etsuro Fujita
<etsuro.fujita@gmail.com>
wrote:I'll look into the patch more closely tomorrow.
I did that, but couldn't find any issue about the patch. Here is an
updated version of the patch. Changes are:* Reworded the comments a bit in postgresPlanFoereignModify the
original patch modified
* Added the commit messageI think your wording is more understandable than my original patch.
Great! I've pushed the patch after updating the comment as proposed by
Amit-san yesterday, and adding a regression test case checking EXPLAIN
because otherwise we wouldn't have any EXPLAIN results in v9.4.Thanks for the report and fix!
Thanks for the commit!
--
Shohei Mochizuki