join pushdown and issue with foreign update
Hi.
There's issue with join pushdown after
commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Mar 31 11:52:34 2021 -0400
Rework planning and execution of UPDATE and DELETE
To make sure that join pushdown path selected, one can patch
contrib/postgres_fdw/postgres_fdw.c in the following way:
diff --git a/contrib/postgres_fdw/postgres_fdw.c
b/contrib/postgres_fdw/postgres_fdw.c
index c48a421e88b..c2bf6833050 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5959,6 +5959,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
/* Estimate costs for bare join relation */
estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
&rows, &width,
&startup_cost, &total_cost);
+
+ startup_cost = total_cost = 0;
/* Now update this information in the joinrel */
joinrel->rows = rows;
joinrel->reltarget->width = width;
Now, this simple test shows the issue:
create extension postgres_fdw;
DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$')$$;
END;
$d$;
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE TABLE base_tbl (a int, b int);
CREATE FOREIGN TABLE remote_tbl (a int, b int)
SERVER loopback OPTIONS (table_name 'base_tbl');
insert into remote_tbl select generate_series(1,100),
generate_series(1,100);
explain verbose update remote_tbl d set a= case when current_timestamp>
'2012-02-02'::timestamp then 5 else 6 end FROM remote_tbl AS t (a, b)
WHERE d.a = (t.a);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.remote_tbl d (cost=0.00..42.35 rows=0 width=0)
Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
-> Foreign Scan (cost=0.00..42.35 rows=8470 width=74)
Output: CASE WHEN (CURRENT_TIMESTAMP > '2012-02-02
00:00:00'::timestamp without time zone) THEN 5 ELSE 6 END, d.ctid, d.*,
t.*
Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl
t)
Remote SQL: SELECT r1.ctid, CASE WHEN (r1.*)::text IS NOT NULL
THEN ROW(r1.a, r1.b) END, CASE WHEN (r2.*)::text IS NOT NULL THEN
ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN public.base_tbl
r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
-> Merge Join (cost=433.03..566.29 rows=8470 width=70)
Output: d.ctid, d.*, t.*
Merge Cond: (d.a = t.a)
-> Sort (cost=211.00..214.10 rows=1241 width=42)
Output: d.ctid, d.*, d.a
Sort Key: d.a
-> Foreign Scan on public.remote_tbl d
(cost=100.00..147.23 rows=1241 width=42)
Output: d.ctid, d.*, d.a
Remote SQL: SELECT a, b, ctid FROM
public.base_tbl FOR UPDATE
-> Sort (cost=222.03..225.44 rows=1365 width=36)
Output: t.*, t.a
Sort Key: t.a
-> Foreign Scan on public.remote_tbl t
(cost=100.00..150.95 rows=1365 width=36)
Output: t.*, t.a
Remote SQL: SELECT a, b FROM public.base_tbl
update remote_tbl d set a= case when current_timestamp>
'2012-02-02'::timestamp then 5 else 6 end FROM remote_tbl AS t (a, b)
WHERE d.a = (t.a);
You'll get
ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "remote_tbl"
make_tuple_from_result_row() (called by fetch_more_data()), will try to
call InputFunctionCall() for ROW(r1.a, r1.b) and will get error in
record_in().
Here ROW(r2.a, r2.b) would have attribute type id, corresponding to
remote_tbl, but ROW(r1.a, r1.b) would have atttypid 2249 (RECORD).
Before 86dc90056dfdbd9d1b891718d2e5614e3e432f35 the plan would be
different and looked like
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.remote_tbl d (cost=0.00..73.54 rows=14708 width=46)
Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
-> Foreign Scan (cost=0.00..73.54 rows=14708 width=46)
Output: CASE WHEN (CURRENT_TIMESTAMP > '2012-02-02
00:00:00'::timestamp without time zone) THEN d.a ELSE 6 END, d.b,
d.ctid, t.*
Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl
t)
Remote SQL: SELECT r1.a, r1.b, r1.ctid, CASE WHEN (r2.*)::text
IS NOT NULL THEN ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN
public.base_tbl r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
-> Merge Join (cost=516.00..747.39 rows=14708 width=46)
Output: d.a, d.b, d.ctid, t.*
Merge Cond: (d.a = t.a)
-> Sort (cost=293.97..299.35 rows=2155 width=14)
Output: d.a, d.b, d.ctid
Sort Key: d.a
-> Foreign Scan on public.remote_tbl d
(cost=100.00..174.65 rows=2155 width=14)
Output: d.a, d.b, d.ctid
Remote SQL: SELECT a, b, ctid FROM
public.base_tbl FOR UPDATE
-> Sort (cost=222.03..225.44 rows=1365 width=36)
Output: t.*, t.a
Sort Key: t.a
-> Foreign Scan on public.remote_tbl t
(cost=100.00..150.95 rows=1365 width=36)
Output: t.*, t.a
Remote SQL: SELECT a, b FROM public.base_tbl
Here ROW(r2.a, r2.b) would have attribute type id, corresponding to
remote_tbl.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Alexander Pyhalov писал 2021-05-31 15:39:
Hi.
There's issue with join pushdown after
commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Mar 31 11:52:34 2021 -0400
...
You'll get
ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "remote_tbl"make_tuple_from_result_row() (called by fetch_more_data()), will try
to call InputFunctionCall() for ROW(r1.a, r1.b) and will get error in
record_in().Here ROW(r2.a, r2.b) would have attribute type id, corresponding to
remote_tbl, but ROW(r1.a, r1.b) would have atttypid 2249 (RECORD).
The issue seems to be that add_row_identity_columns() adds RECORD var to
the query.
Adding var with table's relation type fixes this issue, but breaks
update of
partitioned tables, as we add "wholerow" with type of one child relation
and then
try to use it with another child (of different table type).
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi,
On Tue, Jun 1, 2021 at 1:04 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
Alexander Pyhalov писал 2021-05-31 15:39:
Hi.
There's issue with join pushdown after
commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Mar 31 11:52:34 2021 -0400...
You'll get
ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "remote_tbl"
Interesting, thanks for reporting this. This sounds like a regression
on 86dc90056's part.
make_tuple_from_result_row() (called by fetch_more_data()), will try
to call InputFunctionCall() for ROW(r1.a, r1.b) and will get error in
record_in().Here ROW(r2.a, r2.b) would have attribute type id, corresponding to
remote_tbl, but ROW(r1.a, r1.b) would have atttypid 2249 (RECORD).The issue seems to be that add_row_identity_columns() adds RECORD var to
the query.
Adding var with table's relation type fixes this issue, but breaks
update of
partitioned tables, as we add "wholerow" with type of one child relation
and then
try to use it with another child (of different table type).
Perhaps, we can get away with adding the wholerow Var with the target
relation's reltype when the target foreign table is not a "child"
relation, but the root target relation itself. Maybe like the
attached?
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
wholerow-rowid-var-vartype-fix.patchapplication/octet-stream; name=wholerow-rowid-var-vartype-fix.patchDownload
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index af46f581ac..6e88fe6175 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -752,7 +752,8 @@ find_appinfos_by_relids(PlannerInfo *root, Relids relids, int *nappinfos)
*
* The Var must be equal(), aside from varno, to any other row-identity
* column with the same rowid_name. Thus, for example, "wholerow"
- * row identities had better use vartype == RECORDOID.
+ * row identities had better use vartype == RECORDOID, except in the cases
+ * where it is clear that there will be only one target relation.
*
* rtindex is currently redundant with rowid_var->varno, but we specify
* it as a separate parameter in case this is ever generalized to support
@@ -846,6 +847,36 @@ add_row_identity_var(PlannerInfo *root, Var *orig_var,
root->processed_tlist = lappend(root->processed_tlist, tle);
}
+/*
+ * add_row_identity_var_wholerow
+ * Wrapper over add_row_identity_var() to register a "wholerow"
+ * row-identity column
+ *
+ * This has been made into a function unlike for other row-identity Vars,
+ * because the vartype that must be assigned to the Var differs based on
+ * whether it's being added for the root or for a child target relation.
+ * FDWs should call this for example instead of making the Var by themselves.
+ */
+void
+add_row_identity_var_wholerow(PlannerInfo *root, Index rtindex,
+ Relation target_relation)
+{
+ Var *var;
+ Oid wholerow_type;
+
+ if (rtindex == root->parse->resultRelation)
+ wholerow_type = RelationGetForm(target_relation)->reltype;
+ else
+ wholerow_type = RECORDOID;
+ var = makeVar(rtindex,
+ InvalidAttrNumber,
+ wholerow_type,
+ -1,
+ InvalidOid,
+ 0);
+ add_row_identity_var(root, var, rtindex, "wholerow");
+}
+
/*
* add_row_identity_columns
*
@@ -909,15 +940,7 @@ add_row_identity_columns(PlannerInfo *root, Index rtindex,
(target_relation->trigdesc &&
(target_relation->trigdesc->trig_delete_after_row ||
target_relation->trigdesc->trig_delete_before_row)))
- {
- var = makeVar(rtindex,
- InvalidAttrNumber,
- RECORDOID,
- -1,
- InvalidOid,
- 0);
- add_row_identity_var(root, var, rtindex, "wholerow");
- }
+ add_row_identity_var_wholerow(root, rtindex, target_relation);
}
}
diff --git a/src/include/optimizer/appendinfo.h b/src/include/optimizer/appendinfo.h
index 39d04d9cc0..97442a4ee1 100644
--- a/src/include/optimizer/appendinfo.h
+++ b/src/include/optimizer/appendinfo.h
@@ -42,6 +42,8 @@ extern AppendRelInfo **find_appinfos_by_relids(PlannerInfo *root,
Relids relids, int *nappinfos);
extern void add_row_identity_var(PlannerInfo *root, Var *rowid_var,
Index rtindex, const char *rowid_name);
+extern void add_row_identity_var_wholerow(PlannerInfo *root, Index rtindex,
+ Relation target_relation);
extern void add_row_identity_columns(PlannerInfo *root, Index rtindex,
RangeTblEntry *target_rte,
Relation target_relation);
Amit Langote писал 2021-06-01 15:47:
Perhaps, we can get away with adding the wholerow Var with the target
relation's reltype when the target foreign table is not a "child"
relation, but the root target relation itself. Maybe like the
attached?
Hi.
I think the patch fixes this issue, but it still preserves chances to
get RECORD in fetch_more_data()
(at least with combination with asymmetric partition-wise join).
What about the following patch?
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
0001-Fix-foreign-update.patchtext/x-diff; name=0001-Fix-foreign-update.patchDownload
From 806a9b4fbf376e8fa75669229c8401ea76dd1cb2 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Date: Tue, 1 Jun 2021 12:00:08 +0300
Subject: [PATCH] [PGPRO-4769] Fix foreign update
Tags: shardman
---
.../postgres_fdw/expected/postgres_fdw.out | 66 +++++++++----------
src/backend/optimizer/util/appendinfo.c | 53 ++++++++++++---
src/include/optimizer/appendinfo.h | 2 +
3 files changed, 78 insertions(+), 43 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 7df30010f25..d424a1a5365 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -7258,19 +7258,19 @@ select * from bar where f1 in (select f1 from foo) for share;
-- Check UPDATE with inherited target and an inherited source table
explain (verbose, costs off)
update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
- QUERY PLAN
--------------------------------------------------------------------------------------------------------
+ QUERY PLAN
+-----------------------------------------------------------------------------------------------------
Update on public.bar
Update on public.bar bar_1
Foreign Update on public.bar2 bar_2
Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-> Hash Join
- Output: (bar.f2 + 100), foo.ctid, bar.tableoid, bar.ctid, (NULL::record), foo.*, foo.tableoid
+ Output: (bar.f2 + 100), foo.ctid, bar.tableoid, bar.ctid, (NULL::bar2), foo.*, foo.tableoid
Inner Unique: true
Hash Cond: (bar.f1 = foo.f1)
-> Append
-> Seq Scan on public.bar bar_1
- Output: bar_1.f2, bar_1.f1, bar_1.tableoid, bar_1.ctid, NULL::record
+ Output: bar_1.f2, bar_1.f1, bar_1.tableoid, bar_1.ctid, NULL::bar2
-> Foreign Scan on public.bar2 bar_2
Output: bar_2.f2, bar_2.f1, bar_2.tableoid, bar_2.ctid, bar_2.*
Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
@@ -7305,21 +7305,21 @@ update bar set f2 = f2 + 100
from
( select f1 from foo union all select f1+3 from foo ) ss
where bar.f1 = ss.f1;
- QUERY PLAN
-------------------------------------------------------------------------------------------------
+ QUERY PLAN
+----------------------------------------------------------------------------------------------
Update on public.bar
Update on public.bar bar_1
Foreign Update on public.bar2 bar_2
Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
-> Merge Join
- Output: (bar.f2 + 100), (ROW(foo.f1)), bar.tableoid, bar.ctid, (NULL::record)
+ Output: (bar.f2 + 100), (ROW(foo.f1)), bar.tableoid, bar.ctid, (NULL::bar2)
Merge Cond: (bar.f1 = foo.f1)
-> Sort
- Output: bar.f2, bar.f1, bar.tableoid, bar.ctid, (NULL::record)
+ Output: bar.f2, bar.f1, bar.tableoid, bar.ctid, (NULL::bar2)
Sort Key: bar.f1
-> Append
-> Seq Scan on public.bar bar_1
- Output: bar_1.f2, bar_1.f1, bar_1.tableoid, bar_1.ctid, NULL::record
+ Output: bar_1.f2, bar_1.f1, bar_1.tableoid, bar_1.ctid, NULL::bar2
-> Foreign Scan on public.bar2 bar_2
Output: bar_2.f2, bar_2.f1, bar_2.tableoid, bar_2.ctid, bar_2.*
Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
@@ -7496,10 +7496,10 @@ update bar set f2 = f2 + 100 returning *;
Update on public.bar bar_1
Foreign Update on public.bar2 bar_2
-> Result
- Output: (bar.f2 + 100), bar.tableoid, bar.ctid, (NULL::record)
+ Output: (bar.f2 + 100), bar.tableoid, bar.ctid, (NULL::bar2)
-> Append
-> Seq Scan on public.bar bar_1
- Output: bar_1.f2, bar_1.tableoid, bar_1.ctid, NULL::record
+ Output: bar_1.f2, bar_1.tableoid, bar_1.ctid, NULL::bar2
-> Foreign Update on public.bar2 bar_2
Remote SQL: UPDATE public.loct2 SET f2 = (f2 + 100) RETURNING f1, f2
(11 rows)
@@ -7531,10 +7531,10 @@ update bar set f2 = f2 + 100;
Foreign Update on public.bar2 bar_2
Remote SQL: UPDATE public.loct2 SET f1 = $2, f2 = $3, f3 = $4 WHERE ctid = $1 RETURNING f1, f2, f3
-> Result
- Output: (bar.f2 + 100), bar.tableoid, bar.ctid, (NULL::record)
+ Output: (bar.f2 + 100), bar.tableoid, bar.ctid, (NULL::bar2)
-> Append
-> Seq Scan on public.bar bar_1
- Output: bar_1.f2, bar_1.tableoid, bar_1.ctid, NULL::record
+ Output: bar_1.f2, bar_1.tableoid, bar_1.ctid, NULL::bar2
-> Foreign Scan on public.bar2 bar_2
Output: bar_2.f2, bar_2.tableoid, bar_2.ctid, bar_2.*
Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
@@ -7563,7 +7563,7 @@ delete from bar where f2 < 400;
Remote SQL: DELETE FROM public.loct2 WHERE ctid = $1 RETURNING f1, f2, f3
-> Append
-> Seq Scan on public.bar bar_1
- Output: bar_1.tableoid, bar_1.ctid, NULL::record
+ Output: bar_1.tableoid, bar_1.ctid, NULL::bar2
Filter: (bar_1.f2 < 400)
-> Foreign Scan on public.bar2 bar_2
Output: bar_2.tableoid, bar_2.ctid, bar_2.*
@@ -7599,19 +7599,19 @@ analyze remt1;
analyze remt2;
explain (verbose, costs off)
update parent set b = parent.b || remt2.b from remt2 where parent.a = remt2.a returning *;
- QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
+ QUERY PLAN
+---------------------------------------------------------------------------------------------------------------
Update on public.parent
Output: parent_1.a, parent_1.b, remt2.a, remt2.b
Update on public.parent parent_1
Foreign Update on public.remt1 parent_2
Remote SQL: UPDATE public.loct1 SET b = $2 WHERE ctid = $1 RETURNING a, b
-> Nested Loop
- Output: (parent.b || remt2.b), remt2.*, remt2.a, remt2.b, parent.tableoid, parent.ctid, (NULL::record)
+ Output: (parent.b || remt2.b), remt2.*, remt2.a, remt2.b, parent.tableoid, parent.ctid, (NULL::remt1)
Join Filter: (parent.a = remt2.a)
-> Append
-> Seq Scan on public.parent parent_1
- Output: parent_1.b, parent_1.a, parent_1.tableoid, parent_1.ctid, NULL::record
+ Output: parent_1.b, parent_1.a, parent_1.tableoid, parent_1.ctid, NULL::remt1
-> Foreign Scan on public.remt1 parent_2
Output: parent_2.b, parent_2.a, parent_2.tableoid, parent_2.ctid, parent_2.*
Remote SQL: SELECT a, b, ctid FROM public.loct1 FOR UPDATE
@@ -7871,7 +7871,7 @@ update utrtest set a = 1 where a = 1 or a = 2 returning *;
-> Foreign Update on public.remp utrtest_1
Remote SQL: UPDATE public.loct SET a = 1 WHERE (((a = 1) OR (a = 2))) RETURNING a, b
-> Seq Scan on public.locp utrtest_2
- Output: 1, utrtest_2.tableoid, utrtest_2.ctid, NULL::record
+ Output: 1, utrtest_2.tableoid, utrtest_2.ctid, NULL::utrtest
Filter: ((utrtest_2.a = 1) OR (utrtest_2.a = 2))
(10 rows)
@@ -7910,8 +7910,8 @@ insert into utrtest values (2, 'qux');
-- with a direct modification plan
explain (verbose, costs off)
update utrtest set a = 1 returning *;
- QUERY PLAN
----------------------------------------------------------------------------
+ QUERY PLAN
+----------------------------------------------------------------------------
Update on public.utrtest
Output: utrtest_1.a, utrtest_1.b
Foreign Update on public.remp utrtest_1
@@ -7920,7 +7920,7 @@ update utrtest set a = 1 returning *;
-> Foreign Update on public.remp utrtest_1
Remote SQL: UPDATE public.loct SET a = 1 RETURNING a, b
-> Seq Scan on public.locp utrtest_2
- Output: 1, utrtest_2.tableoid, utrtest_2.ctid, NULL::record
+ Output: 1, utrtest_2.tableoid, utrtest_2.ctid, NULL::utrtest
(9 rows)
update utrtest set a = 1 returning *;
@@ -7946,7 +7946,7 @@ update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x returning *;
Output: utrtest_1.a, utrtest_1.tableoid, utrtest_1.ctid, utrtest_1.*
Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
-> Seq Scan on public.locp utrtest_2
- Output: utrtest_2.a, utrtest_2.tableoid, utrtest_2.ctid, NULL::record
+ Output: utrtest_2.a, utrtest_2.tableoid, utrtest_2.ctid, NULL::utrtest
-> Hash
Output: "*VALUES*".*, "*VALUES*".column1
-> Values Scan on "*VALUES*"
@@ -7970,15 +7970,15 @@ insert into utrtest values (3, 'xyzzy');
-- with a direct modification plan
explain (verbose, costs off)
update utrtest set a = 3 returning *;
- QUERY PLAN
----------------------------------------------------------------------------
+ QUERY PLAN
+----------------------------------------------------------------------------
Update on public.utrtest
Output: utrtest_1.a, utrtest_1.b
Update on public.locp utrtest_1
Foreign Update on public.remp utrtest_2
-> Append
-> Seq Scan on public.locp utrtest_1
- Output: 3, utrtest_1.tableoid, utrtest_1.ctid, NULL::record
+ Output: 3, utrtest_1.tableoid, utrtest_1.ctid, NULL::utrtest
-> Foreign Update on public.remp utrtest_2
Remote SQL: UPDATE public.loct SET a = 3 RETURNING a, b
(9 rows)
@@ -7988,19 +7988,19 @@ ERROR: cannot route tuples into foreign table to be updated "remp"
-- with a non-direct modification plan
explain (verbose, costs off)
update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x returning *;
- QUERY PLAN
------------------------------------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------
Update on public.utrtest
Output: utrtest_1.a, utrtest_1.b, "*VALUES*".column1
Update on public.locp utrtest_1
Foreign Update on public.remp utrtest_2
Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING a, b
-> Hash Join
- Output: 3, "*VALUES*".*, "*VALUES*".column1, utrtest.tableoid, utrtest.ctid, (NULL::record)
+ Output: 3, "*VALUES*".*, "*VALUES*".column1, utrtest.tableoid, utrtest.ctid, (NULL::utrtest)
Hash Cond: (utrtest.a = "*VALUES*".column1)
-> Append
-> Seq Scan on public.locp utrtest_1
- Output: utrtest_1.a, utrtest_1.tableoid, utrtest_1.ctid, NULL::record
+ Output: utrtest_1.a, utrtest_1.tableoid, utrtest_1.ctid, NULL::utrtest
-> Foreign Scan on public.remp utrtest_2
Output: utrtest_2.a, utrtest_2.tableoid, utrtest_2.ctid, utrtest_2.*
Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
@@ -10157,8 +10157,8 @@ RESET enable_hashjoin;
-- Test that UPDATE/DELETE with inherited target works with async_capable enabled
EXPLAIN (VERBOSE, COSTS OFF)
UPDATE async_pt SET c = c || c WHERE b = 0 RETURNING *;
- QUERY PLAN
-----------------------------------------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------
Update on public.async_pt
Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
Foreign Update on public.async_p1 async_pt_1
@@ -10170,7 +10170,7 @@ UPDATE async_pt SET c = c || c WHERE b = 0 RETURNING *;
-> Foreign Update on public.async_p2 async_pt_2
Remote SQL: UPDATE public.base_tbl2 SET c = (c || c) WHERE ((b = 0)) RETURNING a, b, c
-> Seq Scan on public.async_p3 async_pt_3
- Output: (async_pt_3.c || async_pt_3.c), async_pt_3.tableoid, async_pt_3.ctid, NULL::record
+ Output: (async_pt_3.c || async_pt_3.c), async_pt_3.tableoid, async_pt_3.ctid, NULL::async_pt
Filter: (async_pt_3.b = 0)
(13 rows)
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index af46f581ac1..6d440af41d2 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -16,6 +16,7 @@
#include "access/htup_details.h"
#include "access/table.h"
+#include "catalog/partition.h"
#include "foreign/fdwapi.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@@ -752,7 +753,7 @@ find_appinfos_by_relids(PlannerInfo *root, Relids relids, int *nappinfos)
*
* The Var must be equal(), aside from varno, to any other row-identity
* column with the same rowid_name. Thus, for example, "wholerow"
- * row identities had better use vartype == RECORDOID.
+ * row identities had better use vartype of the base relation.
*
* rtindex is currently redundant with rowid_var->varno, but we specify
* it as a separate parameter in case this is ever generalized to support
@@ -846,6 +847,46 @@ add_row_identity_var(PlannerInfo *root, Var *orig_var,
root->processed_tlist = lappend(root->processed_tlist, tle);
}
+/*
+ * add_row_identity_var_wholerow
+ * Wrapper over add_row_identity_var() to register a "wholerow"
+ * row-identity column
+ *
+ * This has been made into a function unlike for other row-identity Vars,
+ * because the vartype that must be assigned to the Var differs based on
+ * whether it's being added for the root or for a child target relation.
+ * FDWs should call this for example instead of making the Var by themselves.
+ */
+void
+add_row_identity_var_wholerow(PlannerInfo *root, Index rtindex,
+ Relation target_relation)
+{
+ Oid reloid;
+ Relation r;
+ Var *var;
+
+ r = target_relation;
+ reloid = RelationGetRelid(target_relation);
+
+ while (RelationGetForm(r)->relispartition)
+ {
+ reloid = get_partition_parent(reloid, false);
+ if (r != target_relation)
+ table_close(r, AccessShareLock);
+ r = table_open(reloid, AccessShareLock);
+ }
+
+ var = makeVar(rtindex,
+ InvalidAttrNumber,
+ RelationGetForm(r)->reltype,
+ -1,
+ InvalidOid,
+ 0);
+ if (r != target_relation)
+ table_close(r, AccessShareLock);
+ add_row_identity_var(root, var, rtindex, "wholerow");
+}
+
/*
* add_row_identity_columns
*
@@ -909,15 +950,7 @@ add_row_identity_columns(PlannerInfo *root, Index rtindex,
(target_relation->trigdesc &&
(target_relation->trigdesc->trig_delete_after_row ||
target_relation->trigdesc->trig_delete_before_row)))
- {
- var = makeVar(rtindex,
- InvalidAttrNumber,
- RECORDOID,
- -1,
- InvalidOid,
- 0);
- add_row_identity_var(root, var, rtindex, "wholerow");
- }
+ add_row_identity_var_wholerow(root, rtindex, target_relation);
}
}
diff --git a/src/include/optimizer/appendinfo.h b/src/include/optimizer/appendinfo.h
index 39d04d9cc02..97442a4ee17 100644
--- a/src/include/optimizer/appendinfo.h
+++ b/src/include/optimizer/appendinfo.h
@@ -42,6 +42,8 @@ extern AppendRelInfo **find_appinfos_by_relids(PlannerInfo *root,
Relids relids, int *nappinfos);
extern void add_row_identity_var(PlannerInfo *root, Var *rowid_var,
Index rtindex, const char *rowid_name);
+extern void add_row_identity_var_wholerow(PlannerInfo *root, Index rtindex,
+ Relation target_relation);
extern void add_row_identity_columns(PlannerInfo *root, Index rtindex,
RangeTblEntry *target_rte,
Relation target_relation);
--
2.25.1
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
What about the following patch?
ISTM that using a specific rowtype rather than RECORD would be
quite disastrous from the standpoint of bloating the number of
distinct resjunk columns we need for a partition tree with a
lot of children. Maybe we'll have to go that way, but it seems
like an absolute last resort.
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.
Could we start by creating a test case that doesn't involve
uncommittable hacks to the source code?
regards, tom lane
Tom Lane писал 2021-06-01 21:19:
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
What about the following patch?
ISTM that using a specific rowtype rather than RECORD would be
quite disastrous from the standpoint of bloating the number of
distinct resjunk columns we need for a partition tree with a
lot of children. Maybe we'll have to go that way, but it seems
like an absolute last resort.
Why do you think they are distinct?
In suggested patch all of them will have type of the common ancestor
(root of the partition tree).
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.Could we start by creating a test case that doesn't involve
uncommittable hacks to the source code?
Yes, it seems the following works fine to reproduce the issue.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
example.difftext/x-diff; name=example.diffDownload
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 78379bdea5b..4dbb80c3ee8 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3262,3 +3262,16 @@ DROP TABLE join_tbl;
ALTER SERVER loopback OPTIONS (DROP async_capable);
ALTER SERVER loopback2 OPTIONS (DROP async_capable);
+
+CREATE TABLE base_tbl (a int, b int);
+CREATE FOREIGN TABLE remote_tbl (a int, b int)
+ SERVER loopback OPTIONS (table_name 'base_tbl');
+
+insert into remote_tbl select generate_series(1,100), generate_series(1,100);
+
+ANALYZE base_tbl;
+ANALYZE remote_tbl;
+
+EXPLAIN (VERBOSE, COSTS OFF, SUMMARY OFF, TIMING OFF)
+UPDATE remote_tbl d SET a= CASE WHEN current_timestamp> '2012-02-02'::timestamp THEN 5 ELSE 6 END FROM remote_tbl AS t (a, b) WHERE d.a = (t.a);
+UPDATE remote_tbl d SET a= CASE WHEN current_timestamp> '2012-02-02'::timestamp THEN 5 ELSE 6 END FROM remote_tbl AS t (a, b) WHERE d.a = (t.a);
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
Tom Lane писал 2021-06-01 21:19:
ISTM that using a specific rowtype rather than RECORD would be
quite disastrous from the standpoint of bloating the number of
distinct resjunk columns we need for a partition tree with a
lot of children. Maybe we'll have to go that way, but it seems
like an absolute last resort.
Why do you think they are distinct?
In suggested patch all of them will have type of the common ancestor
(root of the partition tree).
Seems moderately unlikely that that will work in cases where the
partition children have rowtypes different from the ancestor
(different column order etc). It'll also cause the problem we
originally sought to avoid for selects across traditional inheritance
trees, where there isn't a common partition ancestor.
regards, tom lane
I wrote:
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.
Here's a draft-quality patch based on that idea. It resolves
the offered test case, but I haven't beat on it beyond that.
regards, tom lane
Attachments:
use-proper-rowtype-for-postgres_fdw-row-ids.patchtext/x-diff; charset=us-ascii; name=use-proper-rowtype-for-postgres_fdw-row-ids.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 7df30010f2..79bc08efb4 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -10231,3 +10231,50 @@ DROP TABLE result_tbl;
DROP TABLE join_tbl;
ALTER SERVER loopback OPTIONS (DROP async_capable);
ALTER SERVER loopback2 OPTIONS (DROP async_capable);
+CREATE TABLE base_tbl (a int, b int);
+CREATE FOREIGN TABLE remote_tbl (a int, b int)
+ SERVER loopback OPTIONS (table_name 'base_tbl');
+INSERT INTO base_tbl SELECT a, a+1 FROM generate_series(1,10) a;
+ANALYZE base_tbl;
+ANALYZE remote_tbl;
+EXPLAIN (VERBOSE, COSTS OFF)
+UPDATE remote_tbl d SET a = CASE WHEN random() >= 0 THEN 5 ELSE 6 END
+ FROM remote_tbl AS t WHERE d.a = t.a;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Update on public.remote_tbl d
+ Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
+ -> Foreign Scan
+ Output: CASE WHEN (random() >= '0'::double precision) THEN 5 ELSE 6 END, d.ctid, d.*, t.*
+ Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl t)
+ Remote SQL: SELECT r1.ctid, CASE WHEN (r1.*)::text IS NOT NULL THEN ROW(r1.a, r1.b) END, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN public.base_tbl r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
+ -> Hash Join
+ Output: d.ctid, d.*, t.*
+ Hash Cond: (d.a = t.a)
+ -> Foreign Scan on public.remote_tbl d
+ Output: d.ctid, d.*, d.a
+ Remote SQL: SELECT a, b, ctid FROM public.base_tbl FOR UPDATE
+ -> Hash
+ Output: t.*, t.a
+ -> Foreign Scan on public.remote_tbl t
+ Output: t.*, t.a
+ Remote SQL: SELECT a, b FROM public.base_tbl
+(17 rows)
+
+UPDATE remote_tbl d SET a = CASE WHEN random() >= 0 THEN 5 ELSE 6 END
+ FROM remote_tbl AS t WHERE d.a = t.a;
+SELECT * FROM base_tbl ORDER BY b;
+ a | b
+---+----
+ 5 | 2
+ 5 | 3
+ 5 | 4
+ 5 | 5
+ 5 | 6
+ 5 | 7
+ 5 | 8
+ 5 | 9
+ 5 | 10
+ 5 | 11
+(10 rows)
+
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index c48a421e88..24ba60e00a 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1439,6 +1439,57 @@ postgresGetForeignPlan(PlannerInfo *root,
outer_plan);
}
+/*
+ * Construct a tuple descriptor for the scan tuples handled by a foreign join.
+ */
+static TupleDesc
+get_tupdesc_for_join_scan_tuples(ForeignScanState *node)
+{
+ ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
+ EState *estate = node->ss.ps.state;
+ TupleDesc tupdesc;
+
+ /*
+ * The core code has already set up a scan tuple slot based on
+ * fsplan->fdw_scan_tlist, and this slot's tupdesc is mostly good enough,
+ * but there's one case where it isn't. If we have any whole-row row
+ * identifier Vars, they may have vartype RECORD, and we need to replace
+ * that with the associated table's actual composite type. This ensures
+ * that when we read those ROW() expression values from the remote server,
+ * we can convert them to a composite type the local server knows.
+ */
+ tupdesc = CreateTupleDescCopy(node->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
+ for (int i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(tupdesc, i);
+ Var *var;
+ RangeTblEntry *rte;
+ Oid reltype;
+
+ /* Nothing to do if it's not a generic RECORD attribute */
+ if (att->atttypid != RECORDOID || att->atttypmod >= 0)
+ continue;
+
+ /*
+ * If we can't identify the referenced table, do nothing. This'll
+ * likely lead to failure later, but perhaps we can muddle through.
+ */
+ var = (Var *) list_nth_node(TargetEntry, fsplan->fdw_scan_tlist,
+ i)->expr;
+ if (!IsA(var, Var))
+ continue;
+ rte = list_nth(estate->es_range_table, var->varno - 1);
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+ reltype = get_rel_type_id(rte->relid);
+ if (!OidIsValid(reltype))
+ continue;
+ att->atttypid = reltype;
+ /* shouldn't need to change anything else */
+ }
+ return tupdesc;
+}
+
/*
* postgresBeginForeignScan
* Initiate an executor scan of a foreign PostgreSQL table.
@@ -1523,7 +1574,7 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
else
{
fsstate->rel = NULL;
- fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+ fsstate->tupdesc = get_tupdesc_for_join_scan_tuples(node);
}
fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
@@ -2631,7 +2682,7 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
TupleDesc tupdesc;
if (fsplan->scan.scanrelid == 0)
- tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+ tupdesc = get_tupdesc_for_join_scan_tuples(node);
else
tupdesc = RelationGetDescr(dmstate->rel);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 78379bdea5..a7070870c7 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3262,3 +3262,20 @@ DROP TABLE join_tbl;
ALTER SERVER loopback OPTIONS (DROP async_capable);
ALTER SERVER loopback2 OPTIONS (DROP async_capable);
+
+CREATE TABLE base_tbl (a int, b int);
+CREATE FOREIGN TABLE remote_tbl (a int, b int)
+ SERVER loopback OPTIONS (table_name 'base_tbl');
+
+INSERT INTO base_tbl SELECT a, a+1 FROM generate_series(1,10) a;
+
+ANALYZE base_tbl;
+ANALYZE remote_tbl;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+UPDATE remote_tbl d SET a = CASE WHEN random() >= 0 THEN 5 ELSE 6 END
+ FROM remote_tbl AS t WHERE d.a = t.a;
+UPDATE remote_tbl d SET a = CASE WHEN random() >= 0 THEN 5 ELSE 6 END
+ FROM remote_tbl AS t WHERE d.a = t.a;
+
+SELECT * FROM base_tbl ORDER BY b;
On 2/6/21 02:32, Tom Lane wrote:
I wrote:
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.Here's a draft-quality patch based on that idea. It resolves
the offered test case, but I haven't beat on it beyond that.regards, tom lane
I played with your patch and couldn't find any errors. But what if ROW
operation were allowed to be pushed to a foreign server?
Potentially, I can imagine pushed-down JOIN with arbitrary ROW function
in its target list.
Amit's approach looks more safe for me.
--
regards,
Andrey Lepikhov
Postgres Professional
On Wed, Jun 2, 2021 at 6:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.Here's a draft-quality patch based on that idea.
This looks good to me. Yeah, I agree that reversing our decision to
mark row-id wholerow Vars in as RECORD rather than a specific reltype
will have to wait until we hear more complaints than just this one,
which seems fixable with a patch like this.
It resolves
the offered test case, but I haven't beat on it beyond that.
Given that we don't (no longer) support pushing down the join of child
target relations with other relations, I don't think we have other
cases that are affected at this point. I have a feeling that your
patch will have fixed things enough that the same problem will not
occur when we have join pushdown under UPDATE occurring in more cases.
--
Amit Langote
EDB: http://www.enterprisedb.com
Tom Lane писал 2021-06-02 00:32:
I wrote:
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.Here's a draft-quality patch based on that idea. It resolves
the offered test case, but I haven't beat on it beyond that.regards, tom lane
Hi.
The patch seems to work fine for mentioned case.
For now I'm working on function pushdown. When record-returning function
(like unnest())
is pushed down, on this stage we've already lost any type information,
so get the issue again.
So far I'm not sure how to fix the issue, perhaps just avoid pushing
foreign join if we have
record, corresponding to function RTE var in joinrel->reltarget?
--
Best regards,
Alexander Pyhalov,
Postgres Professional
On Wed, Jun 2, 2021 at 4:39 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
On 2/6/21 02:32, Tom Lane wrote:
I wrote:
I think a preferable fix involves making sure that the correct
record-type typmod is propagated to record_in in this context.
Alternatively, maybe we could insert the foreign table's rowtype
during execution of the input operation, without touching the
plan as such.Here's a draft-quality patch based on that idea. It resolves
the offered test case, but I haven't beat on it beyond that.I played with your patch and couldn't find any errors. But what if ROW
operation were allowed to be pushed to a foreign server?Potentially, I can imagine pushed-down JOIN with arbitrary ROW function
in its target list.
Are you saying that a pushed down ROW() expression may not correspond
with the Var chosen by the following code?
+ /*
+ * If we can't identify the referenced table, do nothing. This'll
+ * likely lead to failure later, but perhaps we can muddle through.
+ */
+ var = (Var *) list_nth_node(TargetEntry, fsplan->fdw_scan_tlist,
+ i)->expr;
+ if (!IsA(var, Var))
+ continue;
+ rte = list_nth(estate->es_range_table, var->varno - 1);
+ if (rte->rtekind != RTE_RELATION)
+ continue;
+ reltype = get_rel_type_id(rte->relid);
+ if (!OidIsValid(reltype))
+ continue;
+ att->atttypid = reltype;
That may be a valid concern. I wonder if it would make sense to also
check varattno == 0 here somewhere for good measure.
--
Amit Langote
EDB: http://www.enterprisedb.com
Amit Langote <amitlangote09@gmail.com> writes:
On Wed, Jun 2, 2021 at 4:39 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:I played with your patch and couldn't find any errors. But what if ROW
operation were allowed to be pushed to a foreign server?
Potentially, I can imagine pushed-down JOIN with arbitrary ROW function
in its target list.
I thought about this for awhile and I don't think it's a real concern.
There's nothing stopping us from pushing an expression of the form
"func(row(...))" or "row(...) op row(...)", because we're not asking
to retrieve the value of the ROW() expression. Whether the remote
server can handle that is strictly its concern. (Probably, it's
going to do something involving a locally-assigned typmod to keep
track of the rowtype, but it's not our problem.) Where things get
sticky is if we try to *retrieve the value* of a ROW() expression.
And except in this specific context, I don't see why we'd do that.
There's no advantage compared to retrieving the component Vars
or expressions.
... I wonder if it would make sense to also
check varattno == 0 here somewhere for good measure.
Yeah, I considered doing that but left it off in this version.
It's not clear to me how there could be a table column of type RECORD,
so it seemed unnecessary. On the other hand, it's also cheap
insurance, so I'll put it back.
regards, tom lane