postgres_fdw: wrong results with self join + enable_nestloop off
Hi,
We have observed that running the same self JOIN query on postgres FDW
setup is returning different results with set enable_nestloop off & on. I
am at today's latest commit:- 928e05ddfd4031c67e101c5e74dbb5c8ec4f9e23
I created a local FDW setup. And ran this experiment on the same. Kindly
refer to the P.S section for details.
|********************************************************************|
*Below is the output difference along with query plan:-*
postgres@71609=#set enable_nestloop=off;
SET
postgres@71609=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
1 | 10 | 1 | 10
2 | 20 | 1 | 10
3 | 30 | 1 | 10
1 | 10 | 2 | 20
2 | 20 | 2 | 20
3 | 30 | 2 | 20
1 | 10 | 3 | 30
2 | 20 | 3 | 30
3 | 30 | 3 | 30
(9 rows)
postgres@71609=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..49310.40 rows=2183680 width=16) (actual
time=0.514..0.515 rows=9 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
Relations: (public.pg_tbl_foreign tbl1) INNER JOIN
(public.pg_tbl_foreign tbl2)
Remote SQL: SELECT r1.id1, r1.id2, r2.id1, r2.id2 FROM (public.pg_tbl r1
INNER JOIN public.pg_tbl r2 ON (((r1.id1 < 5))))
Planning Time: 0.139 ms
Execution Time: 0.984 ms
(6 rows)
postgres@71609=#set enable_nestloop=on;
SET
postgres@71609=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)
postgres@71609=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Result (cost=200.00..27644.00 rows=2183680 width=16) (actual
time=0.003..0.004 rows=0 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
One-Time Filter: (now() < '2020-02-23 00:00:00'::timestamp without time
zone)
-> Nested Loop (cost=200.00..27644.00 rows=2183680 width=16) (never
executed)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
-> Foreign Scan on public.pg_tbl_foreign tbl2
(cost=100.00..186.80 rows=2560 width=8) (never executed)
Output: tbl2.id1, tbl2.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl
-> Materialize (cost=100.00..163.32 rows=853 width=8) (never
executed)
Output: tbl1.id1, tbl1.id2
-> Foreign Scan on public.pg_tbl_foreign tbl1
(cost=100.00..159.06 rows=853 width=8) (never executed)
Output: tbl1.id1, tbl1.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl WHERE
((id1 < 5))
Planning Time: 0.178 ms
Execution Time: 0.292 ms
(15 rows)
|********************************************************************|
I debugged this issue and was able to find a fix for the same. Kindly
please refer to the attached fix. With the fix I am able to resolve the
issue. But I am not that confident whether this change would affect some
other existing functionally but it has helped me resolve this result
difference in output.
*What is the technical issue?*
The problem here is the use of extract_actual_clauses. Because of which the
plan creation misses adding the second condition of AND i.e "now() <
'23-Feb-2020'::timestamp" in the plan. Because it is not considered a
pseudo constant and extract_actual_clause is passed with false as the
second parameter and it gets skipped from the list. As a result that
condition is never taken into consideration as either one-time filter
(before or after) or part of SQL remote execution
*Why do I think the fix is correct?*
The fix is simple, where we have created a new function similar to
extract_actual_clause which just extracts all the conditions from the list
with no checks and returns the list to the caller. As a result all
conditions would be taken into consideration in the query plan.
*After my fix patch:-*
postgres@78754=#set enable_nestloop=off;
SET
postgres@78754=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)
^
postgres@78754=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..49310.40 rows=2183680 width=16) (actual
time=0.652..0.652 rows=0 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
Filter: (now() < '2020-02-23 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 9
Relations: (public.pg_tbl_foreign tbl1) INNER JOIN
(public.pg_tbl_foreign tbl2)
Remote SQL: SELECT r1.id1, r1.id2, r2.id1, r2.id2 FROM (public.pg_tbl r1
INNER JOIN public.pg_tbl r2 ON (((r1.id1 < 5))))
Planning Time: 0.133 ms
Execution Time: 1.127 ms
(8 rows)
postgres@78754=#set enable_nestloop=on;
SET
postgres@78754=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)
postgres@78754=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Result (cost=200.00..27644.00 rows=2183680 width=16) (actual
time=0.004..0.005 rows=0 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
One-Time Filter: (now() < '2020-02-23 00:00:00'::timestamp without time
zone)
-> Nested Loop (cost=200.00..27644.00 rows=2183680 width=16) (never
executed)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
-> Foreign Scan on public.pg_tbl_foreign tbl2
(cost=100.00..186.80 rows=2560 width=8) (never executed)
Output: tbl2.id1, tbl2.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl
-> Materialize (cost=100.00..163.32 rows=853 width=8) (never
executed)
Output: tbl1.id1, tbl1.id2
-> Foreign Scan on public.pg_tbl_foreign tbl1
(cost=100.00..159.06 rows=853 width=8) (never executed)
Output: tbl1.id1, tbl1.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl WHERE
((id1 < 5))
Planning Time: 0.134 ms
Execution Time: 0.347 ms
(15 rows)
|********************************************************************|
Kindly please comment if I am in the correct direction or not?
Regards,
Nishant Sharma.
Developer at EnterpriseDB, Pune, India.
P.S
Steps that I used to create local postgres FDW setup ( followed link -
https://www.postgresql.org/docs/current/postgres-fdw.html
<https://www.postgresql.org/docs/current/postgres-fdw.html):-> )
1) ./configure --prefix=/home/edb/POSTGRES_INSTALL/MASTER
--with-pgport=9996 --with-openssl --with-libxml --with-zlib --with-tcl
--with-perl --with-libxslt --with-ossp-uuid --with-ldap --with-pam
--enable-nls --enable-debug --enable-depend --enable-dtrace --with-selinux
--with-icu --enable-tap-tests --enable-cassert CFLAGS="-g -O0"
2) make
3) make install
4) cd contrib/postgres_fdw/
5) make install
6) Start the server
7)
[edb@localhost MASTER]$ bin/psql postgres edb;
psql (16devel)
Type "help" for help.
postgres@70613=#create database remote_db;
CREATE DATABASE
postgres@70613=#quit
[edb@localhost MASTER]$ bin/psql remote_db edb;
psql (16devel)
Type "help" for help.
remote_db@70613=#CREATE USER fdw_user;
CREATE ROLE
remote_db@70613=#GRANT ALL ON SCHEMA public TO fdw_user;
GRANT
remote_db@70613=#quit
[edb@localhost MASTER]$ bin/psql remote_db fdw_user;
psql (16devel)
Type "help" for help.
remote_db@70613=#create table pg_tbl(id1 int, id2 int);
CREATE TABLE
remote_db@70613=#insert into pg_tbl values(1, 10);
INSERT 0 1
remote_db@70613=#insert into pg_tbl values(2, 20);
INSERT 0 1
remote_db@70613=#insert into pg_tbl values(3, 30);
INSERT 0 1
8)
New terminal/Tab:-
[edb@localhost MASTER]$ bin/psql postgres edb;
postgres@71609=#create extension postgres_fdw;
CREATE EXTENSION
postgres@71609=#CREATE SERVER localhost_fdw FOREIGN DATA WRAPPER
postgres_fdw OPTIONS (dbname 'remote_db', host 'localhost', port '9996');
CREATE SERVER
postgres@71609=#CREATE USER MAPPING for edb SERVER localhost_fdw OPTIONS
(user 'fdw_user', password '');
CREATE USER MAPPING
postgres@71609=#GRANT ALL ON FOREIGN SERVER localhost_fdw TO edb;
GRANT
postgres@71609=#CREATE FOREIGN TABLE pg_tbl_foreign(id1 int, id2 int)
SERVER localhost_fdw OPTIONS (schema_name 'public', table_name 'pg_tbl');
CREATE FOREIGN TABLE
postgres@71609=#select * from pg_tbl_foreign;
id1 | id2
-----+-----
1 | 10
2 | 20
3 | 30
(3 rows)
Attachments:
PG_PATCH_set_nestloop_off_issue_fix.patchapplication/octet-stream; name=PG_PATCH_set_nestloop_off_issue_fix.patchDownload
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f5926ab..a67e7a1 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1317,8 +1317,8 @@ postgresGetForeignPlan(PlannerInfo *root,
* Instead we get the conditions to apply from the fdw_private
* structure.
*/
- remote_exprs = extract_actual_clauses(fpinfo->remote_conds, false);
- local_exprs = extract_actual_clauses(fpinfo->local_conds, false);
+ remote_exprs = extract_the_clauses(fpinfo->remote_conds);
+ local_exprs = extract_the_clauses(fpinfo->local_conds);
/*
* We leave fdw_recheck_quals empty in this case, since we never need
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index c44bd2f..a48e366 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -479,6 +479,25 @@ extract_actual_clauses(List *restrictinfo_list,
}
/*
+ * extract_the_clauses
+ *
+ * Extract bare clauses from 'restrictinfo_list'
+ */
+List *
+extract_the_clauses(List *restrictinfo_list)
+{
+ List *result = NIL;
+ ListCell *l;
+
+ foreach(l, restrictinfo_list)
+ {
+ RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
+ result = lappend(result, rinfo->clause);
+ }
+ return result;
+}
+
+/*
* extract_actual_join_clauses
*
* Extract bare clauses from 'restrictinfo_list', separating those that
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index c9e3077..e319bf4 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -35,6 +35,7 @@ extern bool restriction_is_securely_promotable(RestrictInfo *restrictinfo,
extern List *get_actual_clauses(List *restrictinfo_list);
extern List *extract_actual_clauses(List *restrictinfo_list,
bool pseudoconstant);
+extern List *extract_the_clauses(List *restrictinfo_list);
extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
Hi Nishant,
On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
I debugged this issue and was able to find a fix for the same. Kindly please refer to the attached fix. With the fix I am able to resolve the issue.
Thanks for the report and patch!
What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of which the plan creation misses adding the second condition of AND i.e "now() < '23-Feb-2020'::timestamp" in the plan. Because it is not considered a pseudo constant and extract_actual_clause is passed with false as the second parameter and it gets skipped from the list. As a result that condition is never taken into consideration as either one-time filter (before or after) or part of SQL remote executionWhy do I think the fix is correct?
The fix is simple, where we have created a new function similar to extract_actual_clause which just extracts all the conditions from the list with no checks and returns the list to the caller. As a result all conditions would be taken into consideration in the query plan.
I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan. Anyway, I will look at your patch
closely, first.
Best regards,
Etsuro Fujita
Thanks Etsuro for your response!
One small typo correction in my answer to "What is the technical issue?"
"it is *not* considered a pseudo constant" --> "it is considered a pseudo
constant"
Regards,
Nishant.
On Fri, Apr 14, 2023 at 6:21 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
Show quoted text
Hi Nishant,
On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:I debugged this issue and was able to find a fix for the same. Kindly
please refer to the attached fix. With the fix I am able to resolve the
issue.Thanks for the report and patch!
What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of whichthe plan creation misses adding the second condition of AND i.e "now() <
'23-Feb-2020'::timestamp" in the plan. Because it is not considered a
pseudo constant and extract_actual_clause is passed with false as the
second parameter and it gets skipped from the list. As a result that
condition is never taken into consideration as either one-time filter
(before or after) or part of SQL remote executionWhy do I think the fix is correct?
The fix is simple, where we have created a new function similar toextract_actual_clause which just extracts all the conditions from the list
with no checks and returns the list to the caller. As a result all
conditions would be taken into consideration in the query plan.I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan. Anyway, I will look at your patch
closely, first.Best regards,
Etsuro Fujita
Hi Etsuro Fujita,
Any updates? -- did you get a chance to look into this?
Regards,
Nishant.
On Mon, Apr 17, 2023 at 11:00 AM Nishant Sharma <
nishant.sharma@enterprisedb.com> wrote:
Show quoted text
Thanks Etsuro for your response!
One small typo correction in my answer to "What is the technical issue?"
"it is *not* considered a pseudo constant" --> "it is considered a pseudo
constant"Regards,
Nishant.On Fri, Apr 14, 2023 at 6:21 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:Hi Nishant,
On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:I debugged this issue and was able to find a fix for the same. Kindly
please refer to the attached fix. With the fix I am able to resolve the
issue.Thanks for the report and patch!
What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of whichthe plan creation misses adding the second condition of AND i.e "now() <
'23-Feb-2020'::timestamp" in the plan. Because it is not considered a
pseudo constant and extract_actual_clause is passed with false as the
second parameter and it gets skipped from the list. As a result that
condition is never taken into consideration as either one-time filter
(before or after) or part of SQL remote executionWhy do I think the fix is correct?
The fix is simple, where we have created a new function similar toextract_actual_clause which just extracts all the conditions from the list
with no checks and returns the list to the caller. As a result all
conditions would be taken into consideration in the query plan.I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan. Anyway, I will look at your patch
closely, first.Best regards,
Etsuro Fujita
On Mon, Apr 24, 2023 at 3:31 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
Any updates? -- did you get a chance to look into this?
Sorry, I have not looked into this yet, because I have been busy with
some other work recently. I plan to do so early next week.
Best regards,
Etsuro Fujita
On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.
Yes exactly. In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node. Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses. But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.
I looked at Nishant's patch. IIUC it treats the pseudoconstant clauses
as local conditions. While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node. So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.
BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *. I changed it to NIL.
Thanks
Richard
Attachments:
v1-0001-Fix-how-we-handle-pseudoconstant-clauses-for-scans-of-foreign-joins.patchapplication/octet-stream; name=v1-0001-Fix-how-we-handle-pseudoconstant-clauses-for-scans-of-foreign-joins.patchDownload
From 32c854736b90aacad38715ab6c02b73140c8c429 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 25 Apr 2023 16:10:42 +0800
Subject: [PATCH v1] Fix how we handle pseudoconstant clauses for scans of
foreign joins
When creating a scan plan, we need to extract all the pseudoconstant clauses
which are supposed to be used as one-time quals in a gating Result node.
Normally it suffices to check against baserestrictinfo of the parent relation
of 'best_path', as well as the join clauses available from the outer relations
if this is a parameterized scan. But for scans of foreign joins, we do not
have any restriction clauses in these places, and that leads to gating clauses
being lost in this case.
To fix, store the restriction clauses for foreign joins in ForeignPath and
check against them for gating clauses.
---
.../postgres_fdw/expected/postgres_fdw.out | 21 +++++++++++++++++++
contrib/postgres_fdw/postgres_fdw.c | 13 +++++++-----
contrib/postgres_fdw/sql/postgres_fdw.sql | 6 ++++++
src/backend/optimizer/plan/createplan.c | 9 +++++++-
src/backend/optimizer/util/pathnode.c | 4 ++++
src/include/nodes/pathnodes.h | 5 +++++
src/include/optimizer/pathnode.h | 1 +
7 files changed, 53 insertions(+), 6 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index fd5752bd5b..c8b072c255 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -893,6 +893,27 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
4 | 4 | 00004 | Mon Jan 05 00:00:00 1970 PST | Mon Jan 05 00:00:00 1970 | 4 | 4 | foo
(4 rows)
+-- bug due to sloppy handling of pseudoconstant clauses for foreign joins
+EXPLAIN (VERBOSE, COSTS OFF)
+ SELECT * FROM ft2 a, ft2 b
+ WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Result
+ Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
+ One-Time Filter: (now() < 'Tue Apr 25 00:00:00 2023'::timestamp without time zone)
+ -> Foreign Scan
+ Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2, b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
+ Relations: (public.ft2 a) INNER JOIN (public.ft2 b)
+ Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8, r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1"))))
+(7 rows)
+
+SELECT * FROM ft2 a, ft2 b
+WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
+ c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----
+(0 rows)
+
-- we should not push order by clause with volatile expressions or unsafe
-- collations
EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 95dbe8b06c..2c79981289 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -524,7 +524,7 @@ static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
RelOptInfo *rel);
static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path);
+ Path *epq_path, List *joinrestrictinfo);
static void add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
add_path(baserel, (Path *) path);
/* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);
/*
* If we're not using remote estimates, stop here. We have no way to
@@ -5992,7 +5992,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
static void
add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path)
+ Path *epq_path, List *joinrestrictinfo)
{
List *useful_pathkeys_list = NIL; /* List of all pathkeys */
ListCell *lc;
@@ -6096,6 +6096,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
total_cost,
useful_pathkeys,
rel->lateral_relids,
+ joinrestrictinfo,
sorted_epq_path,
NIL));
}
@@ -6348,6 +6349,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
total_cost,
NIL, /* no pathkeys */
joinrel->lateral_relids,
+ extra->restrictlist,
epq_path,
NIL); /* no fdw_private */
@@ -6355,7 +6357,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
add_path(joinrel, (Path *) joinpath);
/* Consider pathkeys for the join relation */
- add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
+ add_paths_with_pathkeys_for_rel(root, joinrel, epq_path,
+ extra->restrictlist);
/* XXX Consider parameterized paths for the join relation */
}
@@ -6981,7 +6984,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
path->total_cost,
path->pathkeys,
NULL, /* no extra plan */
- NULL); /* no fdw_private */
+ NIL); /* no fdw_private */
/* and add it to the final_rel */
add_path(final_rel, (Path *) final_path);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index c05046f867..246f4c15d8 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -355,6 +355,12 @@ WHERE a.c2 = 6 AND b.c1 = a.c1 AND a.c8 = 'foo' AND b.c7 = upper(a.c7);
-- bug before 9.3.5 due to sloppy handling of remote-estimate parameters
SELECT * FROM ft1 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft2 WHERE c1 < 5));
SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5));
+-- bug due to sloppy handling of pseudoconstant clauses for foreign joins
+EXPLAIN (VERBOSE, COSTS OFF)
+ SELECT * FROM ft2 a, ft2 b
+ WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
+SELECT * FROM ft2 a, ft2 b
+WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
-- we should not push order by clause with volatile expressions or unsafe
-- collations
EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 910ffbf1e1..7e20e89813 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -599,8 +599,15 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
* Detect whether we have any pseudoconstant quals to deal with. Then, if
* we'll need a gating Result node, it will be able to project, so there
* are no requirements on the child's tlist.
+ *
+ * Note that for scans of foreign joins, we do not have restriction clauses
+ * stored in baserestrictinfo and we do not consider parameterization.
+ * Instead we need to check against joinrestrictinfo stored in ForeignPath.
*/
- gating_clauses = get_gating_quals(root, scan_clauses);
+ if (best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))
+ gating_clauses = get_gating_quals(root, ((ForeignPath *) best_path)->joinrestrictinfo);
+ else
+ gating_clauses = get_gating_quals(root, scan_clauses);
if (gating_clauses)
flags = 0;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 5f5596841c..92f18c6846 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2249,6 +2249,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.total_cost = total_cost;
pathnode->path.pathkeys = pathkeys;
+ pathnode->joinrestrictinfo = NIL;
pathnode->fdw_outerpath = fdw_outerpath;
pathnode->fdw_private = fdw_private;
@@ -2272,6 +2273,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Relids required_outer,
+ List *joinrestrictinfo,
Path *fdw_outerpath,
List *fdw_private)
{
@@ -2299,6 +2301,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.total_cost = total_cost;
pathnode->path.pathkeys = pathkeys;
+ pathnode->joinrestrictinfo = joinrestrictinfo;
pathnode->fdw_outerpath = fdw_outerpath;
pathnode->fdw_private = fdw_private;
@@ -2344,6 +2347,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.total_cost = total_cost;
pathnode->path.pathkeys = pathkeys;
+ pathnode->joinrestrictinfo = NIL;
pathnode->fdw_outerpath = fdw_outerpath;
pathnode->fdw_private = fdw_private;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index cf28416da8..5adf35ac1d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1828,6 +1828,10 @@ typedef struct SubqueryScanPath
* ForeignPath represents a potential scan of a foreign table, foreign join
* or foreign upper-relation.
*
+ * In the case of foreign join, joinrestrictinfo stores the RestrictInfos to
+ * apply to this join. It is needed because we need to extract gating clauses
+ * from it.
+ *
* fdw_private stores FDW private data about the scan. While fdw_private is
* not actually touched by the core code during normal operations, it's
* generally a good idea to use a representation that can be dumped by
@@ -1837,6 +1841,7 @@ typedef struct SubqueryScanPath
typedef struct ForeignPath
{
Path path;
+ List *joinrestrictinfo; /* RestrictInfos to apply to join */
Path *fdw_outerpath;
List *fdw_private;
} ForeignPath;
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 69be701b16..9f8f1b2798 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -134,6 +134,7 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Relids required_outer,
+ List *joinrestrictinfo,
Path *fdw_outerpath,
List *fdw_private);
extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
--
2.31.0
+1 for fixing this in the backend code rather than FDW code.
Thanks, Richard, for working on this. The patch looks good to me at
a glance.
On Tue, Apr 25, 2023 at 3:36 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.Yes exactly. In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node. Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses. But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.I looked at Nishant's patch. IIUC it treats the pseudoconstant clauses
as local conditions. While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node. So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *. I changed it to NIL.Thanks
Richard
--
--
Thanks & Regards,
Suraj kharage,
edbpostgres.com
I also agree that Richard's patch is better. As it fixes the issue at the
backend and does not treat pseudoconstant as local condition.
I have tested Richard's patch and observe that it is resolving the problem.
Patch looks good to me as well.
*I only had a minor comment on below change:-*
*- gating_clauses = get_gating_quals(root, scan_clauses);+ if
(best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))+
gating_clauses = get_gating_quals(root, ((ForeignPath *)
best_path)->joinrestrictinfo);+ else+ gating_clauses =
get_gating_quals(root, scan_clauses);*
Instead of using 'if' and creating a special case here can't we do
something in the above switch?
Regards,
Nishant.
P.S
I tried something quickly but I am seeing a crash:-
* case T_IndexOnlyScan: scan_clauses
= castNode(IndexPath, best_path)->indexinfo->indrestrictinfo;
break;+ case T_ForeignScan:+
/*+ * Note that for scans of foreign joins, we do
not have restriction clauses+ * stored in
baserestrictinfo and we do not consider parameterization.+
* Instead we need to check against joinrestrictinfo stored in
ForeignPath.+ */+ if
(IS_JOIN_REL(rel))+ scan_clauses =
((ForeignPath *) best_path)->joinrestrictinfo;+ else+
scan_clauses = rel->baserestrictinfo;+
break; default:
scan_clauses = rel->baserestrictinfo; break;*
On Fri, Jun 2, 2023 at 9:00 AM Suraj Kharage <suraj.kharage@enterprisedb.com>
wrote:
Show quoted text
+1 for fixing this in the backend code rather than FDW code.
Thanks, Richard, for working on this. The patch looks good to me at
a glance.On Tue, Apr 25, 2023 at 3:36 PM Richard Guo <guofenglinux@gmail.com>
wrote:On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.Yes exactly. In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node. Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses. But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.I looked at Nishant's patch. IIUC it treats the pseudoconstant clauses
as local conditions. While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node. So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *. I changed it to NIL.Thanks
Richard--
--Thanks & Regards,
Suraj kharage,edbpostgres.com
On Fri, Jun 2, 2023 at 11:30 AM Suraj Kharage <
suraj.kharage@enterprisedb.com> wrote:
+1 for fixing this in the backend code rather than FDW code.
Thanks, Richard, for working on this. The patch looks good to me at
a glance.
Thank you Suraj for the review!
Thanks
Richard
On Fri, Jun 2, 2023 at 8:31 PM Nishant Sharma <
nishant.sharma@enterprisedb.com> wrote:
*I only had a minor comment on below change:-*
*- gating_clauses = get_gating_quals(root, scan_clauses);+ if
(best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))+
gating_clauses = get_gating_quals(root, ((ForeignPath *)
best_path)->joinrestrictinfo);+ else+ gating_clauses =
get_gating_quals(root, scan_clauses);*Instead of using 'if' and creating a special case here can't we do
something in the above switch?
I thought about that too. IIRC I did not do it in that way because
postgresGetForeignPlan expects that there is no scan_clauses for a join
rel. So doing that would trigger the Assert there.
/*
* For a join rel, baserestrictinfo is NIL and we are not considering
* parameterization right now, so there should be no scan_clauses for
* a joinrel or an upper rel either.
*/
Assert(!scan_clauses);
Thanks
Richard
Hi,
On Fri, Jun 2, 2023 at 9:31 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
I also agree that Richard's patch is better. As it fixes the issue at the backend and does not treat pseudoconstant as local condition.
I have tested Richard's patch and observe that it is resolving the problem. Patch looks good to me as well.
If the patch is intended for HEAD only, I also think it goes in the
right direction. But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.
My apologies for not reviewing your patch and the long long delay.
Best regards,
Etsuro Fujita
Attachments:
disallow-join-pushdown-if-pseudoconstants.patchapplication/octet-stream; name=disallow-join-pushdown-if-pseudoconstants.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c8c4614b54..98df6371c0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2316,6 +2316,32 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
1
(10 rows)
+-- join with pseudoconstant quals, not pushed down.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ Limit
+ Output: t1.c1, t2.c1, t1.c3
+ -> Sort
+ Output: t1.c1, t2.c1, t1.c3
+ Sort Key: t1.c3, t1.c1
+ -> Result
+ Output: t1.c1, t2.c1, t1.c3
+ One-Time Filter: (CURRENT_USER = SESSION_USER)
+ -> Hash Join
+ Output: t1.c1, t1.c3, t2.c1
+ Hash Cond: (t2.c1 = t1.c1)
+ -> Foreign Scan on public.ft2 t2
+ Output: t2.c1
+ Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
+ -> Hash
+ Output: t1.c1, t1.c3
+ -> Foreign Scan on public.ft1 t1
+ Output: t1.c1, t1.c3
+ Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1"
+(19 rows)
+
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
-- unable to push {ft1, ft2}
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index b54903ad8f..ef3ef4e2ed 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -640,6 +640,9 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
+-- join with pseudoconstant quals, not pushed down.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index cd80e61fd7..092be6fbfc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -24,6 +24,7 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
#include "utils/typcache.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
@@ -325,9 +326,15 @@ add_paths_to_joinrel(PlannerInfo *root,
* 5. If inner and outer relations are foreign tables (or joins) belonging
* to the same server and assigned to the same user to check access
* permissions as, give the FDW a chance to push down joins.
+ *
+ * createplan.c doesn't currently support handling of pseudoconstant
+ * clauses in ForeignScan nodes that push down joins to remote servers; if
+ * the restrictlist has such clauses then disallow the FDW to consider
+ * pushing down joins.
*/
if (joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths)
+ joinrel->fdwroutine->GetForeignJoinPaths &&
+ !has_pseudoconstant_clauses(root, restrictlist))
joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
outerrel, innerrel,
jointype, &extra);
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index d6d26a2b51..537c3d4373 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -508,6 +508,36 @@ extract_actual_clauses(List *restrictinfo_list,
return result;
}
+/*
+ * has_pseudoconstant_clauses
+ *
+ * Returns true if 'restrictinfo_list' includes pseudoconstant clauses.
+ *
+ * This is used when we determine whether to give the FDW a chance to push
+ * down joins to the remote server in add_paths_to_joinrel().
+ */
+bool
+has_pseudoconstant_clauses(PlannerInfo *root,
+ List *restrictinfo_list)
+{
+ ListCell *l;
+
+ /* No need to look if we know there are no pseudoconstants */
+ if (!root->hasPseudoConstantQuals)
+ return false;
+
+ /* See if there are pseudoconstants in the RestrictInfo list */
+ foreach(l, restrictinfo_list)
+ {
+ RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
+
+ if (rinfo->pseudoconstant &&
+ !rinfo_is_constant_true(rinfo))
+ return true;
+ }
+ return false;
+}
+
/*
* extract_actual_join_clauses
*
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index e140e619ac..14cdce750c 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -43,6 +43,8 @@ extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
List **otherquals);
+extern bool has_pseudoconstant_clauses(PlannerInfo *root,
+ List *restrictinfo_list);
extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel);
extern bool join_clause_is_movable_into(RestrictInfo *rinfo,
Relids currentrelids,
On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
If the patch is intended for HEAD only, I also think it goes in the
right direction. But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)
Thanks for pointing this out. You're right. The patch has backport
issue because of the ABI breakage. So it can only be applied on HEAD.
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.
I think we can do that in back branches. But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases. I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.
Thanks
Richard
Hi,
Etsuro's patch is also showing the correct output for "set
enable_nestloop=off". Looks good to me for back branches due to backport
issues.
But below are a few observations for the same:-
1) I looked into the query plan for both "set enable_nestloop" on & off
case and observe that they are the same. That is, what we see with "set
enable_nestloop=on".
2) In back branches for "set enable_nestloop" on & off value, at least this
type of query execution won't make any difference. No comparison of plans
to be selected based on total cost of two plans old (Nested Loop with
Foreign Scans) & new (Only Foreign Scan) will be done, because we are
avoiding the call to "postgresGetForeignJoinPaths()" up front when we have
pseudo constants.
Regards,
Nishant.
On Tue, Jun 6, 2023 at 8:50 AM Richard Guo <guofenglinux@gmail.com> wrote:
Show quoted text
On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:If the patch is intended for HEAD only, I also think it goes in the
right direction. But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)Thanks for pointing this out. You're right. The patch has backport
issue because of the ABI breakage. So it can only be applied on HEAD.To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.I think we can do that in back branches. But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases. I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.Thanks
Richard
Hi Richard,
On Tue, Jun 6, 2023 at 12:20 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.
I think we can do that in back branches. But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases. I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.
Yeah, it is unfortunate that we would not get better plans. Given
that it took quite a long time to find this issue, I suppose that
users seldom do foreign joins with pseudoconstant clauses, though.
Anyway thanks for working on this, Richard!
Best regards,
Etsuro Fujita
Hi,
On Wed, Jun 7, 2023 at 7:28 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
Etsuro's patch is also showing the correct output for "set enable_nestloop=off". Looks good to me for back branches due to backport issues.
But below are a few observations for the same:-
1) I looked into the query plan for both "set enable_nestloop" on & off case and observe that they are the same. That is, what we see with "set enable_nestloop=on".
2) In back branches for "set enable_nestloop" on & off value, at least this type of query execution won't make any difference. No comparison of plans to be selected based on total cost of two plans old (Nested Loop with Foreign Scans) & new (Only Foreign Scan) will be done, because we are avoiding the call to "postgresGetForeignJoinPaths()" up front when we have pseudo constants.
Thanks for looking!
Best regards,
Etsuro Fujita
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.
I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
new version of the patch.
Best regards,
Etsuro Fujita
Attachments:
disallow-join-pushdown-if-pseudoconstants-v2.patchapplication/octet-stream; name=disallow-join-pushdown-if-pseudoconstants-v2.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c8c4614b54..98df6371c0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2316,6 +2316,32 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
1
(10 rows)
+-- join with pseudoconstant quals, not pushed down.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ Limit
+ Output: t1.c1, t2.c1, t1.c3
+ -> Sort
+ Output: t1.c1, t2.c1, t1.c3
+ Sort Key: t1.c3, t1.c1
+ -> Result
+ Output: t1.c1, t2.c1, t1.c3
+ One-Time Filter: (CURRENT_USER = SESSION_USER)
+ -> Hash Join
+ Output: t1.c1, t1.c3, t2.c1
+ Hash Cond: (t2.c1 = t1.c1)
+ -> Foreign Scan on public.ft2 t2
+ Output: t2.c1
+ Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
+ -> Hash
+ Output: t1.c1, t1.c3
+ -> Foreign Scan on public.ft1 t1
+ Output: t1.c1, t1.c3
+ Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1"
+(19 rows)
+
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
-- unable to push {ft1, ft2}
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index b54903ad8f..ef3ef4e2ed 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -640,6 +640,9 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
+-- join with pseudoconstant quals, not pushed down.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 5ba266fdb6..98713f5fd6 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -24,6 +24,7 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
#include "utils/typcache.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
@@ -130,6 +131,7 @@ add_paths_to_joinrel(PlannerInfo *root,
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
+ bool consider_join_pushdown = false;
ListCell *lc;
Relids joinrelids;
@@ -321,13 +323,26 @@ add_paths_to_joinrel(PlannerInfo *root,
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
jointype, &extra);
+ /*
+ * createplan.c doesn't currently support handling of pseudoconstant
+ * clauses in ForeignScan/CustomScan nodes that push down joins; if the
+ * restrictlist has such clauses then disallow extensions to consider
+ * pushing down joins.
+ */
+ if ((joinrel->fdwroutine &&
+ joinrel->fdwroutine->GetForeignJoinPaths) ||
+ set_join_pathlist_hook)
+ consider_join_pushdown = !has_pseudoconstant_clauses(root,
+ restrictlist);
+
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
* to the same server and assigned to the same user to check access
* permissions as, give the FDW a chance to push down joins.
*/
if (joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths)
+ joinrel->fdwroutine->GetForeignJoinPaths &&
+ consider_join_pushdown)
joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
outerrel, innerrel,
jointype, &extra);
@@ -335,7 +350,8 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
* 6. Finally, give extensions a chance to manipulate the path list.
*/
- if (set_join_pathlist_hook)
+ if (set_join_pathlist_hook &&
+ consider_join_pushdown)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
jointype, &extra);
}
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index d6d26a2b51..701f8812c9 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -508,6 +508,36 @@ extract_actual_clauses(List *restrictinfo_list,
return result;
}
+/*
+ * has_pseudoconstant_clauses
+ *
+ * Returns true if 'restrictinfo_list' includes pseudoconstant clauses.
+ *
+ * This is used when we determine whether to allow extensions to consider
+ * pushing down joins to remote sides in add_paths_to_joinrel().
+ */
+bool
+has_pseudoconstant_clauses(PlannerInfo *root,
+ List *restrictinfo_list)
+{
+ ListCell *l;
+
+ /* No need to look if we know there are no pseudoconstants */
+ if (!root->hasPseudoConstantQuals)
+ return false;
+
+ /* See if there are pseudoconstants in the RestrictInfo list */
+ foreach(l, restrictinfo_list)
+ {
+ RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
+
+ if (rinfo->pseudoconstant &&
+ !rinfo_is_constant_true(rinfo))
+ return true;
+ }
+ return false;
+}
+
/*
* extract_actual_join_clauses
*
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index e140e619ac..14cdce750c 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -43,6 +43,8 @@ extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
List **otherquals);
+extern bool has_pseudoconstant_clauses(PlannerInfo *root,
+ List *restrictinfo_list);
extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel);
extern bool join_clause_is_movable_into(RestrictInfo *rinfo,
Relids currentrelids,
Looks good to me. Tested on master and it works.
New patch used a bool flag to avoid calls for both FDW and custom hook's
call. And a slight change in comment of "has_pseudoconstant_clauses"
function.
Regards,
Nishant.
On Wed, Jun 14, 2023 at 12:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
Show quoted text
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
new version of the patch.Best regards,
Etsuro Fujita
On Wed, Jun 14, 2023 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
new version of the patch.
Good point. The v2 patch looks good to me for back branches.
I'm wondering what the plan is for HEAD. Should we also disallow
foreign/custom join pushdown in the case that there is any
pseudoconstant restriction clause, or instead still allow join pushdown
in that case? If it is the latter, I think we can do something like my
patch upthread does. But that patch needs to be revised to consider
custom scans, maybe by storing the restriction clauses also in
CustomPath?
Thanks
Richard
Hi Richard,
On Sun, Jun 25, 2023 at 3:05 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Wed, Jun 14, 2023 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
Good point. The v2 patch looks good to me for back branches.
Cool! Thanks for looking!
I'm wondering what the plan is for HEAD. Should we also disallow
foreign/custom join pushdown in the case that there is any
pseudoconstant restriction clause, or instead still allow join pushdown
in that case? If it is the latter, I think we can do something like my
patch upthread does. But that patch needs to be revised to consider
custom scans, maybe by storing the restriction clauses also in
CustomPath?
I think we should choose the latter, so I modified your patch as
mentioned, after re-creating it on top of my patch. Attached is a new
version (0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patch).
I am attaching my patch as well
(0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patch).
Other changes made to your patch:
* I renamed the new member of the ForeignPath struct to
fdw_restrictinfo. (And I named that of the CustomPath struct
custom_restrictinfo.)
* In your patch, only for create_foreign_join_path(), the API was
modified so that the caller provides the new member of ForeignPath,
but I modified that for
create_foreignscan_path()/create_foreign_upper_path() as well, for
consistency.
* In this bit I changed the last argument to NIL, which would be
nitpicking, though.
@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
add_path(baserel, (Path *) path);
/* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);
* I dropped this test case, because it would not be stable if the
system clock was too slow.
+-- bug due to sloppy handling of pseudoconstant clauses for foreign joins
+EXPLAIN (VERBOSE, COSTS OFF)
+ SELECT * FROM ft2 a, ft2 b
+ WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
+SELECT * FROM ft2 a, ft2 b
+WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
That is it.
Sorry for the long long delay.
Best regards,
Etsuro Fujita
Attachments:
0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patchapplication/octet-stream; name=0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 852b5b4707..f6d3b8ec08 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2316,6 +2316,32 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
1
(10 rows)
+-- join with pseudoconstant quals, not pushed down.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ Limit
+ Output: t1.c1, t2.c1, t1.c3
+ -> Sort
+ Output: t1.c1, t2.c1, t1.c3
+ Sort Key: t1.c3, t1.c1
+ -> Result
+ Output: t1.c1, t2.c1, t1.c3
+ One-Time Filter: (CURRENT_USER = SESSION_USER)
+ -> Hash Join
+ Output: t1.c1, t1.c3, t2.c1
+ Hash Cond: (t2.c1 = t1.c1)
+ -> Foreign Scan on public.ft2 t2
+ Output: t2.c1
+ Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
+ -> Hash
+ Output: t1.c1, t1.c3
+ -> Foreign Scan on public.ft1 t1
+ Output: t1.c1, t1.c3
+ Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1"
+(19 rows)
+
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
-- unable to push {ft1, ft2}
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 2fe8abc7af..436feee396 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -640,6 +640,9 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
+-- join with pseudoconstant quals, not pushed down.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index f047ad9ba4..c2034268ca 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -24,6 +24,7 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+#include "optimizer/restrictinfo.h"
#include "utils/typcache.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
@@ -130,6 +131,7 @@ add_paths_to_joinrel(PlannerInfo *root,
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
+ bool consider_join_pushdown = false;
ListCell *lc;
Relids joinrelids;
@@ -321,13 +323,26 @@ add_paths_to_joinrel(PlannerInfo *root,
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
jointype, &extra);
+ /*
+ * createplan.c does not currently support handling of pseudoconstant
+ * clauses in ForeignScan/CustomScan nodes that push down joins; if the
+ * restrictlist has such clauses then disallow extensions to consider
+ * pushing down joins.
+ */
+ if ((joinrel->fdwroutine &&
+ joinrel->fdwroutine->GetForeignJoinPaths) ||
+ set_join_pathlist_hook)
+ consider_join_pushdown = !has_pseudoconstant_clauses(root,
+ restrictlist);
+
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
* to the same server and assigned to the same user to check access
* permissions as, give the FDW a chance to push down joins.
*/
if (joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths)
+ joinrel->fdwroutine->GetForeignJoinPaths &&
+ consider_join_pushdown)
joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
outerrel, innerrel,
jointype, &extra);
@@ -335,7 +350,8 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
* 6. Finally, give extensions a chance to manipulate the path list.
*/
- if (set_join_pathlist_hook)
+ if (set_join_pathlist_hook &&
+ consider_join_pushdown)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
jointype, &extra);
}
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index d6d26a2b51..701f8812c9 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -508,6 +508,36 @@ extract_actual_clauses(List *restrictinfo_list,
return result;
}
+/*
+ * has_pseudoconstant_clauses
+ *
+ * Returns true if 'restrictinfo_list' includes pseudoconstant clauses.
+ *
+ * This is used when we determine whether to allow extensions to consider
+ * pushing down joins to remote sides in add_paths_to_joinrel().
+ */
+bool
+has_pseudoconstant_clauses(PlannerInfo *root,
+ List *restrictinfo_list)
+{
+ ListCell *l;
+
+ /* No need to look if we know there are no pseudoconstants */
+ if (!root->hasPseudoConstantQuals)
+ return false;
+
+ /* See if there are pseudoconstants in the RestrictInfo list */
+ foreach(l, restrictinfo_list)
+ {
+ RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
+
+ if (rinfo->pseudoconstant &&
+ !rinfo_is_constant_true(rinfo))
+ return true;
+ }
+ return false;
+}
+
/*
* extract_actual_join_clauses
*
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index e140e619ac..14cdce750c 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -43,6 +43,8 @@ extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
List **otherquals);
+extern bool has_pseudoconstant_clauses(PlannerInfo *root,
+ List *restrictinfo_list);
extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel);
extern bool join_clause_is_movable_into(RestrictInfo *rinfo,
Relids currentrelids,
0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patchapplication/octet-stream; name=0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patchDownload
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 9e330b9934..f14e6fe2e0 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -581,6 +581,7 @@ fileGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
baserel->lateral_relids,
NULL, /* no extra plan */
+ NIL,
coptions));
/*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f6d3b8ec08..77df7eb8e4 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2316,31 +2316,21 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
1
(10 rows)
--- join with pseudoconstant quals, not pushed down.
+-- join with pseudoconstant quals
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- QUERY PLAN
--------------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit
Output: t1.c1, t2.c1, t1.c3
- -> Sort
+ -> Result
Output: t1.c1, t2.c1, t1.c3
- Sort Key: t1.c3, t1.c1
- -> Result
- Output: t1.c1, t2.c1, t1.c3
- One-Time Filter: (CURRENT_USER = SESSION_USER)
- -> Hash Join
- Output: t1.c1, t1.c3, t2.c1
- Hash Cond: (t2.c1 = t1.c1)
- -> Foreign Scan on public.ft2 t2
- Output: t2.c1
- Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
- -> Hash
- Output: t1.c1, t1.c3
- -> Foreign Scan on public.ft1 t1
- Output: t1.c1, t1.c3
- Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1"
-(19 rows)
+ One-Time Filter: (CURRENT_USER = SESSION_USER)
+ -> Foreign Scan
+ Output: t1.c1, t1.c3, t2.c1
+ Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+ Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
+(9 rows)
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index c5cada55fb..2804a13313 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -524,7 +524,7 @@ static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
RelOptInfo *rel);
static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path);
+ Path *epq_path, List *restrictlist);
static void add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
@@ -1034,11 +1034,12 @@ postgresGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
baserel->lateral_relids,
NULL, /* no extra plan */
+ NIL,
NIL); /* no fdw_private list */
add_path(baserel, (Path *) path);
/* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NIL);
/*
* If we're not using remote estimates, stop here. We have no way to
@@ -1206,6 +1207,7 @@ postgresGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
param_info->ppi_req_outer,
NULL,
+ NIL,
NIL); /* no fdw_private list */
add_path(baserel, (Path *) path);
}
@@ -5991,7 +5993,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
static void
add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path)
+ Path *epq_path, List *restrictlist)
{
List *useful_pathkeys_list = NIL; /* List of all pathkeys */
ListCell *lc;
@@ -6085,6 +6087,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
useful_pathkeys,
rel->lateral_relids,
sorted_epq_path,
+ NIL,
NIL));
else
add_path(rel, (Path *)
@@ -6096,6 +6099,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
useful_pathkeys,
rel->lateral_relids,
sorted_epq_path,
+ restrictlist,
NIL));
}
}
@@ -6348,13 +6352,15 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
NIL, /* no pathkeys */
joinrel->lateral_relids,
epq_path,
+ extra->restrictlist,
NIL); /* no fdw_private */
/* Add generated path into joinrel by add_path(). */
add_path(joinrel, (Path *) joinpath);
/* Consider pathkeys for the join relation */
- add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
+ add_paths_with_pathkeys_for_rel(root, joinrel, epq_path,
+ extra->restrictlist);
/* XXX Consider parameterized paths for the join relation */
}
@@ -6735,6 +6741,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
NIL, /* no pathkeys */
NULL,
+ NIL,
NIL); /* no fdw_private */
/* Add generated path into grouped_rel by add_path(). */
@@ -6868,6 +6875,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
root->sort_pathkeys,
NULL, /* no extra plan */
+ NIL,
fdw_private);
/* and add it to the ordered_rel */
@@ -6983,7 +6991,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
path->total_cost,
path->pathkeys,
NULL, /* no extra plan */
- NULL); /* no fdw_private */
+ NIL,
+ NIL); /* no fdw_private */
/* and add it to the final_rel */
add_path(final_rel, (Path *) final_path);
@@ -7103,6 +7112,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
pathkeys,
NULL, /* no extra plan */
+ NIL,
fdw_private);
/* and add it to the final_rel */
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 436feee396..cfb1b57e33 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -640,7 +640,7 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
--- join with pseudoconstant quals, not pushed down.
+-- join with pseudoconstant quals
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 93d96f2f56..d30aaeb024 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -62,6 +62,7 @@ typedef struct CustomPath
Path path;
uint32 flags;
List *custom_paths;
+ List *custom_restrictinfo;
List *custom_private;
const CustomPathMethods *methods;
} CustomPath;
@@ -85,6 +86,11 @@ typedef struct CustomPath
An optional <structfield>custom_paths</structfield> is a list of <structname>Path</structname>
nodes used by this custom-path node; these will be transformed into
<structname>Plan</structname> nodes by planner.
+ As described below, custom paths can be created for join relations as
+ well. In such a case, <structfield>custom_restrictinfo</structfield>
+ should be used to store a list of <structname>RestrictInfo</structname>
+ nodes to apply to the join, which will be used by planner to convert the
+ custom path to a finished plan. Otherwise it should be NIL.
<structfield>custom_private</structfield> can be used to store the custom path's
private data. Private data should be stored in a form that can be handled
by <literal>nodeToString</literal>, so that debugging routines that attempt to
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index c2034268ca..f047ad9ba4 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -24,7 +24,6 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
-#include "optimizer/restrictinfo.h"
#include "utils/typcache.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
@@ -131,7 +130,6 @@ add_paths_to_joinrel(PlannerInfo *root,
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
- bool consider_join_pushdown = false;
ListCell *lc;
Relids joinrelids;
@@ -323,26 +321,13 @@ add_paths_to_joinrel(PlannerInfo *root,
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
jointype, &extra);
- /*
- * createplan.c does not currently support handling of pseudoconstant
- * clauses in ForeignScan/CustomScan nodes that push down joins; if the
- * restrictlist has such clauses then disallow extensions to consider
- * pushing down joins.
- */
- if ((joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths) ||
- set_join_pathlist_hook)
- consider_join_pushdown = !has_pseudoconstant_clauses(root,
- restrictlist);
-
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
* to the same server and assigned to the same user to check access
* permissions as, give the FDW a chance to push down joins.
*/
if (joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths &&
- consider_join_pushdown)
+ joinrel->fdwroutine->GetForeignJoinPaths)
joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
outerrel, innerrel,
jointype, &extra);
@@ -350,8 +335,7 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
* 6. Finally, give extensions a chance to manipulate the path list.
*/
- if (set_join_pathlist_hook &&
- consider_join_pushdown)
+ if (set_join_pathlist_hook)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
jointype, &extra);
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index af48109058..f40910a8bb 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -599,8 +599,26 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
* Detect whether we have any pseudoconstant quals to deal with. Then, if
* we'll need a gating Result node, it will be able to project, so there
* are no requirements on the child's tlist.
+ *
+ * If replacing a join, the FDW or custom scan provider would have stored
+ * the joinrestrictinfo list in the best path; check against it in that
+ * case.
*/
- gating_clauses = get_gating_quals(root, scan_clauses);
+ if (IS_JOIN_REL(rel))
+ {
+ List *join_clauses;
+
+ Assert(best_path->pathtype == T_ForeignScan ||
+ best_path->pathtype == T_CustomScan);
+ if (best_path->pathtype == T_ForeignScan)
+ join_clauses = ((ForeignPath *) best_path)->fdw_restrictinfo;
+ else
+ join_clauses = ((CustomPath *) best_path)->custom_restrictinfo;
+
+ gating_clauses = get_gating_quals(root, join_clauses);
+ }
+ else
+ gating_clauses = get_gating_quals(root, scan_clauses);
if (gating_clauses)
flags = 0;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f123fcb41e..fb3b854c55 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2229,6 +2229,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2250,6 +2251,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -2273,6 +2275,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2300,6 +2303,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -2322,6 +2326,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2345,6 +2350,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index 701f8812c9..d6d26a2b51 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -508,36 +508,6 @@ extract_actual_clauses(List *restrictinfo_list,
return result;
}
-/*
- * has_pseudoconstant_clauses
- *
- * Returns true if 'restrictinfo_list' includes pseudoconstant clauses.
- *
- * This is used when we determine whether to allow extensions to consider
- * pushing down joins to remote sides in add_paths_to_joinrel().
- */
-bool
-has_pseudoconstant_clauses(PlannerInfo *root,
- List *restrictinfo_list)
-{
- ListCell *l;
-
- /* No need to look if we know there are no pseudoconstants */
- if (!root->hasPseudoConstantQuals)
- return false;
-
- /* See if there are pseudoconstants in the RestrictInfo list */
- foreach(l, restrictinfo_list)
- {
- RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
-
- if (rinfo->pseudoconstant &&
- !rinfo_is_constant_true(rinfo))
- return true;
- }
- return false;
-}
-
/*
* extract_actual_join_clauses
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c17b53f7ad..50c1f4f01a 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1822,6 +1822,10 @@ typedef struct SubqueryScanPath
* ForeignPath represents a potential scan of a foreign table, foreign join
* or foreign upper-relation.
*
+ * In the case of foreign join, fdw_restrictinfo stores the RestrictInfos to
+ * apply to this join. It is needed because we need to extract gating clauses
+ * from it.
+ *
* fdw_private stores FDW private data about the scan. While fdw_private is
* not actually touched by the core code during normal operations, it's
* generally a good idea to use a representation that can be dumped by
@@ -1832,6 +1836,7 @@ typedef struct ForeignPath
{
Path path;
Path *fdw_outerpath;
+ List *fdw_restrictinfo;
List *fdw_private;
} ForeignPath;
@@ -1862,6 +1867,7 @@ typedef struct CustomPath
uint32 flags; /* mask of CUSTOMPATH_* flags, see
* nodes/extensible.h */
List *custom_paths; /* list of child Path nodes, if any */
+ List *custom_restrictinfo;
List *custom_private;
const struct CustomPathMethods *methods;
} CustomPath;
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 001e75b5b7..6e557bebc4 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -128,6 +128,7 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
PathTarget *target,
@@ -135,12 +136,14 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
PathTarget *target,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern Relids calc_nestloop_required_outer(Relids outerrelids,
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index 14cdce750c..e140e619ac 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -43,8 +43,6 @@ extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
List **otherquals);
-extern bool has_pseudoconstant_clauses(PlannerInfo *root,
- List *restrictinfo_list);
extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel);
extern bool join_clause_is_movable_into(RestrictInfo *rinfo,
Relids currentrelids,
On Fri, Jul 21, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
I think we should choose the latter, so I modified your patch as
mentioned, after re-creating it on top of my patch. Attached is a new
version (0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patch).
I am attaching my patch as well
(0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patch).Other changes made to your patch:
* I renamed the new member of the ForeignPath struct to
fdw_restrictinfo. (And I named that of the CustomPath struct
custom_restrictinfo.)
That's much better, and more consistent with other members in
ForeignPath/CustomPath. Thanks!
* In your patch, only for create_foreign_join_path(), the API was
modified so that the caller provides the new member of ForeignPath,
but I modified that for
create_foreignscan_path()/create_foreign_upper_path() as well, for
consistency.
LGTM.
* In this bit I changed the last argument to NIL, which would be
nitpicking, though.@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
add_path(baserel, (Path *) path);/* Add paths with pathkeys */ - add_paths_with_pathkeys_for_rel(root, baserel, NULL); + add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);
Good catch! This was my oversight.
* I dropped this test case, because it would not be stable if the
system clock was too slow.
Agreed. And the test case from 0001 should be sufficient.
So the two patches both look good to me now.
Thanks
Richard
Hi Richard,
On Mon, Jul 24, 2023 at 11:45 AM Richard Guo <guofenglinux@gmail.com> wrote:
On Fri, Jul 21, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
* In this bit I changed the last argument to NIL, which would be
nitpicking, though.@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
add_path(baserel, (Path *) path);/* Add paths with pathkeys */ - add_paths_with_pathkeys_for_rel(root, baserel, NULL); + add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);
This was my oversight.
No. IIUC, I think that that would work well as-proposed, but I
changed it as such, for readability.
So the two patches both look good to me now.
Cool! I pushed the first patch after polishing it a little bit, so
here is a rebased version of the second patch, in which I modified the
ForeignPath and CustomPath cases in reparameterize_path_by_child() to
reflect the new members fdw_restrictinfo and custom_restrictinfo, for
safety, and tweaked a comment a bit.
Thanks for looking!
Best regards,
Etsuro Fujita
Attachments:
0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v3.patchapplication/octet-stream; name=0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v3.patchDownload
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 9e330b9934..f14e6fe2e0 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -581,6 +581,7 @@ fileGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
baserel->lateral_relids,
NULL, /* no extra plan */
+ NIL,
coptions));
/*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f6d3b8ec08..77df7eb8e4 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2316,31 +2316,21 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
1
(10 rows)
--- join with pseudoconstant quals, not pushed down.
+-- join with pseudoconstant quals
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- QUERY PLAN
--------------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit
Output: t1.c1, t2.c1, t1.c3
- -> Sort
+ -> Result
Output: t1.c1, t2.c1, t1.c3
- Sort Key: t1.c3, t1.c1
- -> Result
- Output: t1.c1, t2.c1, t1.c3
- One-Time Filter: (CURRENT_USER = SESSION_USER)
- -> Hash Join
- Output: t1.c1, t1.c3, t2.c1
- Hash Cond: (t2.c1 = t1.c1)
- -> Foreign Scan on public.ft2 t2
- Output: t2.c1
- Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
- -> Hash
- Output: t1.c1, t1.c3
- -> Foreign Scan on public.ft1 t1
- Output: t1.c1, t1.c3
- Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1"
-(19 rows)
+ One-Time Filter: (CURRENT_USER = SESSION_USER)
+ -> Foreign Scan
+ Output: t1.c1, t1.c3, t2.c1
+ Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+ Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
+(9 rows)
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index c5cada55fb..2804a13313 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -524,7 +524,7 @@ static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
RelOptInfo *rel);
static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path);
+ Path *epq_path, List *restrictlist);
static void add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
@@ -1034,11 +1034,12 @@ postgresGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
baserel->lateral_relids,
NULL, /* no extra plan */
+ NIL,
NIL); /* no fdw_private list */
add_path(baserel, (Path *) path);
/* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NIL);
/*
* If we're not using remote estimates, stop here. We have no way to
@@ -1206,6 +1207,7 @@ postgresGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
param_info->ppi_req_outer,
NULL,
+ NIL,
NIL); /* no fdw_private list */
add_path(baserel, (Path *) path);
}
@@ -5991,7 +5993,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
static void
add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path)
+ Path *epq_path, List *restrictlist)
{
List *useful_pathkeys_list = NIL; /* List of all pathkeys */
ListCell *lc;
@@ -6085,6 +6087,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
useful_pathkeys,
rel->lateral_relids,
sorted_epq_path,
+ NIL,
NIL));
else
add_path(rel, (Path *)
@@ -6096,6 +6099,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
useful_pathkeys,
rel->lateral_relids,
sorted_epq_path,
+ restrictlist,
NIL));
}
}
@@ -6348,13 +6352,15 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
NIL, /* no pathkeys */
joinrel->lateral_relids,
epq_path,
+ extra->restrictlist,
NIL); /* no fdw_private */
/* Add generated path into joinrel by add_path(). */
add_path(joinrel, (Path *) joinpath);
/* Consider pathkeys for the join relation */
- add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
+ add_paths_with_pathkeys_for_rel(root, joinrel, epq_path,
+ extra->restrictlist);
/* XXX Consider parameterized paths for the join relation */
}
@@ -6735,6 +6741,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
NIL, /* no pathkeys */
NULL,
+ NIL,
NIL); /* no fdw_private */
/* Add generated path into grouped_rel by add_path(). */
@@ -6868,6 +6875,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
root->sort_pathkeys,
NULL, /* no extra plan */
+ NIL,
fdw_private);
/* and add it to the ordered_rel */
@@ -6983,7 +6991,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
path->total_cost,
path->pathkeys,
NULL, /* no extra plan */
- NULL); /* no fdw_private */
+ NIL,
+ NIL); /* no fdw_private */
/* and add it to the final_rel */
add_path(final_rel, (Path *) final_path);
@@ -7103,6 +7112,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
pathkeys,
NULL, /* no extra plan */
+ NIL,
fdw_private);
/* and add it to the final_rel */
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 436feee396..cfb1b57e33 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -640,7 +640,7 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
--- join with pseudoconstant quals, not pushed down.
+-- join with pseudoconstant quals
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 93d96f2f56..d30aaeb024 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -62,6 +62,7 @@ typedef struct CustomPath
Path path;
uint32 flags;
List *custom_paths;
+ List *custom_restrictinfo;
List *custom_private;
const CustomPathMethods *methods;
} CustomPath;
@@ -85,6 +86,11 @@ typedef struct CustomPath
An optional <structfield>custom_paths</structfield> is a list of <structname>Path</structname>
nodes used by this custom-path node; these will be transformed into
<structname>Plan</structname> nodes by planner.
+ As described below, custom paths can be created for join relations as
+ well. In such a case, <structfield>custom_restrictinfo</structfield>
+ should be used to store a list of <structname>RestrictInfo</structname>
+ nodes to apply to the join, which will be used by planner to convert the
+ custom path to a finished plan. Otherwise it should be NIL.
<structfield>custom_private</structfield> can be used to store the custom path's
private data. Private data should be stored in a form that can be handled
by <literal>nodeToString</literal>, so that debugging routines that attempt to
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 4b58936fa4..f047ad9ba4 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -24,7 +24,6 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
-#include "optimizer/restrictinfo.h"
#include "utils/typcache.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
@@ -131,7 +130,6 @@ add_paths_to_joinrel(PlannerInfo *root,
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
- bool consider_join_pushdown = false;
ListCell *lc;
Relids joinrelids;
@@ -323,25 +321,13 @@ add_paths_to_joinrel(PlannerInfo *root,
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
jointype, &extra);
- /*
- * createplan.c does not currently support handling of pseudoconstant
- * clauses assigned to joins pushed down by extensions; check if the
- * restrictlist has such clauses, and if so, disallow pushing down joins.
- */
- if ((joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths) ||
- set_join_pathlist_hook)
- consider_join_pushdown = !has_pseudoconstant_clauses(root,
- restrictlist);
-
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
* to the same server and assigned to the same user to check access
* permissions as, give the FDW a chance to push down joins.
*/
if (joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths &&
- consider_join_pushdown)
+ joinrel->fdwroutine->GetForeignJoinPaths)
joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
outerrel, innerrel,
jointype, &extra);
@@ -349,8 +335,7 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
* 6. Finally, give extensions a chance to manipulate the path list.
*/
- if (set_join_pathlist_hook &&
- consider_join_pushdown)
+ if (set_join_pathlist_hook)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
jointype, &extra);
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index af48109058..ad811a2cd5 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -599,8 +599,27 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
* Detect whether we have any pseudoconstant quals to deal with. Then, if
* we'll need a gating Result node, it will be able to project, so there
* are no requirements on the child's tlist.
+ *
+ * If this replaces a join, it must be a foreign scan or a custom scan,
+ * and the FDW or custom scan provider would have stored in the best path
+ * the list of RestrictInfo nodes to apply to the join; check against that
+ * list in that case.
*/
- gating_clauses = get_gating_quals(root, scan_clauses);
+ if (IS_JOIN_REL(rel))
+ {
+ List *join_clauses;
+
+ Assert(best_path->pathtype == T_ForeignScan ||
+ best_path->pathtype == T_CustomScan);
+ if (best_path->pathtype == T_ForeignScan)
+ join_clauses = ((ForeignPath *) best_path)->fdw_restrictinfo;
+ else
+ join_clauses = ((CustomPath *) best_path)->custom_restrictinfo;
+
+ gating_clauses = get_gating_quals(root, join_clauses);
+ }
+ else
+ gating_clauses = get_gating_quals(root, scan_clauses);
if (gating_clauses)
flags = 0;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f123fcb41e..da2ae771d5 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2229,6 +2229,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2250,6 +2251,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -2273,6 +2275,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2300,6 +2303,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -2322,6 +2326,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2345,6 +2350,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -4150,6 +4156,7 @@ do { \
FLAT_COPY_PATH(fpath, path, ForeignPath);
if (fpath->fdw_outerpath)
REPARAMETERIZE_CHILD_PATH(fpath->fdw_outerpath);
+ ADJUST_CHILD_ATTRS(fpath->fdw_restrictinfo);
/* Hand over to FDW if needed. */
rfpc_func =
@@ -4167,6 +4174,7 @@ do { \
FLAT_COPY_PATH(cpath, path, CustomPath);
REPARAMETERIZE_CHILD_PATH_LIST(cpath->custom_paths);
+ ADJUST_CHILD_ATTRS(cpath->custom_restrictinfo);
if (cpath->methods &&
cpath->methods->ReparameterizeCustomPathByChild)
cpath->custom_private =
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index c1fbbb6bfe..d6d26a2b51 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -549,36 +549,6 @@ extract_actual_join_clauses(List *restrictinfo_list,
}
}
-/*
- * has_pseudoconstant_clauses
- *
- * Returns true if 'restrictinfo_list' includes pseudoconstant clauses.
- *
- * This is used when we determine whether to allow extensions to consider
- * pushing down joins in add_paths_to_joinrel().
- */
-bool
-has_pseudoconstant_clauses(PlannerInfo *root,
- List *restrictinfo_list)
-{
- ListCell *l;
-
- /* No need to look if we know there are no pseudoconstants */
- if (!root->hasPseudoConstantQuals)
- return false;
-
- /* See if there are pseudoconstants in the RestrictInfo list */
- foreach(l, restrictinfo_list)
- {
- RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
-
- if (rinfo->pseudoconstant &&
- !rinfo_is_constant_true(rinfo))
- return true;
- }
- return false;
-}
-
/*
* join_clause_is_movable_to
* Test whether a join clause is a safe candidate for parameterization
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c17b53f7ad..50c1f4f01a 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1822,6 +1822,10 @@ typedef struct SubqueryScanPath
* ForeignPath represents a potential scan of a foreign table, foreign join
* or foreign upper-relation.
*
+ * In the case of foreign join, fdw_restrictinfo stores the RestrictInfos to
+ * apply to this join. It is needed because we need to extract gating clauses
+ * from it.
+ *
* fdw_private stores FDW private data about the scan. While fdw_private is
* not actually touched by the core code during normal operations, it's
* generally a good idea to use a representation that can be dumped by
@@ -1832,6 +1836,7 @@ typedef struct ForeignPath
{
Path path;
Path *fdw_outerpath;
+ List *fdw_restrictinfo;
List *fdw_private;
} ForeignPath;
@@ -1862,6 +1867,7 @@ typedef struct CustomPath
uint32 flags; /* mask of CUSTOMPATH_* flags, see
* nodes/extensible.h */
List *custom_paths; /* list of child Path nodes, if any */
+ List *custom_restrictinfo;
List *custom_private;
const struct CustomPathMethods *methods;
} CustomPath;
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 001e75b5b7..6e557bebc4 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -128,6 +128,7 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
PathTarget *target,
@@ -135,12 +136,14 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
PathTarget *target,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern Relids calc_nestloop_required_outer(Relids outerrelids,
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index 14cdce750c..e140e619ac 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -43,8 +43,6 @@ extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
List **otherquals);
-extern bool has_pseudoconstant_clauses(PlannerInfo *root,
- List *restrictinfo_list);
extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel);
extern bool join_clause_is_movable_into(RestrictInfo *rinfo,
Relids currentrelids,
On Fri, Jul 28, 2023 at 4:56 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
Cool! I pushed the first patch after polishing it a little bit, so
here is a rebased version of the second patch, in which I modified the
ForeignPath and CustomPath cases in reparameterize_path_by_child() to
reflect the new members fdw_restrictinfo and custom_restrictinfo, for
safety, and tweaked a comment a bit.
Hmm, it seems that ForeignPath for a foreign join does not support
parameterized paths for now, as in postgresGetForeignJoinPaths() we have
this check:
/*
* This code does not work for joins with lateral references, since those
* must have parameterized paths, which we don't generate yet.
*/
if (!bms_is_empty(joinrel->lateral_relids))
return;
And in create_foreign_join_path() we just set the path.param_info to
NULL.
pathnode->path.param_info = NULL; /* XXX see above */
So I doubt that it's necessary to adjust fdw_restrictinfo in
reparameterize_path_by_child, because it seems to me that
fdw_restrictinfo must be empty there. Maybe we can add an Assert there
as below:
- ADJUST_CHILD_ATTRS(fpath->fdw_restrictinfo);
+
+ /*
+ * Parameterized foreign joins are not supported. So this
+ * ForeignPath cannot be a foreign join and fdw_restrictinfo
+ * must be empty.
+ */
+ Assert(fpath->fdw_restrictinfo == NIL);
That being said, it's also no harm to handle fdw_restrictinfo in
reparameterize_path_by_child as the patch does. So I'm OK we do that
for safety.
Thanks
Richard
Hi Richard,
On Mon, Jul 31, 2023 at 5:52 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Fri, Jul 28, 2023 at 4:56 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
here is a rebased version of the second patch, in which I modified the
ForeignPath and CustomPath cases in reparameterize_path_by_child() to
reflect the new members fdw_restrictinfo and custom_restrictinfo, for
safety, and tweaked a comment a bit.
Hmm, it seems that ForeignPath for a foreign join does not support
parameterized paths for now, as in postgresGetForeignJoinPaths() we have
this check:/*
* This code does not work for joins with lateral references, since those
* must have parameterized paths, which we don't generate yet.
*/
if (!bms_is_empty(joinrel->lateral_relids))
return;And in create_foreign_join_path() we just set the path.param_info to
NULL.pathnode->path.param_info = NULL; /* XXX see above */
So I doubt that it's necessary to adjust fdw_restrictinfo in
reparameterize_path_by_child, because it seems to me that
fdw_restrictinfo must be empty there. Maybe we can add an Assert there
as below:- ADJUST_CHILD_ATTRS(fpath->fdw_restrictinfo); + + /* + * Parameterized foreign joins are not supported. So this + * ForeignPath cannot be a foreign join and fdw_restrictinfo + * must be empty. + */ + Assert(fpath->fdw_restrictinfo == NIL);That being said, it's also no harm to handle fdw_restrictinfo in
reparameterize_path_by_child as the patch does. So I'm OK we do that
for safety.
Ok, but maybe my explanation was not good, so let me explain. The
reason why I modified the code as such is to make the handling of
fdw_restrictinfo consistent with that of fdw_outerpath: we have the
code to reparameterize fdw_outerpath, which should be NULL though, as
we do not currently support parameterized foreign joins.
I modified the code a bit further to use an if-test to avoid a useless
function call, and added/tweaked comments and docs further. Attached
is a new version of the patch. I am planning to commit this, if there
are no objections.
Thanks!
Best regards,
Etsuro Fujita
Attachments:
0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v4.patchapplication/octet-stream; name=0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v4.patchDownload
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 9e330b9934..2189be8a3c 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -581,6 +581,7 @@ fileGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
baserel->lateral_relids,
NULL, /* no extra plan */
+ NIL, /* no fdw_restrictinfo list */
coptions));
/*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f6d3b8ec08..77df7eb8e4 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2316,31 +2316,21 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
1
(10 rows)
--- join with pseudoconstant quals, not pushed down.
+-- join with pseudoconstant quals
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
- QUERY PLAN
--------------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit
Output: t1.c1, t2.c1, t1.c3
- -> Sort
+ -> Result
Output: t1.c1, t2.c1, t1.c3
- Sort Key: t1.c3, t1.c1
- -> Result
- Output: t1.c1, t2.c1, t1.c3
- One-Time Filter: (CURRENT_USER = SESSION_USER)
- -> Hash Join
- Output: t1.c1, t1.c3, t2.c1
- Hash Cond: (t2.c1 = t1.c1)
- -> Foreign Scan on public.ft2 t2
- Output: t2.c1
- Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
- -> Hash
- Output: t1.c1, t1.c3
- -> Foreign Scan on public.ft1 t1
- Output: t1.c1, t1.c3
- Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1"
-(19 rows)
+ One-Time Filter: (CURRENT_USER = SESSION_USER)
+ -> Foreign Scan
+ Output: t1.c1, t1.c3, t2.c1
+ Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
+ Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
+(9 rows)
-- non-Var items in targetlist of the nullable rel of a join preventing
-- push-down in some cases
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index c5cada55fb..ef58b8b0ea 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -524,7 +524,7 @@ static List *get_useful_pathkeys_for_relation(PlannerInfo *root,
RelOptInfo *rel);
static List *get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel);
static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path);
+ Path *epq_path, List *restrictlist);
static void add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
@@ -1034,11 +1034,12 @@ postgresGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
baserel->lateral_relids,
NULL, /* no extra plan */
+ NIL, /* no fdw_restrictinfo list */
NIL); /* no fdw_private list */
add_path(baserel, (Path *) path);
/* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NIL);
/*
* If we're not using remote estimates, stop here. We have no way to
@@ -1206,6 +1207,7 @@ postgresGetForeignPaths(PlannerInfo *root,
NIL, /* no pathkeys */
param_info->ppi_req_outer,
NULL,
+ NIL, /* no fdw_restrictinfo list */
NIL); /* no fdw_private list */
add_path(baserel, (Path *) path);
}
@@ -5991,7 +5993,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
static void
add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
- Path *epq_path)
+ Path *epq_path, List *restrictlist)
{
List *useful_pathkeys_list = NIL; /* List of all pathkeys */
ListCell *lc;
@@ -6085,6 +6087,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
useful_pathkeys,
rel->lateral_relids,
sorted_epq_path,
+ NIL, /* no fdw_restrictinfo list */
NIL));
else
add_path(rel, (Path *)
@@ -6096,6 +6099,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
useful_pathkeys,
rel->lateral_relids,
sorted_epq_path,
+ restrictlist,
NIL));
}
}
@@ -6348,13 +6352,15 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
NIL, /* no pathkeys */
joinrel->lateral_relids,
epq_path,
+ extra->restrictlist,
NIL); /* no fdw_private */
/* Add generated path into joinrel by add_path(). */
add_path(joinrel, (Path *) joinpath);
/* Consider pathkeys for the join relation */
- add_paths_with_pathkeys_for_rel(root, joinrel, epq_path);
+ add_paths_with_pathkeys_for_rel(root, joinrel, epq_path,
+ extra->restrictlist);
/* XXX Consider parameterized paths for the join relation */
}
@@ -6735,6 +6741,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
NIL, /* no pathkeys */
NULL,
+ NIL, /* no fdw_restrictinfo list */
NIL); /* no fdw_private */
/* Add generated path into grouped_rel by add_path(). */
@@ -6868,6 +6875,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
root->sort_pathkeys,
NULL, /* no extra plan */
+ NIL, /* no fdw_restrictinfo list */
fdw_private);
/* and add it to the ordered_rel */
@@ -6983,7 +6991,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
path->total_cost,
path->pathkeys,
NULL, /* no extra plan */
- NULL); /* no fdw_private */
+ NIL, /* no fdw_restrictinfo list */
+ NIL); /* no fdw_private */
/* and add it to the final_rel */
add_path(final_rel, (Path *) final_path);
@@ -7103,6 +7112,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
total_cost,
pathkeys,
NULL, /* no extra plan */
+ NIL, /* no fdw_restrictinfo list */
fdw_private);
/* and add it to the final_rel */
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 436feee396..cfb1b57e33 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -640,7 +640,7 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
--- join with pseudoconstant quals, not pushed down.
+-- join with pseudoconstant quals
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1 AND CURRENT_USER = SESSION_USER) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index cd989e7d3c..6bf9647d95 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -62,6 +62,7 @@ typedef struct CustomPath
Path path;
uint32 flags;
List *custom_paths;
+ List *custom_restrictinfo;
List *custom_private;
const CustomPathMethods *methods;
} CustomPath;
@@ -85,6 +86,10 @@ typedef struct CustomPath
An optional <structfield>custom_paths</structfield> is a list of <structname>Path</structname>
nodes used by this custom-path node; these will be transformed into
<structname>Plan</structname> nodes by planner.
+ As described below, custom paths can be created for join relations as
+ well. In such a case, <structfield>custom_restrictinfo</structfield>
+ should be used to store the set of relevant join clauses for the join the
+ custom path replaces. Otherwise it should be NIL.
<structfield>custom_private</structfield> can be used to store the custom path's
private data. Private data should be stored in a form that can be handled
by <literal>nodeToString</literal>, so that debugging routines that attempt to
@@ -114,6 +119,17 @@ extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
responsibility of the hook to minimize duplicated work.
</para>
+ <para>
+ Note also that the set of relevant join clauses to apply to the join,
+ which is passed as <literal>extra->restrictlist</literal>, varies
+ depending on the combination of inner and outer relations. A
+ <structname>CustomPath</structname> path generated for the
+ <literal>joinrel</literal> must contain the set of relevant join clauses
+ it uses, which will be used by the planner to convert the
+ <structname>CustomPath</structname> path into a plan, if it is selected
+ by the planner as the best path for the <literal>joinrel</literal>.
+ </para>
+
<sect2 id="custom-scan-path-callbacks">
<title>Custom Scan Path Callbacks</title>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index ac1717bc3c..72e005029d 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -333,6 +333,17 @@ GetForeignJoinPaths(PlannerInfo *root,
the responsibility of the FDW to minimize duplicated work.
</para>
+ <para>
+ Note also that the set of relevant join clauses to apply to the join,
+ which is passed as <literal>extra->restrictlist</literal>, varies
+ depending on the combination of inner and outer relations. A
+ <structname>ForeignPath</structname> path generated for the
+ <literal>joinrel</literal> must contain the set of relevant join clauses
+ it uses, which will be used by the planner to convert the
+ <structname>ForeignPath</structname> path into a plan, if it is selected
+ by the planner as the best path for the <literal>joinrel</literal>.
+ </para>
+
<para>
If a <structname>ForeignPath</structname> path is chosen for the join, it will
represent the entire join process; paths generated for the component
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 4b58936fa4..f047ad9ba4 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -24,7 +24,6 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
-#include "optimizer/restrictinfo.h"
#include "utils/typcache.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
@@ -131,7 +130,6 @@ add_paths_to_joinrel(PlannerInfo *root,
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
- bool consider_join_pushdown = false;
ListCell *lc;
Relids joinrelids;
@@ -323,25 +321,13 @@ add_paths_to_joinrel(PlannerInfo *root,
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
jointype, &extra);
- /*
- * createplan.c does not currently support handling of pseudoconstant
- * clauses assigned to joins pushed down by extensions; check if the
- * restrictlist has such clauses, and if so, disallow pushing down joins.
- */
- if ((joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths) ||
- set_join_pathlist_hook)
- consider_join_pushdown = !has_pseudoconstant_clauses(root,
- restrictlist);
-
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
* to the same server and assigned to the same user to check access
* permissions as, give the FDW a chance to push down joins.
*/
if (joinrel->fdwroutine &&
- joinrel->fdwroutine->GetForeignJoinPaths &&
- consider_join_pushdown)
+ joinrel->fdwroutine->GetForeignJoinPaths)
joinrel->fdwroutine->GetForeignJoinPaths(root, joinrel,
outerrel, innerrel,
jointype, &extra);
@@ -349,8 +335,7 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
* 6. Finally, give extensions a chance to manipulate the path list.
*/
- if (set_join_pathlist_hook &&
- consider_join_pushdown)
+ if (set_join_pathlist_hook)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
jointype, &extra);
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index af48109058..ad811a2cd5 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -599,8 +599,27 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
* Detect whether we have any pseudoconstant quals to deal with. Then, if
* we'll need a gating Result node, it will be able to project, so there
* are no requirements on the child's tlist.
+ *
+ * If this replaces a join, it must be a foreign scan or a custom scan,
+ * and the FDW or custom scan provider would have stored in the best path
+ * the list of RestrictInfo nodes to apply to the join; check against that
+ * list in that case.
*/
- gating_clauses = get_gating_quals(root, scan_clauses);
+ if (IS_JOIN_REL(rel))
+ {
+ List *join_clauses;
+
+ Assert(best_path->pathtype == T_ForeignScan ||
+ best_path->pathtype == T_CustomScan);
+ if (best_path->pathtype == T_ForeignScan)
+ join_clauses = ((ForeignPath *) best_path)->fdw_restrictinfo;
+ else
+ join_clauses = ((CustomPath *) best_path)->custom_restrictinfo;
+
+ gating_clauses = get_gating_quals(root, join_clauses);
+ }
+ else
+ gating_clauses = get_gating_quals(root, scan_clauses);
if (gating_clauses)
flags = 0;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 754f0b9f34..211ba65389 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2229,6 +2229,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2250,6 +2251,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -2273,6 +2275,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2300,6 +2303,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -2322,6 +2326,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private)
{
ForeignPath *pathnode = makeNode(ForeignPath);
@@ -2345,6 +2350,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathkeys = pathkeys;
pathnode->fdw_outerpath = fdw_outerpath;
+ pathnode->fdw_restrictinfo = fdw_restrictinfo;
pathnode->fdw_private = fdw_private;
return pathnode;
@@ -4149,6 +4155,8 @@ do { \
FLAT_COPY_PATH(fpath, path, ForeignPath);
if (fpath->fdw_outerpath)
REPARAMETERIZE_CHILD_PATH(fpath->fdw_outerpath);
+ if (fpath->fdw_restrictinfo)
+ ADJUST_CHILD_ATTRS(fpath->fdw_restrictinfo);
/* Hand over to FDW if needed. */
rfpc_func =
@@ -4166,6 +4174,8 @@ do { \
FLAT_COPY_PATH(cpath, path, CustomPath);
REPARAMETERIZE_CHILD_PATH_LIST(cpath->custom_paths);
+ if (cpath->custom_restrictinfo)
+ ADJUST_CHILD_ATTRS(cpath->custom_restrictinfo);
if (cpath->methods &&
cpath->methods->ReparameterizeCustomPathByChild)
cpath->custom_private =
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index c1fbbb6bfe..d6d26a2b51 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -549,36 +549,6 @@ extract_actual_join_clauses(List *restrictinfo_list,
}
}
-/*
- * has_pseudoconstant_clauses
- *
- * Returns true if 'restrictinfo_list' includes pseudoconstant clauses.
- *
- * This is used when we determine whether to allow extensions to consider
- * pushing down joins in add_paths_to_joinrel().
- */
-bool
-has_pseudoconstant_clauses(PlannerInfo *root,
- List *restrictinfo_list)
-{
- ListCell *l;
-
- /* No need to look if we know there are no pseudoconstants */
- if (!root->hasPseudoConstantQuals)
- return false;
-
- /* See if there are pseudoconstants in the RestrictInfo list */
- foreach(l, restrictinfo_list)
- {
- RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
-
- if (rinfo->pseudoconstant &&
- !rinfo_is_constant_true(rinfo))
- return true;
- }
- return false;
-}
-
/*
* join_clause_is_movable_to
* Test whether a join clause is a safe candidate for parameterization
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a1dc1d07e1..e118aded2d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1822,6 +1822,10 @@ typedef struct SubqueryScanPath
* ForeignPath represents a potential scan of a foreign table, foreign join
* or foreign upper-relation.
*
+ * In the case of a foreign join, fdw_restrictinfo stores the RestrictInfos to
+ * apply to the join, which are used by createplan.c to get pseudoconstant
+ * clauses evaluated as one-time quals in a gating Result node.
+ *
* fdw_private stores FDW private data about the scan. While fdw_private is
* not actually touched by the core code during normal operations, it's
* generally a good idea to use a representation that can be dumped by
@@ -1832,6 +1836,7 @@ typedef struct ForeignPath
{
Path path;
Path *fdw_outerpath;
+ List *fdw_restrictinfo;
List *fdw_private;
} ForeignPath;
@@ -1849,6 +1854,10 @@ typedef struct ForeignPath
* relation by set_rel_pathlist_hook or set_join_pathlist_hook functions,
* respectively.
*
+ * In the case of a table join, custom_restrictinfo stores the RestrictInfos
+ * to apply to the join, which are used by createplan.c to get pseudoconstant
+ * clauses evaluated as one-time quals in a gating Result node.
+ *
* Core code must avoid assuming that the CustomPath is only as large as
* the structure declared here; providers are allowed to make it the first
* element in a larger structure. (Since the planner never copies Paths,
@@ -1865,6 +1874,7 @@ typedef struct CustomPath
uint32 flags; /* mask of CUSTOMPATH_* flags, see
* nodes/extensible.h */
List *custom_paths; /* list of child Path nodes, if any */
+ List *custom_restrictinfo;
List *custom_private;
const struct CustomPathMethods *methods;
} CustomPath;
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 001e75b5b7..6e557bebc4 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -128,6 +128,7 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
PathTarget *target,
@@ -135,12 +136,14 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
PathTarget *target,
double rows, Cost startup_cost, Cost total_cost,
List *pathkeys,
Path *fdw_outerpath,
+ List *fdw_restrictinfo,
List *fdw_private);
extern Relids calc_nestloop_required_outer(Relids outerrelids,
diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h
index 14cdce750c..e140e619ac 100644
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -43,8 +43,6 @@ extern void extract_actual_join_clauses(List *restrictinfo_list,
Relids joinrelids,
List **joinquals,
List **otherquals);
-extern bool has_pseudoconstant_clauses(PlannerInfo *root,
- List *restrictinfo_list);
extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel);
extern bool join_clause_is_movable_into(RestrictInfo *rinfo,
Relids currentrelids,
On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
I modified the code a bit further to use an if-test to avoid a useless
function call, and added/tweaked comments and docs further. Attached
is a new version of the patch. I am planning to commit this, if there
are no objections.
+1 to the v4 patch. It looks good to me.
Thanks
Richard
On Tue, Aug 8, 2023 at 6:30 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I modified the code a bit further to use an if-test to avoid a useless
function call, and added/tweaked comments and docs further. Attached
is a new version of the patch. I am planning to commit this, if there
are no objections.
+1 to the v4 patch. It looks good to me.
Pushed after some copy-and-paste editing of comments/documents.
Thanks!
Best regards,
Etsuro Fujita
Hi Etsuro, all
The commit[1]https://github.com/postgres/postgres/commit/b0e390e6d1d68b92e9983840941f8f6d9e083fe0 seems to break some queries in Citus[2]https://github.com/citusdata/citus/issues/7119, which is an
extension which relies on set_join_pathlist_hook.
Although the comment says */*Finally, give extensions a chance to
manipulate the path list.*/ *we use it to extract lots of information
about the joins and do the planning based on the information.
Now, for some joins where consider_join_pushdown=false, we cannot get the
information that we used to get, which prevents doing distributed planning
for certain queries.
We wonder if it is possible to allow extensions to access the join info
under all circumstances, as it used to be? Basically, removing the
additional check:
diff --git a/src/backend/optimizer/path/joinpath.c
b/src/backend/optimizer/path/joinpath.c
index 03b3185984..080e76cbe9 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -349,8 +349,7 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
* 6. Finally, give extensions a chance to manipulate the path list.
*/
- if (set_join_pathlist_hook &&
- consider_join_pushdown)
+ if (set_join_pathlist_hook)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
jointype,
&extra);
}
Thanks,
Onder
[1]: https://github.com/postgres/postgres/commit/b0e390e6d1d68b92e9983840941f8f6d9e083fe0
https://github.com/postgres/postgres/commit/b0e390e6d1d68b92e9983840941f8f6d9e083fe0
[2]: https://github.com/citusdata/citus/issues/7119
Etsuro Fujita <etsuro.fujita@gmail.com>, 15 Ağu 2023 Sal, 11:05 tarihinde
şunu yazdı:
Show quoted text
On Tue, Aug 8, 2023 at 6:30 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:
I modified the code a bit further to use an if-test to avoid a useless
function call, and added/tweaked comments and docs further. Attached
is a new version of the patch. I am planning to commit this, if there
are no objections.+1 to the v4 patch. It looks good to me.
Pushed after some copy-and-paste editing of comments/documents.
Thanks!
Best regards,
Etsuro Fujita
Hi,
On Tue, Aug 15, 2023 at 11:02 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
The commit[1] seems to break some queries in Citus[2], which is an extension which relies on set_join_pathlist_hook.
Although the comment says /*Finally, give extensions a chance to manipulate the path list.*/ we use it to extract lots of information about the joins and do the planning based on the information.
Now, for some joins where consider_join_pushdown=false, we cannot get the information that we used to get, which prevents doing distributed planning for certain queries.
We wonder if it is possible to allow extensions to access the join info under all circumstances, as it used to be? Basically, removing the additional check:
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index 03b3185984..080e76cbe9 100644 --- a/src/backend/optimizer/path/joinpath.c +++ b/src/backend/optimizer/path/joinpath.c @@ -349,8 +349,7 @@ add_paths_to_joinrel(PlannerInfo *root, /* * 6. Finally, give extensions a chance to manipulate the path list. */ - if (set_join_pathlist_hook && - consider_join_pushdown) + if (set_join_pathlist_hook) set_join_pathlist_hook(root, joinrel, outerrel, innerrel, jointype, &extra);
Maybe we could do so by leaving to extensions the decision whether
they replace joins with pseudoconstant clauses, but I am not sure that
that is a good idea, because that would require the authors to modify
and recompile their extensions to fix the issue... So I fixed the
core side.
I am not familiar with the Citus extension, but such pseudoconstant
clauses are handled within the Citus extension?
Thanks for the report!
Best regards,
Etsuro Fujita
Hi Etsuro,
Thanks for the response!
Maybe we could do so by leaving to extensions the decision whether
they replace joins with pseudoconstant clauses, but I am not sure that
that is a good idea, because that would require the authors to modify
and recompile their extensions to fix the issue...
I think I cannot easily follow this argument. The decision to push down the
join
(or not) doesn't seem to be related to calling set_join_pathlist_hook. It
seems like the
extension should decide what to do with the hook.
That seems the generic theme of the hooks that Postgres provides. For
example, the extension
is allowed to even override the whole planner/executor, and there is no
condition that would
prevent it from happening. In other words, an extension can easily return
wrong results with the
wrong actions taken with the hooks, and that should be responsibility of
the extension, not Postgres
I am not familiar with the Citus extension, but such pseudoconstant
clauses are handled within the Citus extension?
As I noted earlier, Citus relies on this hook for collecting information
about all the joins that Postgres
knows about, there is nothing specific to pseudoconstants. Some parts of
creating the (distributed)
plan relies on the information gathered from this hook. So, if information
about some of the joins
are not passed to the extension, then the decisions that the extension
gives are broken (and as a result
the queries are broken).
Thanks,
Onder
Hi Onder,
On Wed, Aug 16, 2023 at 10:58 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
Maybe we could do so by leaving to extensions the decision whether
they replace joins with pseudoconstant clauses, but I am not sure that
that is a good idea, because that would require the authors to modify
and recompile their extensions to fix the issue...
I think I cannot easily follow this argument. The decision to push down the join
(or not) doesn't seem to be related to calling set_join_pathlist_hook. It seems like the
extension should decide what to do with the hook.That seems the generic theme of the hooks that Postgres provides. For example, the extension
is allowed to even override the whole planner/executor, and there is no condition that would
prevent it from happening. In other words, an extension can easily return wrong results with the
wrong actions taken with the hooks, and that should be responsibility of the extension, not Postgres
I am not familiar with the Citus extension, but such pseudoconstant
clauses are handled within the Citus extension?
As I noted earlier, Citus relies on this hook for collecting information about all the joins that Postgres
knows about, there is nothing specific to pseudoconstants. Some parts of creating the (distributed)
plan relies on the information gathered from this hook. So, if information about some of the joins
are not passed to the extension, then the decisions that the extension gives are broken (and as a result
the queries are broken).
Thanks for the explanation!
Maybe my explanation was not enough, so let me explain:
* I think you could use the set_join_pathlist_hook hook as you like at
your own responsibility, but typical use cases of the hook that are
designed to support in the core system would be just add custom paths
for replacing joins with scans, as described in custom-scan.sgml (this
note is about set_rel_pathlist_hook, but it should also apply to
set_join_pathlist_hook):
Although this hook function can be used to examine, modify, or remove
paths generated by the core system, a custom scan provider will typically
confine itself to generating <structname>CustomPath</structname>
objects and adding
them to <literal>rel</literal> using <function>add_path</function>.
* The problem we had with the set_join_pathlist_hook hook is that in
such a typical use case, previously, if the replaced joins had any
pseudoconstant clauses, the planner would produce incorrect query
plans, due to the lack of support for handling such quals in
createplan.c. We could fix the extensions side, as you proposed, but
the cause of the issue is 100% the planner's deficiency, so it would
be unreasonable to force the authors to do so, which would also go
against our policy of ABI compatibility. So I fixed the core side, as
in the FDW case, so that extensions created for such a typical use
case, which I guess are the majority of the hook extensions, need not
be modified/recompiled. I think it is unfortunate that that breaks
the use case of the Citus extension, though.
BTW: commit 9e9931d2b removed the restriction on the call to the hook
extensions, so you might want to back-patch it. Though, I think it
would be better if the hook was well implemented from the beginning.
Best regards,
Etsuro Fujita
Hi,
On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
Maybe my explanation was not enough, so let me explain:
* I think you could use the set_join_pathlist_hook hook as you like at
your own responsibility, but typical use cases of the hook that are
designed to support in the core system would be just add custom paths
for replacing joins with scans, as described in custom-scan.sgml (this
note is about set_rel_pathlist_hook, but it should also apply to
set_join_pathlist_hook):Although this hook function can be used to examine, modify, or remove
paths generated by the core system, a custom scan provider will typically
confine itself to generating <structname>CustomPath</structname>
objects and adding
them to <literal>rel</literal> using <function>add_path</function>.
That supports citus' use more than not: "this hook function can be used to
examine ... paths generated by the core system".
* The problem we had with the set_join_pathlist_hook hook is that in
such a typical use case, previously, if the replaced joins had any
pseudoconstant clauses, the planner would produce incorrect query
plans, due to the lack of support for handling such quals in
createplan.c. We could fix the extensions side, as you proposed, but
the cause of the issue is 100% the planner's deficiency, so it would
be unreasonable to force the authors to do so, which would also go
against our policy of ABI compatibility. So I fixed the core side, as
in the FDW case, so that extensions created for such a typical use
case, which I guess are the majority of the hook extensions, need not
be modified/recompiled. I think it is unfortunate that that breaks
the use case of the Citus extension, though.
I'm not neutral - I don't work on citus, but work in the same Unit as
Onder. With that said: I don't think that's really a justification for
breaking a pre-existing, not absurd, use case in a minor release.
Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
Skimming the thread there wasn't really any analysis done about breaking
extensions etc - and that ought to be done before a substantial semantics
change in a somewhat commonly used hook. I'm inclined to think that that
might still be the right path.
BTW: commit 9e9931d2b removed the restriction on the call to the hook
extensions, so you might want to back-patch it.
Citus is an extension, not a fork, there's not really a way to just backpatch
a random commit.
Though, I think it would be better if the hook was well implemented from the
beginning.
Sure, but that's neither here nor there.
Greetings,
Andres Freund
Hi,
Show quoted text
On Sun, Aug 20, 2023 at 4:34 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
Maybe my explanation was not enough, so let me explain:
* I think you could use the set_join_pathlist_hook hook as you like at
your own responsibility, but typical use cases of the hook that are
designed to support in the core system would be just add custom paths
for replacing joins with scans, as described in custom-scan.sgml (this
note is about set_rel_pathlist_hook, but it should also apply to
set_join_pathlist_hook):Although this hook function can be used to examine, modify, or remove
paths generated by the core system, a custom scan provider will typically
confine itself to generating <structname>CustomPath</structname>
objects and adding
them to <literal>rel</literal> using <function>add_path</function>.That supports citus' use more than not: "this hook function can be used to
examine ... paths generated by the core system".* The problem we had with the set_join_pathlist_hook hook is that in
such a typical use case, previously, if the replaced joins had any
pseudoconstant clauses, the planner would produce incorrect query
plans, due to the lack of support for handling such quals in
createplan.c. We could fix the extensions side, as you proposed, but
the cause of the issue is 100% the planner's deficiency, so it would
be unreasonable to force the authors to do so, which would also go
against our policy of ABI compatibility. So I fixed the core side, as
in the FDW case, so that extensions created for such a typical use
case, which I guess are the majority of the hook extensions, need not
be modified/recompiled. I think it is unfortunate that that breaks
the use case of the Citus extension, though.I'm not neutral - I don't work on citus, but work in the same Unit as
Onder. With that said: I don't think that's really a justification for
breaking a pre-existing, not absurd, use case in a minor release.Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
Skimming the thread there wasn't really any analysis done about breaking
extensions etc - and that ought to be done before a substantial semantics
change in a somewhat commonly used hook. I'm inclined to think that that
might still be the right path.BTW: commit 9e9931d2b removed the restriction on the call to the hook
extensions, so you might want to back-patch it.Citus is an extension, not a fork, there's not really a way to just backpatch
a random commit.Though, I think it would be better if the hook was well implemented from the
beginning.Sure, but that's neither here nor there.
Greetings,
Andres Freund
Sorry, I hit the send button by mistake.
On Sun, Aug 20, 2023 at 4:34 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
* The problem we had with the set_join_pathlist_hook hook is that in
such a typical use case, previously, if the replaced joins had any
pseudoconstant clauses, the planner would produce incorrect query
plans, due to the lack of support for handling such quals in
createplan.c. We could fix the extensions side, as you proposed, but
the cause of the issue is 100% the planner's deficiency, so it would
be unreasonable to force the authors to do so, which would also go
against our policy of ABI compatibility. So I fixed the core side, as
in the FDW case, so that extensions created for such a typical use
case, which I guess are the majority of the hook extensions, need not
be modified/recompiled. I think it is unfortunate that that breaks
the use case of the Citus extension, though.I'm not neutral - I don't work on citus, but work in the same Unit as
Onder. With that said: I don't think that's really a justification for
breaking a pre-existing, not absurd, use case in a minor release.Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
Skimming the thread there wasn't really any analysis done about breaking
extensions etc - and that ought to be done before a substantial semantics
change in a somewhat commonly used hook. I'm inclined to think that that
might still be the right path.
I think you misread the thread; actually, we did an analysis and
applied a fix that would avoid ABI breakage (see the commit message
for 6f80a8d9c). It turned out that that breaks the Citus extension,
though.
Also, this is not such a change; it is just an optimization
disablement. Let me explain. This is the commit message for
e7cb7ee14, which added the hook we are discussing:
Allow FDWs and custom scan providers to replace joins with scans.
Foreign data wrappers can use this capability for so-called "join
pushdown"; that is, instead of executing two separate foreign scans
and then joining the results locally, they can generate a path which
performs the join on the remote server and then is scanned locally.
This commit does not extend postgres_fdw to take advantage of this
capability; it just provides the infrastructure.
Custom scan providers can use this in a similar way. Previously,
it was only possible for a custom scan provider to scan a single
relation. Now, it can scan an entire join tree, provided of course
that it knows how to produce the same results that the join would
have produced if executed normally.
As described in the commit message, we assume that extensions use the
hook in a similar way to FDWs; if they do so, the restriction added by
6f80a8d9c just diables them to add paths for join pushdown, making the
planner use paths involving local joins, so any breakage (other than
plan changes from custom joins to local joins) would never happen.
So my question is: does the Citus extension use the hook like this?
(Sorry, I do not fully understand Onder's explanation.)
BTW: commit 9e9931d2b removed the restriction on the call to the hook
extensions, so you might want to back-patch it.Citus is an extension, not a fork, there's not really a way to just backpatch
a random commit.
Yeah, I was thinking that that would be your concern.
Best regards,
Etsuro Fujita
Hi,
Thanks for the explanation.
As described in the commit message, we assume that extensions use the
hook in a similar way to FDWs
I'm not sure if it is fair to assume that extensions use any hook in any
way.
So my question is: does the Citus extension use the hook like this?
(Sorry, I do not fully understand Onder's explanation.)
I haven't gone into detail about how Citus uses this hook, but I don't
think we should
need to explain it. In general, Citus uses many hooks, and many other
extensions
use this specific hook. With minor version upgrades, we haven't seen this
kind of
behavior change before.
In general, Citus relies on this hook for collecting information about
joins across
relations/ctes/subqueries. So, its scope is bigger than a single join for
Citus.
The extension assigns a special marker(s) for RTE Relations, and then
checks whether
all relations with these special markers joined transitively across
subqueries, such that
it can decide to pushdown the whole or some parts of the (sub)query.
I must admit, I have not yet looked into whether we can fix the problem
within the extension.
Maybe we can, maybe not.
But the bigger issue is that there has usually been a clear line between
the extensions and
the PG itself when it comes to hooks within the minor version upgrades.
Sadly, this change
breaks that line. We wanted to share our worries here and find out what
others think.
Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
I cannot be the one to ask for reverting a commit in PG, but I think doing
it would be a
fair action. We kindly ask those who handle this to think about it.
Thanks,
Onder
Hi,
Thanks for the detailed explanation!
On Mon, Aug 21, 2023 at 10:34 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
As described in the commit message, we assume that extensions use the
hook in a similar way to FDWs
I'm not sure if it is fair to assume that extensions use any hook in any way.
I am not sure either, but as for the hook, I think it is an undeniable
fact that the core system assumes that extensions will use it in that
way.
So my question is: does the Citus extension use the hook like this?
(Sorry, I do not fully understand Onder's explanation.)
I haven't gone into detail about how Citus uses this hook, but I don't think we should
need to explain it. In general, Citus uses many hooks, and many other extensions
use this specific hook. With minor version upgrades, we haven't seen this kind of
behavior change before.In general, Citus relies on this hook for collecting information about joins across
relations/ctes/subqueries. So, its scope is bigger than a single join for Citus.The extension assigns a special marker(s) for RTE Relations, and then checks whether
all relations with these special markers joined transitively across subqueries, such that
it can decide to pushdown the whole or some parts of the (sub)query.
IIUC, I think that that is going beyond what the hook supports.
But the bigger issue is that there has usually been a clear line between the extensions and
the PG itself when it comes to hooks within the minor version upgrades. Sadly, this change
breaks that line. We wanted to share our worries here and find out what others think.
My understanding is: at least for hooks with intended usages, if an
extension uses them as intended, it is guaranteed that the extension
as-is will work correctly with minor version upgrades; otherwise it is
not necessarily. I think it is unfortunate that my commit broke the
Citus extension, though.
Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
I cannot be the one to ask for reverting a commit in PG, but I think doing it would be a
fair action. We kindly ask those who handle this to think about it.
Reverting the commit would resolve your issue, but re-introduce the
issue mentioned upthread to extensions that use the hook properly, so
I do not think that reverting the commit would be a fair action.
Sorry for the delay.
Best regards,
Etsuro Fujita