postgres_fdw: wrong results with self join + enable_nestloop off

Started by Nishant Sharmaabout 3 years ago34 messageshackers
Jump to latest
#1Nishant Sharma
nishant.sharma@enterprisedb.com

Hi,

We have observed that running the same self JOIN query on postgres FDW
setup is returning different results with set enable_nestloop off & on. I
am at today's latest commit:- 928e05ddfd4031c67e101c5e74dbb5c8ec4f9e23

I created a local FDW setup. And ran this experiment on the same. Kindly
refer to the P.S section for details.

|********************************************************************|
*Below is the output difference along with query plan:-*
postgres@71609=#set enable_nestloop=off;
SET
postgres@71609=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
1 | 10 | 1 | 10
2 | 20 | 1 | 10
3 | 30 | 1 | 10
1 | 10 | 2 | 20
2 | 20 | 2 | 20
3 | 30 | 2 | 20
1 | 10 | 3 | 30
2 | 20 | 3 | 30
3 | 30 | 3 | 30
(9 rows)

postgres@71609=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..49310.40 rows=2183680 width=16) (actual
time=0.514..0.515 rows=9 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
Relations: (public.pg_tbl_foreign tbl1) INNER JOIN
(public.pg_tbl_foreign tbl2)
Remote SQL: SELECT r1.id1, r1.id2, r2.id1, r2.id2 FROM (public.pg_tbl r1
INNER JOIN public.pg_tbl r2 ON (((r1.id1 < 5))))
Planning Time: 0.139 ms
Execution Time: 0.984 ms
(6 rows)

postgres@71609=#set enable_nestloop=on;
SET
postgres@71609=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)

postgres@71609=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------
Result (cost=200.00..27644.00 rows=2183680 width=16) (actual
time=0.003..0.004 rows=0 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
One-Time Filter: (now() < '2020-02-23 00:00:00'::timestamp without time
zone)
-> Nested Loop (cost=200.00..27644.00 rows=2183680 width=16) (never
executed)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
-> Foreign Scan on public.pg_tbl_foreign tbl2
(cost=100.00..186.80 rows=2560 width=8) (never executed)
Output: tbl2.id1, tbl2.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl
-> Materialize (cost=100.00..163.32 rows=853 width=8) (never
executed)
Output: tbl1.id1, tbl1.id2
-> Foreign Scan on public.pg_tbl_foreign tbl1
(cost=100.00..159.06 rows=853 width=8) (never executed)
Output: tbl1.id1, tbl1.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl WHERE
((id1 < 5))
Planning Time: 0.178 ms
Execution Time: 0.292 ms
(15 rows)

|********************************************************************|

I debugged this issue and was able to find a fix for the same. Kindly
please refer to the attached fix. With the fix I am able to resolve the
issue. But I am not that confident whether this change would affect some
other existing functionally but it has helped me resolve this result
difference in output.

*What is the technical issue?*
The problem here is the use of extract_actual_clauses. Because of which the
plan creation misses adding the second condition of AND i.e "now() <
'23-Feb-2020'::timestamp" in the plan. Because it is not considered a
pseudo constant and extract_actual_clause is passed with false as the
second parameter and it gets skipped from the list. As a result that
condition is never taken into consideration as either one-time filter
(before or after) or part of SQL remote execution

*Why do I think the fix is correct?*
The fix is simple, where we have created a new function similar to
extract_actual_clause which just extracts all the conditions from the list
with no checks and returns the list to the caller. As a result all
conditions would be taken into consideration in the query plan.

*After my fix patch:-*
postgres@78754=#set enable_nestloop=off;
SET
postgres@78754=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)
^
postgres@78754=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..49310.40 rows=2183680 width=16) (actual
time=0.652..0.652 rows=0 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
Filter: (now() < '2020-02-23 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 9
Relations: (public.pg_tbl_foreign tbl1) INNER JOIN
(public.pg_tbl_foreign tbl2)
Remote SQL: SELECT r1.id1, r1.id2, r2.id1, r2.id2 FROM (public.pg_tbl r1
INNER JOIN public.pg_tbl r2 ON (((r1.id1 < 5))))
Planning Time: 0.133 ms
Execution Time: 1.127 ms
(8 rows)

postgres@78754=#set enable_nestloop=on;
SET
postgres@78754=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2
on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)

postgres@78754=#explain (analyze, verbose) select * from pg_tbl_foreign
tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() <
'23-Feb-2020'::timestamp;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------
Result (cost=200.00..27644.00 rows=2183680 width=16) (actual
time=0.004..0.005 rows=0 loops=1)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
One-Time Filter: (now() < '2020-02-23 00:00:00'::timestamp without time
zone)
-> Nested Loop (cost=200.00..27644.00 rows=2183680 width=16) (never
executed)
Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
-> Foreign Scan on public.pg_tbl_foreign tbl2
(cost=100.00..186.80 rows=2560 width=8) (never executed)
Output: tbl2.id1, tbl2.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl
-> Materialize (cost=100.00..163.32 rows=853 width=8) (never
executed)
Output: tbl1.id1, tbl1.id2
-> Foreign Scan on public.pg_tbl_foreign tbl1
(cost=100.00..159.06 rows=853 width=8) (never executed)
Output: tbl1.id1, tbl1.id2
Remote SQL: SELECT id1, id2 FROM public.pg_tbl WHERE
((id1 < 5))
Planning Time: 0.134 ms
Execution Time: 0.347 ms
(15 rows)
|********************************************************************|

Kindly please comment if I am in the correct direction or not?

Regards,
Nishant Sharma.
Developer at EnterpriseDB, Pune, India.

P.S
Steps that I used to create local postgres FDW setup ( followed link -
https://www.postgresql.org/docs/current/postgres-fdw.html
<https://www.postgresql.org/docs/current/postgres-fdw.html):-&gt; )

1) ./configure --prefix=/home/edb/POSTGRES_INSTALL/MASTER
--with-pgport=9996 --with-openssl --with-libxml --with-zlib --with-tcl
--with-perl --with-libxslt --with-ossp-uuid --with-ldap --with-pam
--enable-nls --enable-debug --enable-depend --enable-dtrace --with-selinux
--with-icu --enable-tap-tests --enable-cassert CFLAGS="-g -O0"

2) make

3) make install

4) cd contrib/postgres_fdw/

5) make install

6) Start the server

7)
[edb@localhost MASTER]$ bin/psql postgres edb;
psql (16devel)
Type "help" for help.

postgres@70613=#create database remote_db;
CREATE DATABASE
postgres@70613=#quit

[edb@localhost MASTER]$ bin/psql remote_db edb;
psql (16devel)
Type "help" for help.

remote_db@70613=#CREATE USER fdw_user;
CREATE ROLE

remote_db@70613=#GRANT ALL ON SCHEMA public TO fdw_user;
GRANT
remote_db@70613=#quit

[edb@localhost MASTER]$ bin/psql remote_db fdw_user;
psql (16devel)
Type "help" for help.

remote_db@70613=#create table pg_tbl(id1 int, id2 int);
CREATE TABLE
remote_db@70613=#insert into pg_tbl values(1, 10);
INSERT 0 1
remote_db@70613=#insert into pg_tbl values(2, 20);
INSERT 0 1
remote_db@70613=#insert into pg_tbl values(3, 30);
INSERT 0 1

8)
New terminal/Tab:-
[edb@localhost MASTER]$ bin/psql postgres edb;
postgres@71609=#create extension postgres_fdw;
CREATE EXTENSION
postgres@71609=#CREATE SERVER localhost_fdw FOREIGN DATA WRAPPER
postgres_fdw OPTIONS (dbname 'remote_db', host 'localhost', port '9996');
CREATE SERVER
postgres@71609=#CREATE USER MAPPING for edb SERVER localhost_fdw OPTIONS
(user 'fdw_user', password '');
CREATE USER MAPPING
postgres@71609=#GRANT ALL ON FOREIGN SERVER localhost_fdw TO edb;
GRANT
postgres@71609=#CREATE FOREIGN TABLE pg_tbl_foreign(id1 int, id2 int)
SERVER localhost_fdw OPTIONS (schema_name 'public', table_name 'pg_tbl');
CREATE FOREIGN TABLE
postgres@71609=#select * from pg_tbl_foreign;
id1 | id2
-----+-----
1 | 10
2 | 20
3 | 30
(3 rows)

Attachments:

PG_PATCH_set_nestloop_off_issue_fix.patchapplication/octet-stream; name=PG_PATCH_set_nestloop_off_issue_fix.patchDownload+22-2
#2Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Nishant Sharma (#1)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi Nishant,

On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:

I debugged this issue and was able to find a fix for the same. Kindly please refer to the attached fix. With the fix I am able to resolve the issue.

Thanks for the report and patch!

What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of which the plan creation misses adding the second condition of AND i.e "now() < '23-Feb-2020'::timestamp" in the plan. Because it is not considered a pseudo constant and extract_actual_clause is passed with false as the second parameter and it gets skipped from the list. As a result that condition is never taken into consideration as either one-time filter (before or after) or part of SQL remote execution

Why do I think the fix is correct?
The fix is simple, where we have created a new function similar to extract_actual_clause which just extracts all the conditions from the list with no checks and returns the list to the caller. As a result all conditions would be taken into consideration in the query plan.

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan. Anyway, I will look at your patch
closely, first.

Best regards,
Etsuro Fujita

#3Nishant Sharma
nishant.sharma@enterprisedb.com
In reply to: Etsuro Fujita (#2)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Thanks Etsuro for your response!

One small typo correction in my answer to "What is the technical issue?"
"it is *not* considered a pseudo constant" --> "it is considered a pseudo
constant"

Regards,
Nishant.

On Fri, Apr 14, 2023 at 6:21 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

Show quoted text

Hi Nishant,

On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:

I debugged this issue and was able to find a fix for the same. Kindly

please refer to the attached fix. With the fix I am able to resolve the
issue.

Thanks for the report and patch!

What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of which

the plan creation misses adding the second condition of AND i.e "now() <
'23-Feb-2020'::timestamp" in the plan. Because it is not considered a
pseudo constant and extract_actual_clause is passed with false as the
second parameter and it gets skipped from the list. As a result that
condition is never taken into consideration as either one-time filter
(before or after) or part of SQL remote execution

Why do I think the fix is correct?
The fix is simple, where we have created a new function similar to

extract_actual_clause which just extracts all the conditions from the list
with no checks and returns the list to the caller. As a result all
conditions would be taken into consideration in the query plan.

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan. Anyway, I will look at your patch
closely, first.

Best regards,
Etsuro Fujita

#4Nishant Sharma
nishant.sharma@enterprisedb.com
In reply to: Nishant Sharma (#3)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi Etsuro Fujita,

Any updates? -- did you get a chance to look into this?

Regards,
Nishant.

On Mon, Apr 17, 2023 at 11:00 AM Nishant Sharma <
nishant.sharma@enterprisedb.com> wrote:

Show quoted text

Thanks Etsuro for your response!

One small typo correction in my answer to "What is the technical issue?"
"it is *not* considered a pseudo constant" --> "it is considered a pseudo
constant"

Regards,
Nishant.

On Fri, Apr 14, 2023 at 6:21 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

Hi Nishant,

On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:

I debugged this issue and was able to find a fix for the same. Kindly

please refer to the attached fix. With the fix I am able to resolve the
issue.

Thanks for the report and patch!

What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of which

the plan creation misses adding the second condition of AND i.e "now() <
'23-Feb-2020'::timestamp" in the plan. Because it is not considered a
pseudo constant and extract_actual_clause is passed with false as the
second parameter and it gets skipped from the list. As a result that
condition is never taken into consideration as either one-time filter
(before or after) or part of SQL remote execution

Why do I think the fix is correct?
The fix is simple, where we have created a new function similar to

extract_actual_clause which just extracts all the conditions from the list
with no checks and returns the list to the caller. As a result all
conditions would be taken into consideration in the query plan.

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan. Anyway, I will look at your patch
closely, first.

Best regards,
Etsuro Fujita

#5Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Nishant Sharma (#4)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Mon, Apr 24, 2023 at 3:31 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:

Any updates? -- did you get a chance to look into this?

Sorry, I have not looked into this yet, because I have been busy with
some other work recently. I plan to do so early next week.

Best regards,
Etsuro Fujita

#6Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#2)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.

Yes exactly. In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node. Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses. But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch. IIUC it treats the pseudoconstant clauses
as local conditions. While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node. So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.

BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *. I changed it to NIL.

Thanks
Richard

Attachments:

v1-0001-Fix-how-we-handle-pseudoconstant-clauses-for-scans-of-foreign-joins.patchapplication/octet-stream; name=v1-0001-Fix-how-we-handle-pseudoconstant-clauses-for-scans-of-foreign-joins.patchDownload+53-7
#7Suraj Kharage
suraj.kharage@enterprisedb.com
In reply to: Richard Guo (#6)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

+1 for fixing this in the backend code rather than FDW code.

Thanks, Richard, for working on this. The patch looks good to me at
a glance.

On Tue, Apr 25, 2023 at 3:36 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.

Yes exactly. In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node. Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses. But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch. IIUC it treats the pseudoconstant clauses
as local conditions. While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node. So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.

BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *. I changed it to NIL.

Thanks
Richard

--
--

Thanks & Regards,
Suraj kharage,

edbpostgres.com

#8Nishant Sharma
nishant.sharma@enterprisedb.com
In reply to: Suraj Kharage (#7)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

I also agree that Richard's patch is better. As it fixes the issue at the
backend and does not treat pseudoconstant as local condition.

I have tested Richard's patch and observe that it is resolving the problem.
Patch looks good to me as well.

*I only had a minor comment on below change:-*

*- gating_clauses = get_gating_quals(root, scan_clauses);+ if
(best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))+
gating_clauses = get_gating_quals(root, ((ForeignPath *)
best_path)->joinrestrictinfo);+ else+ gating_clauses =
get_gating_quals(root, scan_clauses);*

Instead of using 'if' and creating a special case here can't we do

something in the above switch?

Regards,
Nishant.

P.S
I tried something quickly but I am seeing a crash:-

* case T_IndexOnlyScan: scan_clauses
= castNode(IndexPath, best_path)->indexinfo->indrestrictinfo;
break;+ case T_ForeignScan:+
/*+ * Note that for scans of foreign joins, we do
not have restriction clauses+ * stored in
baserestrictinfo and we do not consider parameterization.+
* Instead we need to check against joinrestrictinfo stored in
ForeignPath.+ */+ if
(IS_JOIN_REL(rel))+ scan_clauses =
((ForeignPath *) best_path)->joinrestrictinfo;+ else+
scan_clauses = rel->baserestrictinfo;+
break; default:
scan_clauses = rel->baserestrictinfo; break;*

On Fri, Jun 2, 2023 at 9:00 AM Suraj Kharage <suraj.kharage@enterprisedb.com>
wrote:

Show quoted text

+1 for fixing this in the backend code rather than FDW code.

Thanks, Richard, for working on this. The patch looks good to me at
a glance.

On Tue, Apr 25, 2023 at 3:36 PM Richard Guo <guofenglinux@gmail.com>
wrote:

On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.

Yes exactly. In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node. Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses. But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch. IIUC it treats the pseudoconstant clauses
as local conditions. While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node. So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.

BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *. I changed it to NIL.

Thanks
Richard

--
--

Thanks & Regards,
Suraj kharage,

edbpostgres.com

#9Richard Guo
guofenglinux@gmail.com
In reply to: Suraj Kharage (#7)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Fri, Jun 2, 2023 at 11:30 AM Suraj Kharage <
suraj.kharage@enterprisedb.com> wrote:

+1 for fixing this in the backend code rather than FDW code.

Thanks, Richard, for working on this. The patch looks good to me at
a glance.

Thank you Suraj for the review!

Thanks
Richard

#10Richard Guo
guofenglinux@gmail.com
In reply to: Nishant Sharma (#8)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Fri, Jun 2, 2023 at 8:31 PM Nishant Sharma <
nishant.sharma@enterprisedb.com> wrote:

*I only had a minor comment on below change:-*

*- gating_clauses = get_gating_quals(root, scan_clauses);+ if
(best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))+
gating_clauses = get_gating_quals(root, ((ForeignPath *)
best_path)->joinrestrictinfo);+ else+ gating_clauses =
get_gating_quals(root, scan_clauses);*

Instead of using 'if' and creating a special case here can't we do
something in the above switch?

I thought about that too. IIRC I did not do it in that way because
postgresGetForeignPlan expects that there is no scan_clauses for a join
rel. So doing that would trigger the Assert there.

/*
* For a join rel, baserestrictinfo is NIL and we are not considering
* parameterization right now, so there should be no scan_clauses for
* a joinrel or an upper rel either.
*/
Assert(!scan_clauses);

Thanks
Richard

#11Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Nishant Sharma (#8)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi,

On Fri, Jun 2, 2023 at 9:31 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:

I also agree that Richard's patch is better. As it fixes the issue at the backend and does not treat pseudoconstant as local condition.

I have tested Richard's patch and observe that it is resolving the problem. Patch looks good to me as well.

If the patch is intended for HEAD only, I also think it goes in the
right direction. But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

My apologies for not reviewing your patch and the long long delay.

Best regards,
Etsuro Fujita

Attachments:

disallow-join-pushdown-if-pseudoconstants.patchapplication/octet-stream; name=disallow-join-pushdown-if-pseudoconstants.patchDownload+69-1
#12Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#11)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

If the patch is intended for HEAD only, I also think it goes in the
right direction. But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)

Thanks for pointing this out. You're right. The patch has backport
issue because of the ABI breakage. So it can only be applied on HEAD.

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think we can do that in back branches. But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases. I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.

Thanks
Richard

#13Nishant Sharma
nishant.sharma@enterprisedb.com
In reply to: Richard Guo (#12)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi,

Etsuro's patch is also showing the correct output for "set
enable_nestloop=off". Looks good to me for back branches due to backport
issues.

But below are a few observations for the same:-
1) I looked into the query plan for both "set enable_nestloop" on & off
case and observe that they are the same. That is, what we see with "set
enable_nestloop=on".
2) In back branches for "set enable_nestloop" on & off value, at least this
type of query execution won't make any difference. No comparison of plans
to be selected based on total cost of two plans old (Nested Loop with
Foreign Scans) & new (Only Foreign Scan) will be done, because we are
avoiding the call to "postgresGetForeignJoinPaths()" up front when we have
pseudo constants.

Regards,
Nishant.

On Tue, Jun 6, 2023 at 8:50 AM Richard Guo <guofenglinux@gmail.com> wrote:

Show quoted text

On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

If the patch is intended for HEAD only, I also think it goes in the
right direction. But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)

Thanks for pointing this out. You're right. The patch has backport
issue because of the ABI breakage. So it can only be applied on HEAD.

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think we can do that in back branches. But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases. I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.

Thanks
Richard

#14Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#12)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi Richard,

On Tue, Jun 6, 2023 at 12:20 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think we can do that in back branches. But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases. I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.

Yeah, it is unfortunate that we would not get better plans. Given
that it took quite a long time to find this issue, I suppose that
users seldom do foreign joins with pseudoconstant clauses, though.

Anyway thanks for working on this, Richard!

Best regards,
Etsuro Fujita

#15Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Nishant Sharma (#13)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi,

On Wed, Jun 7, 2023 at 7:28 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:

Etsuro's patch is also showing the correct output for "set enable_nestloop=off". Looks good to me for back branches due to backport issues.

But below are a few observations for the same:-
1) I looked into the query plan for both "set enable_nestloop" on & off case and observe that they are the same. That is, what we see with "set enable_nestloop=on".
2) In back branches for "set enable_nestloop" on & off value, at least this type of query execution won't make any difference. No comparison of plans to be selected based on total cost of two plans old (Nested Loop with Foreign Scans) & new (Only Foreign Scan) will be done, because we are avoiding the call to "postgresGetForeignJoinPaths()" up front when we have pseudo constants.

Thanks for looking!

Best regards,
Etsuro Fujita

#16Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#11)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
new version of the patch.

Best regards,
Etsuro Fujita

Attachments:

disallow-join-pushdown-if-pseudoconstants-v2.patchapplication/octet-stream; name=disallow-join-pushdown-if-pseudoconstants-v2.patchDownload+79-2
#17Nishant Sharma
nishant.sharma@enterprisedb.com
In reply to: Etsuro Fujita (#16)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Looks good to me. Tested on master and it works.
New patch used a bool flag to avoid calls for both FDW and custom hook's
call. And a slight change in comment of "has_pseudoconstant_clauses"
function.

Regards,
Nishant.

On Wed, Jun 14, 2023 at 12:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

Show quoted text

On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
new version of the patch.

Best regards,
Etsuro Fujita

#18Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#16)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Wed, Jun 14, 2023 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a
new version of the patch.

Good point. The v2 patch looks good to me for back branches.

I'm wondering what the plan is for HEAD. Should we also disallow
foreign/custom join pushdown in the case that there is any
pseudoconstant restriction clause, or instead still allow join pushdown
in that case? If it is the latter, I think we can do something like my
patch upthread does. But that patch needs to be revised to consider
custom scans, maybe by storing the restriction clauses also in
CustomPath?

Thanks
Richard

#19Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#18)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

Hi Richard,

On Sun, Jun 25, 2023 at 3:05 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Wed, Jun 14, 2023 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses. Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary. Attached is a

Good point. The v2 patch looks good to me for back branches.

Cool! Thanks for looking!

I'm wondering what the plan is for HEAD. Should we also disallow
foreign/custom join pushdown in the case that there is any
pseudoconstant restriction clause, or instead still allow join pushdown
in that case? If it is the latter, I think we can do something like my
patch upthread does. But that patch needs to be revised to consider
custom scans, maybe by storing the restriction clauses also in
CustomPath?

I think we should choose the latter, so I modified your patch as
mentioned, after re-creating it on top of my patch. Attached is a new
version (0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patch).
I am attaching my patch as well
(0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patch).

Other changes made to your patch:

* I renamed the new member of the ForeignPath struct to
fdw_restrictinfo. (And I named that of the CustomPath struct
custom_restrictinfo.)

* In your patch, only for create_foreign_join_path(), the API was
modified so that the caller provides the new member of ForeignPath,
but I modified that for
create_foreignscan_path()/create_foreign_upper_path() as well, for
consistency.

* In this bit I changed the last argument to NIL, which would be
nitpicking, though.

@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
add_path(baserel, (Path *) path);

  /* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);

* I dropped this test case, because it would not be stable if the
system clock was too slow.

+-- bug due to sloppy handling of pseudoconstant clauses for foreign joins
+EXPLAIN (VERBOSE, COSTS OFF)
+  SELECT * FROM ft2 a, ft2 b
+  WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
+SELECT * FROM ft2 a, ft2 b
+WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;

That is it.

Sorry for the long long delay.

Best regards,
Etsuro Fujita

Attachments:

0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patchapplication/octet-stream; name=0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patchDownload+79-2
0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patchapplication/octet-stream; name=0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patchDownload+69-74
#20Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#19)
Re: postgres_fdw: wrong results with self join + enable_nestloop off

On Fri, Jul 21, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

I think we should choose the latter, so I modified your patch as
mentioned, after re-creating it on top of my patch. Attached is a new
version (0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patch).
I am attaching my patch as well
(0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patch).

Other changes made to your patch:

* I renamed the new member of the ForeignPath struct to
fdw_restrictinfo. (And I named that of the CustomPath struct
custom_restrictinfo.)

That's much better, and more consistent with other members in
ForeignPath/CustomPath. Thanks!

* In your patch, only for create_foreign_join_path(), the API was
modified so that the caller provides the new member of ForeignPath,
but I modified that for
create_foreignscan_path()/create_foreign_upper_path() as well, for
consistency.

LGTM.

* In this bit I changed the last argument to NIL, which would be
nitpicking, though.

@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
add_path(baserel, (Path *) path);

/* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);

Good catch! This was my oversight.

* I dropped this test case, because it would not be stable if the
system clock was too slow.

Agreed. And the test case from 0001 should be sufficient.

So the two patches both look good to me now.

Thanks
Richard

#21Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#20)
#22Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#21)
#23Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#22)
#24Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#23)
#25Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#24)
#26Önder Kalacı
onderkalaci@gmail.com
In reply to: Etsuro Fujita (#25)
#27Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Önder Kalacı (#26)
#28Önder Kalacı
onderkalaci@gmail.com
In reply to: Etsuro Fujita (#27)
#29Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Önder Kalacı (#28)
#30Andres Freund
andres@anarazel.de
In reply to: Etsuro Fujita (#29)
#31Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Andres Freund (#30)
#32Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Andres Freund (#30)
#33Önder Kalacı
onderkalaci@gmail.com
In reply to: Etsuro Fujita (#32)
#34Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Önder Kalacı (#33)