Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Started by Tender Wangover 2 years ago29 messageshackers
Jump to latest
#1Tender Wang
tndrwang@gmail.com

Hi all,

I recently run benchmark[1]https://github.com/tenderwg/tpch_test on master, but I found performance problem
as below:

explain analyze select
subq_0.c0 as c0,
subq_0.c1 as c1,
subq_0.c2 as c2
from
(select
ref_0.l_shipmode as c0,
sample_0.l_orderkey as c1,
sample_0.l_quantity as c2,
ref_0.l_orderkey as c3,
sample_0.l_shipmode as c5,
ref_0.l_shipinstruct as c6
from
public.lineitem as ref_0
left join public.lineitem as sample_0
on ((select p_partkey from public.part order by p_partkey limit 1)
is not NULL)
where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=78.00..45267050.75 rows=1 width=27) (actual
time=299695.097..299695.099 rows=0 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=78.00..78.00 rows=1 width=8) (actual
time=0.651..0.652 rows=1 loops=1)
-> Sort (cost=78.00..83.00 rows=2000 width=8) (actual
time=0.650..0.651 rows=1 loops=1)
Sort Key: part.p_partkey
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8)
(actual time=0.013..0.428 rows=2000 loops=1)
-> Nested Loop Left Join (cost=0.00..45266972.75 rows=1 width=27)
(actual time=299695.096..299695.096 rows=0 loops=1)
Join Filter: ($0 IS NOT NULL)
Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS
NULL))
Rows Removed by Filter: 3621030625
-> Seq Scan on lineitem ref_0 (cost=0.00..1969.75 rows=60175
width=11) (actual time=0.026..6.225 rows=60175 loops=1)
-> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual
time=0.000..2.554 rows=60175 loops=60175)
-> Seq Scan on lineitem sample_0 (cost=0.00..1969.75
rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
Planning Time: 0.172 ms
Execution Time: 299695.501 ms
(16 rows)

After I set enable_material to off, the same query run faster, as below:
set enable_material = off;
explain analyze select
subq_0.c0 as c0,
subq_0.c1 as c1,
subq_0.c2 as c2
from
(select
ref_0.l_shipmode as c0,
sample_0.l_orderkey as c1,
sample_0.l_quantity as c2,
ref_0.l_orderkey as c3,
sample_0.l_shipmode as c5,
ref_0.l_shipinstruct as c6
from
public.lineitem as ref_0
left join public.lineitem as sample_0
on ((select p_partkey from public.part order by p_partkey limit 1)
is not NULL)
where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1078.00..91026185.57 rows=1 width=27) (actual
time=192669.605..192670.425 rows=0 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=78.00..78.00 rows=1 width=8) (actual
time=0.662..0.663 rows=1 loops=1)
-> Sort (cost=78.00..83.00 rows=2000 width=8) (actual
time=0.661..0.662 rows=1 loops=1)
Sort Key: part.p_partkey
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8)
(actual time=0.017..0.430 rows=2000 loops=1)
-> Gather (cost=1000.00..91026107.57 rows=1 width=27) (actual
time=192669.604..192670.422 rows=0 loops=1)
Workers Planned: 1
Params Evaluated: $0
Workers Launched: 1
-> Nested Loop Left Join (cost=0.00..91025107.47 rows=1
width=27) (actual time=192588.143..192588.144 rows=0 loops=2)
Join Filter: ($0 IS NOT NULL)
Filter: ((sample_0.l_orderkey IS NULL) AND
(sample_0.l_shipmode IS NULL))
Rows Removed by Filter: 1810515312
-> Parallel Seq Scan on lineitem ref_0 (cost=0.00..1721.97
rows=35397 width=11) (actual time=0.007..3.797 rows=30088 loops=2)
-> Seq Scan on lineitem sample_0 (cost=0.00..1969.75
rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175)
Planning Time: 0.174 ms
Execution Time: 192670.458 ms
(19 rows)

I debug the code and find consider_parallel_nestloop() doesn't consider
materialized form of the cheapest inner path.
When enable_material = true, we can see Material path won in first plan,
but Parallel Seq Scan node doesn't add as outer path, which because
in try_partial_nestloop_path() , the cost of nestloop wat computed using
seq scan path not material path.

[1]: https://github.com/tenderwg/tpch_test

I try fix this problem in attached patch, and I found pg12.12 also had this
issue. Please review my patch, thanks!

[1]: https://github.com/tenderwg/tpch_test

Attachments:

0001-Parallel-seq-scan-should-consider-materila-inner-pat.patchtext/plain; charset=US-ASCII; name=0001-Parallel-seq-scan-should-consider-materila-inner-pat.patchDownload+53-1
#2Tender Wang
tndrwang@gmail.com
In reply to: Tender Wang (#1)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

After using patch, the result as below :
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1078.00..26630101.20 rows=1 width=27) (actual
time=160571.005..160571.105 rows=0 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=78.00..78.00 rows=1 width=8) (actual
time=1.065..1.066 rows=1 loops=1)
-> Sort (cost=78.00..83.00 rows=2000 width=8) (actual
time=1.064..1.065 rows=1 loops=1)
Sort Key: part.p_partkey
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8)
(actual time=0.046..0.830 rows=2000 loops=1)
-> Gather (cost=1000.00..26630023.20 rows=1 width=27) (actual
time=160571.003..160571.102 rows=0 loops=1)
Workers Planned: 1
Params Evaluated: $0
Workers Launched: 1
-> Nested Loop Left Join (cost=0.00..26629023.10 rows=1
width=27) (actual time=160549.257..160549.258 rows=0 loops=2)
Join Filter: ($0 IS NOT NULL)
Filter: ((sample_0.l_orderkey IS NULL) AND
(sample_0.l_shipmode IS NULL))
Rows Removed by Filter: 1810515312
-> Parallel Seq Scan on lineitem ref_0 (cost=0.00..1721.97
rows=35397 width=11) (actual time=0.010..3.393 rows=30088 loops=2)
-> Materialize (cost=0.00..2270.62 rows=60175 width=27)
(actual time=0.000..2.839 rows=60175 loops=60175)
-> Seq Scan on lineitem sample_0 (cost=0.00..1969.75
rows=60175 width=27) (actual time=0.008..11.381 rows=60175 loops=2)
Planning Time: 0.174 ms
Execution Time: 160571.476 ms
(20 rows)

tender wang <tndrwang@gmail.com> 于2023年9月5日周二 16:52写道:

Show quoted text

Hi all,

I recently run benchmark[1] on master, but I found performance problem
as below:

explain analyze select
subq_0.c0 as c0,
subq_0.c1 as c1,
subq_0.c2 as c2
from
(select
ref_0.l_shipmode as c0,
sample_0.l_orderkey as c1,
sample_0.l_quantity as c2,
ref_0.l_orderkey as c3,
sample_0.l_shipmode as c5,
ref_0.l_shipinstruct as c6
from
public.lineitem as ref_0
left join public.lineitem as sample_0
on ((select p_partkey from public.part order by p_partkey limit
1)
is not NULL)
where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=78.00..45267050.75 rows=1 width=27) (actual
time=299695.097..299695.099 rows=0 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=78.00..78.00 rows=1 width=8) (actual
time=0.651..0.652 rows=1 loops=1)
-> Sort (cost=78.00..83.00 rows=2000 width=8) (actual
time=0.650..0.651 rows=1 loops=1)
Sort Key: part.p_partkey
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on part (cost=0.00..68.00 rows=2000
width=8) (actual time=0.013..0.428 rows=2000 loops=1)
-> Nested Loop Left Join (cost=0.00..45266972.75 rows=1 width=27)
(actual time=299695.096..299695.096 rows=0 loops=1)
Join Filter: ($0 IS NOT NULL)
Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode
IS NULL))
Rows Removed by Filter: 3621030625
-> Seq Scan on lineitem ref_0 (cost=0.00..1969.75 rows=60175
width=11) (actual time=0.026..6.225 rows=60175 loops=1)
-> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual
time=0.000..2.554 rows=60175 loops=60175)
-> Seq Scan on lineitem sample_0 (cost=0.00..1969.75
rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
Planning Time: 0.172 ms
Execution Time: 299695.501 ms
(16 rows)

After I set enable_material to off, the same query run faster, as below:
set enable_material = off;
explain analyze select
subq_0.c0 as c0,
subq_0.c1 as c1,
subq_0.c2 as c2
from
(select
ref_0.l_shipmode as c0,
sample_0.l_orderkey as c1,
sample_0.l_quantity as c2,
ref_0.l_orderkey as c3,
sample_0.l_shipmode as c5,
ref_0.l_shipinstruct as c6
from
public.lineitem as ref_0
left join public.lineitem as sample_0
on ((select p_partkey from public.part order by p_partkey limit
1)
is not NULL)
where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;
QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1078.00..91026185.57 rows=1 width=27) (actual
time=192669.605..192670.425 rows=0 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=78.00..78.00 rows=1 width=8) (actual
time=0.662..0.663 rows=1 loops=1)
-> Sort (cost=78.00..83.00 rows=2000 width=8) (actual
time=0.661..0.662 rows=1 loops=1)
Sort Key: part.p_partkey
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on part (cost=0.00..68.00 rows=2000
width=8) (actual time=0.017..0.430 rows=2000 loops=1)
-> Gather (cost=1000.00..91026107.57 rows=1 width=27) (actual
time=192669.604..192670.422 rows=0 loops=1)
Workers Planned: 1
Params Evaluated: $0
Workers Launched: 1
-> Nested Loop Left Join (cost=0.00..91025107.47 rows=1
width=27) (actual time=192588.143..192588.144 rows=0 loops=2)
Join Filter: ($0 IS NOT NULL)
Filter: ((sample_0.l_orderkey IS NULL) AND
(sample_0.l_shipmode IS NULL))
Rows Removed by Filter: 1810515312
-> Parallel Seq Scan on lineitem ref_0
(cost=0.00..1721.97 rows=35397 width=11) (actual time=0.007..3.797
rows=30088 loops=2)
-> Seq Scan on lineitem sample_0 (cost=0.00..1969.75
rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175)
Planning Time: 0.174 ms
Execution Time: 192670.458 ms
(19 rows)

I debug the code and find consider_parallel_nestloop() doesn't consider
materialized form of the cheapest inner path.
When enable_material = true, we can see Material path won in first plan,
but Parallel Seq Scan node doesn't add as outer path, which because
in try_partial_nestloop_path() , the cost of nestloop wat computed using
seq scan path not material path.

[1] include test table schema and data, you can repeat above problem.

I try fix this problem in attached patch, and I found pg12.12 also had
this issue. Please review my patch, thanks!

[1] https://github.com/tenderwg/tpch_test

#3Richard Guo
guofenglinux@gmail.com
In reply to: Tender Wang (#1)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Tue, Sep 5, 2023 at 4:52 PM tender wang <tndrwang@gmail.com> wrote:

I recently run benchmark[1] on master, but I found performance problem
as below:
...

I debug the code and find consider_parallel_nestloop() doesn't consider
materialized form of the cheapest inner path.

Yeah, this seems an omission in commit 45be99f8. I reviewed the patch
and here are some comments.

* I think we should not consider materializing the cheapest inner path
if we're doing JOIN_UNIQUE_INNER, because in this case we have to
unique-ify the inner path.

* I think we can check if it'd be parallel safe before creating the
material path, thus avoid the creation in unsafe cases.

* I don't think the test case you added works for the code changes.
Maybe a plan likes below is better:

explain (costs off)
select * from tenk1, tenk2 where tenk1.two = tenk2.two;
QUERY PLAN
----------------------------------------------
Gather
Workers Planned: 4
-> Nested Loop
Join Filter: (tenk1.two = tenk2.two)
-> Parallel Seq Scan on tenk1
-> Materialize
-> Seq Scan on tenk2
(7 rows)

Thanks
Richard

#4Tender Wang
tndrwang@gmail.com
In reply to: Richard Guo (#3)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Richard Guo <guofenglinux@gmail.com> 于2023年9月5日周二 18:51写道:

On Tue, Sep 5, 2023 at 4:52 PM tender wang <tndrwang@gmail.com> wrote:

I recently run benchmark[1] on master, but I found performance problem
as below:
...

I debug the code and find consider_parallel_nestloop() doesn't consider
materialized form of the cheapest inner path.

Yeah, this seems an omission in commit 45be99f8. I reviewed the patch
and here are some comments.

* I think we should not consider materializing the cheapest inner path
if we're doing JOIN_UNIQUE_INNER, because in this case we have to
unique-ify the inner path.

That's right. The V2 patch has been fixed.

* I think we can check if it'd be parallel safe before creating the
material path, thus avoid the creation in unsafe cases.

Agreed.

* I don't think the test case you added works for the code changes.
Maybe a plan likes below is better:

Agreed.

explain (costs off)

Show quoted text

select * from tenk1, tenk2 where tenk1.two = tenk2.two;
QUERY PLAN
----------------------------------------------
Gather
Workers Planned: 4
-> Nested Loop
Join Filter: (tenk1.two = tenk2.two)
-> Parallel Seq Scan on tenk1
-> Materialize
-> Seq Scan on tenk2
(7 rows)

Thanks
Richard

Attachments:

v2-0001-Parallel-seq-scan-should-consider-materila-inner-.patchtext/plain; charset=US-ASCII; name=v2-0001-Parallel-seq-scan-should-consider-materila-inner-.patchDownload+52-1
#5Robert Haas
robertmhaas@gmail.com
In reply to: Richard Guo (#3)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Tue, Sep 5, 2023 at 8:07 AM Richard Guo <guofenglinux@gmail.com> wrote:

Yeah, this seems an omission in commit 45be99f8.

It's been a while, but I think I omitted this deliberately because I
didn't really understand the value of it and wanted to keep the
planning cost down.

The example query provided here seems rather artificial. Surely few
people write a join clause that references neither of the tables being
joined. Is there a more realistic case where this makes a big
difference?

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Richard Guo
guofenglinux@gmail.com
In reply to: Robert Haas (#5)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Fri, Sep 8, 2023 at 3:15 AM Robert Haas <robertmhaas@gmail.com> wrote:

The example query provided here seems rather artificial. Surely few
people write a join clause that references neither of the tables being
joined. Is there a more realistic case where this makes a big
difference?

Yes the given example query is not that convincing. I tried a query
with plans as below (after some GUC setting) which might be more
realistic in real world.

unpatched:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Gather (cost=0.00..5522666.44 rows=160466667 width=301)
Workers Planned: 4
-> Nested Loop (cost=0.00..5522666.44 rows=40116667 width=301)
Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044
width=144)
-> Seq Scan on partsupp (cost=0.00..267.00 rows=8000 width=157)
(6 rows)

patched:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Gather (cost=0.00..1807085.44 rows=160466667 width=301)
Workers Planned: 4
-> Nested Loop (cost=0.00..1807085.44 rows=40116667 width=301)
Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044
width=144)
-> Materialize (cost=0.00..307.00 rows=8000 width=157)
-> Seq Scan on partsupp (cost=0.00..267.00 rows=8000
width=157)
(7 rows)

The execution time (ms) are (avg of 3 runs):

unpatched: 71769.21
patched: 65510.04

So we can see some (~9%) performance gains in this case.

Thanks
Richard

#7Tender Wang
tndrwang@gmail.com
In reply to: Richard Guo (#6)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Hi tom,
Do you have any comments or suggestions on this issue? Thanks.

Richard Guo <guofenglinux@gmail.com> 于2023年9月8日周五 14:06写道:

Show quoted text

On Fri, Sep 8, 2023 at 3:15 AM Robert Haas <robertmhaas@gmail.com> wrote:

The example query provided here seems rather artificial. Surely few
people write a join clause that references neither of the tables being
joined. Is there a more realistic case where this makes a big
difference?

Yes the given example query is not that convincing. I tried a query
with plans as below (after some GUC setting) which might be more
realistic in real world.

unpatched:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
QUERY PLAN

--------------------------------------------------------------------------------------
Gather (cost=0.00..5522666.44 rows=160466667 width=301)
Workers Planned: 4
-> Nested Loop (cost=0.00..5522666.44 rows=40116667 width=301)
Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044
width=144)
-> Seq Scan on partsupp (cost=0.00..267.00 rows=8000 width=157)
(6 rows)

patched:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
QUERY PLAN

--------------------------------------------------------------------------------------
Gather (cost=0.00..1807085.44 rows=160466667 width=301)
Workers Planned: 4
-> Nested Loop (cost=0.00..1807085.44 rows=40116667 width=301)
Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044
width=144)
-> Materialize (cost=0.00..307.00 rows=8000 width=157)
-> Seq Scan on partsupp (cost=0.00..267.00 rows=8000
width=157)
(7 rows)

The execution time (ms) are (avg of 3 runs):

unpatched: 71769.21
patched: 65510.04

So we can see some (~9%) performance gains in this case.

Thanks
Richard

#8David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#5)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Fri, 8 Sept 2023 at 09:41, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Sep 5, 2023 at 8:07 AM Richard Guo <guofenglinux@gmail.com> wrote:

Yeah, this seems an omission in commit 45be99f8.

It's been a while, but I think I omitted this deliberately because I
didn't really understand the value of it and wanted to keep the
planning cost down.

I think the value is potentially not having to repeatedly execute some
expensive rescan to the nested loop join once for each outer-side
tuple.

The planning cost is something to consider for sure, but it seems
strange that we deemed it worthy to consider material paths for the
non-parallel input paths but draw the line for the parallel/partial
ones. It seems to me that the additional costs and the possible
benefits are the same for both.

David

#9David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#6)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Fri, 8 Sept 2023 at 19:14, Richard Guo <guofenglinux@gmail.com> wrote:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
QUERY PLAN
--------------------------------------------------------------------------------------
Gather (cost=0.00..1807085.44 rows=160466667 width=301)
Workers Planned: 4
-> Nested Loop (cost=0.00..1807085.44 rows=40116667 width=301)
Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
-> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044 width=144)
-> Materialize (cost=0.00..307.00 rows=8000 width=157)
-> Seq Scan on partsupp (cost=0.00..267.00 rows=8000 width=157)
(7 rows)

The execution time (ms) are (avg of 3 runs):

unpatched: 71769.21
patched: 65510.04

This gap would be wider if the partsupp Seq Scan were filtering off
some rows and wider still if you added more rows to lineitem.
However, a clauseless seqscan is not the most compelling use case
below a material node. The inner side of the nested loop could be some
subquery that takes 6 days to complete. Running the 6 day query ~15044
times seems like something that would be good to avoid.

It seems worth considering Material paths to me. I think that the
above example could be tuned any way you like to make it look better
or worse.

David

#10Alena Rybakina
lena.ribackina@yandex.ru
In reply to: David Rowley (#9)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Hi!

Thank you for your work on the subject.

I reviewed your patch and found that your commit message does not fully
explain your code, in addition, I found several spelling mistakes.

I think it's better to change to:

With parallel seqscan, we should consider materializing the cheapest
inner path in
case of nested loop if it doesn't contain a unique node or it is unsafe
to use it in a subquery.

Besides, I couldn't understand why we again check that material path is
safe?

if (matpath != NULL && matpath->parallel_safe)
            try_partial_nestloop_path(root, joinrel, outerpath, matpath,
                                      pathkeys, jointype, extra);

--
Regards,
Alena Rybakina

#11Andrey Borodin
amborodin@acm.org
In reply to: Tender Wang (#7)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On 27 Sep 2023, at 16:06, tender wang <tndrwang@gmail.com> wrote:

Do you have any comments or suggestions on this issue? Thanks.

Hi Tender,

there are some review comments in the thread, that you might be interested in.
I'll mark this [0]https://commitfest.postgresql.org/47/4549/ entry "Waiting on Author" and move to next CF.

Thanks!

Best regards, Andrey Borodin.

[0]: https://commitfest.postgresql.org/47/4549/

#12Tender Wang
tndrwang@gmail.com
In reply to: Andrey Borodin (#11)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Andrey M. Borodin <x4mmm@yandex-team.ru> 于2024年4月8日周一 17:40写道:

On 27 Sep 2023, at 16:06, tender wang <tndrwang@gmail.com> wrote:

Do you have any comments or suggestions on this issue? Thanks.

Hi Tender,

there are some review comments in the thread, that you might be interested
in.
I'll mark this [0] entry "Waiting on Author" and move to next CF.

Thank you for the reminder. I will update the patch later.
I also deeply hope to get more advice about this patch.
(even the advice that not worth continuint to work on this patch).

Thanks.

Thanks!

Best regards, Andrey Borodin.

[0]https://commitfest.postgresql.org/47/4549/

--
Tender Wang
OpenPie: https://en.openpie.com/

#13Tender Wang
tndrwang@gmail.com
In reply to: Andrey Borodin (#11)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Andrey M. Borodin <x4mmm@yandex-team.ru> 于2024年4月8日周一 17:40写道:

On 27 Sep 2023, at 16:06, tender wang <tndrwang@gmail.com> wrote:

Do you have any comments or suggestions on this issue? Thanks.

Hi Tender,

there are some review comments in the thread, that you might be interested
in.
I'll mark this [0] entry "Waiting on Author" and move to next CF.

Thanks!

Best regards, Andrey Borodin.

[0]https://commitfest.postgresql.org/47/4549/

I have rebased master and fixed a plan diff case.
--
Tender Wang
OpenPie: https://en.openpie.com/

Attachments:

v3-0001-Support-materializing-inner-path-on-parallel-oute.patchapplication/octet-stream; name=v3-0001-Support-materializing-inner-path-on-parallel-oute.patchDownload+112-47
#14Tomasz Rybak
tomasz.rybak@post.pl
In reply to: Tender Wang (#13)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Tue, 2024-04-23 at 16:59 +0800, Tender Wang wrote:

[ cut ]

I have rebased master and fixed a plan diff case.

We (me, Paul Jungwirth, and Yuki Fujii) reviewed this patch
at PgConf.dev Patch Review Workshop.
Here are our findings.

Patch tries to allow for using materialization together
with parallel subqueries.
It applies cleanly on 8fea1bd5411b793697a4c9087c403887e050c4ac
(current HEAD).
Tests pass locally on macOS and Linux in VM under Windows.
Tests are also green in cfbot (for last 2 weeks; they were
red previously, probably because of need to rebase).

Please add more tests. Especially please add some negative tests;
current patch checks that it is safe to apply materialization. It would
be helpful to add tests checking that materialization is not applied
in both checked cases:
1. when inner join path is not parallel safe
2. when matpath is not parallel safe

This patch tries to apply materialization only when join type
is not JOIN_UNIQUE_INNER. Comment mentions this, but does not
explain why. So please either add comment describing reason for that
or try enabling materialization in such a case.

Best regards.

--
Tomasz Rybak, Debian Developer <serpent@debian.org>
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C

#15Tender Wang
tndrwang@gmail.com
In reply to: Tomasz Rybak (#14)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Tomasz Rybak <tomasz.rybak@post.pl> 于2024年5月31日周五 04:21写道:

On Tue, 2024-04-23 at 16:59 +0800, Tender Wang wrote:

[ cut ]

I have rebased master and fixed a plan diff case.

We (me, Paul Jungwirth, and Yuki Fujii) reviewed this patch
at PgConf.dev Patch Review Workshop.

Thanks for reviewing this patch.

Here are our findings.

Patch tries to allow for using materialization together
with parallel subqueries.
It applies cleanly on 8fea1bd5411b793697a4c9087c403887e050c4ac
(current HEAD).
Tests pass locally on macOS and Linux in VM under Windows.
Tests are also green in cfbot (for last 2 weeks; they were
red previously, probably because of need to rebase).

Please add more tests. Especially please add some negative tests;
current patch checks that it is safe to apply materialization. It would
be helpful to add tests checking that materialization is not applied
in both checked cases:
1. when inner join path is not parallel safe
2. when matpath is not parallel safe

I added a test case that inner rel is not parallel safe. Actually, matpath
will not create
if inner rel is not parallel safe. So I didn't add test case for the
second scenario.

This patch tries to apply materialization only when join type

is not JOIN_UNIQUE_INNER. Comment mentions this, but does not
explain why. So please either add comment describing reason for that
or try enabling materialization in such a case.

Yeah, Richard commented the v1 patch about JOIN_UNIQUE_INNER in [1]/messages/by-id/CAMbWs49LbQF_Z0iKPRPnTHfsRECT7M-4rF6ft5vpW1ARSpBkPA@mail.gmail.com

* I think we should not consider materializing the cheapest inner path
if we're doing JOIN_UNIQUE_INNER, because in this case we have to
unique-ify the inner path.

We don't consider material inner path if jointype is JOIN_UNIQUE_INNER in
match_unsorted_order().
So here is as same logic as match_unsorted_order(). I added comments to
explain why.

[1]: /messages/by-id/CAMbWs49LbQF_Z0iKPRPnTHfsRECT7M-4rF6ft5vpW1ARSpBkPA@mail.gmail.com
/messages/by-id/CAMbWs49LbQF_Z0iKPRPnTHfsRECT7M-4rF6ft5vpW1ARSpBkPA@mail.gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/

Attachments:

v4-0001-Support-materializing-inner-path-on-parallel-oute.patchapplication/octet-stream; name=v4-0001-Support-materializing-inner-path-on-parallel-oute.patchDownload+137-47
#16Fujii.Yuki@df.MitsubishiElectric.co.jp
Fujii.Yuki@df.MitsubishiElectric.co.jp
In reply to: Tender Wang (#15)
RE: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Hi. Tender.

Thank you for modification.

From: Tender Wang <tndrwang@gmail.com>
Sent: Tuesday, June 4, 2024 7:51 PM
Please add more tests. Especially please add some negative tests;
current patch checks that it is safe to apply materialization. It would
be helpful to add tests checking that materialization is not applied
in both checked cases:
1. when inner join path is not parallel safe
2. when matpath is not parallel safe

I added a test case that inner rel is not parallel safe. Actually,
matpath will not create if inner rel is not parallel safe. So I didn't add test case for the second scenario.

Is there case in which matpath is not parallel safe and inner rel is parallel safe?
If right, I think that it would be suitable to add a negative test in a such case.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

#17Tender Wang
tndrwang@gmail.com
In reply to: Fujii.Yuki@df.MitsubishiElectric.co.jp (#16)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Fujii.Yuki@df.MitsubishiElectric.co.jp <
Fujii.Yuki@df.mitsubishielectric.co.jp> 于2024年6月5日周三 09:26写道:

Hi. Tender.

Thank you for modification.

From: Tender Wang <tndrwang@gmail.com>
Sent: Tuesday, June 4, 2024 7:51 PM
Please add more tests. Especially please add some negative tests;
current patch checks that it is safe to apply materialization. It

would

be helpful to add tests checking that materialization is not

applied

in both checked cases:
1. when inner join path is not parallel safe
2. when matpath is not parallel safe

I added a test case that inner rel is not parallel safe. Actually,
matpath will not create if inner rel is not parallel safe. So I didn't

add test case for the second scenario.
Is there case in which matpath is not parallel safe and inner rel is
parallel safe?
If right, I think that it would be suitable to add a negative test in a
such case.

I looked through create_xxx_path(), and I found that almost
path.parallel_safe is assigned from RelOptiInfo.consider_parallel.
Some pathes take subpath->parallel_safe into account(e.g. Material path).
In most cases, Material is parallel_safe if rel is parallel
safe. Now I haven't come up a query plan that material is un parallel-safe
but rel is parallel-safe.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

--
Tender Wang
OpenPie: https://en.openpie.com/

#18Fujii.Yuki@df.MitsubishiElectric.co.jp
Fujii.Yuki@df.MitsubishiElectric.co.jp
In reply to: Tender Wang (#17)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Hi. Tender.

From: Tender Wang <tndrwang@gmail.com>
Sent: Tuesday, June 11, 2024 5:11 PM

From: Tender Wang <tndrwang@gmail.com <mailto:tndrwang@gmail.com> >
Sent: Tuesday, June 4, 2024 7:51 PM
Please add more tests. Especially please add some negative tests;
current patch checks that it is safe to apply materialization. It would
be helpful to add tests checking that materialization is not applied
in both checked cases:
1. when inner join path is not parallel safe
2. when matpath is not parallel safe

I added a test case that inner rel is not parallel safe. Actually,
matpath will not create if inner rel is not parallel safe. So I didn't add test case for the second scenario.

Is there case in which matpath is not parallel safe and inner rel is parallel safe?
If right, I think that it would be suitable to add a negative test in a such case.

I looked through create_xxx_path(), and I found that almost path.parallel_safe is assigned from
RelOptiInfo.consider_parallel.
Some pathes take subpath->parallel_safe into account(e.g. Material path). In most cases, Material is parallel_safe if rel is
parallel safe. Now I haven't come up a query plan that material is un parallel-safe but rel is parallel-safe.

Thank you for looking into the source code. I understand the situation now.

Sincerely yours,
Yuki Fujii

--
Yuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

#19Tender Wang
tndrwang@gmail.com
In reply to: Robert Haas (#5)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Hi Robert,

Since this patch had been reviewed at PgConf.dev Patch Review
Workshop. And I have updated
the patch according to the review advice. Now there are no others to
comment this patch.
The status of this patch on commitfest have stayed "need review" for a long
time.
I want to know if it is ready to move to the next status "Ready for
commiter".

Thanks.

--
Tender Wang

#20Richard Guo
guofenglinux@gmail.com
In reply to: Tender Wang (#15)
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

On Tue, Jun 4, 2024 at 6:51 PM Tender Wang <tndrwang@gmail.com> wrote:

Yeah, Richard commented the v1 patch about JOIN_UNIQUE_INNER in [1]

* I think we should not consider materializing the cheapest inner path
if we're doing JOIN_UNIQUE_INNER, because in this case we have to
unique-ify the inner path.

We don't consider material inner path if jointype is JOIN_UNIQUE_INNER in match_unsorted_order().
So here is as same logic as match_unsorted_order(). I added comments to explain why.

I looked through the v4 patch and found an issue. For the plan diff:

+         ->  Nested Loop
+               ->  Parallel Seq Scan on prt1_p1 t1_1
+               ->  Materialize
+                     ->  Sample Scan on prt1_p1 t2_1
+                           Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+                           Filter: (t1_1.a = a)

This does not seem correct to me. The inner path is parameterized by
the outer rel, in which case it does not make sense to add a Materialize
node on top of it.

I updated the patch to include a check in consider_parallel_nestloop
ensuring that inner_cheapest_total is not parameterized by outerrel
before materializing it. I also tweaked the comments, test cases and
commit message.

Thanks
Richard

Attachments:

v5-0001-Consider-materializing-the-cheapest-inner-path-in-parallel-nestloop.patchapplication/octet-stream; name=v5-0001-Consider-materializing-the-cheapest-inner-path-in-parallel-nestloop.patchDownload+66-1
#21Tender Wang
tndrwang@gmail.com
In reply to: Richard Guo (#20)
#22Richard Guo
guofenglinux@gmail.com
In reply to: Tender Wang (#21)
#23Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#22)
#24Tender Wang
tndrwang@gmail.com
In reply to: Richard Guo (#23)
#25Alexander Lakhin
exclusion@gmail.com
In reply to: Richard Guo (#23)
#26Richard Guo
guofenglinux@gmail.com
In reply to: Alexander Lakhin (#25)
#27Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#26)
#28Alexander Lakhin
exclusion@gmail.com
In reply to: Richard Guo (#27)
#29Richard Guo
guofenglinux@gmail.com
In reply to: Alexander Lakhin (#28)