Construction of Plan-node by CSP (RE: Custom/Foreign-Join-APIs)
-----Original Message-----
Sent: Friday, May 15, 2015 8:44 AM
To: 'Tom Lane'; Kohei KaiGai
Cc: Robert Haas; Thom Brown; Shigeru Hanada; pgsql-hackers@postgreSQL.org
Subject: RE: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)A possible compromise that we could perhaps still wedge into 9.5 is to
extend CustomPath with a List of child Paths, and CustomScan with a List
of child Plans, which createplan.c would know to build from the Paths,
and other modules would then also be aware of these children. I find that
uglier than a separate join node type, but it would be tolerable I guess.The attached patch implements what you suggested as is.
It allows custom-scan providers to have child Paths without exporting
create_plan_recurse(), and enables to represent N-way join naturally.
Please add any solution, even if we don't reach the consensus of how
create_plan_recurse (and other useful static functions) are visible to
extensions.
I updated the patch to fix up this problem towards the latest master
branch.
Let me remind the problem again. (I really have a hard time of it)
When an extension tries to implement its own join logic using custom-
scan interface, it adds CustomPath on set_join_pathlist_hook with its
cost estimation.
Once the path gets chosen by planner, PlanCustomPath callback shall be
invoked, then, the custom-scan provider will construct its CustomScan
node according to the path, and I expected it recursively initializes
underlying Path nodes (that work as join input, if any) using
create_plan_recurse().
However, at this moment, we didn't get 100% consensus to export this
function to extensions. So, later commit made this function as static
one, again.
Instead of this approach, Tom suggested to add a list of child Paths
on CustomPath node, then createplan.c calls create_plan_recurse() for
each entry of the list, without this function getting exported.
I can agree with this approach as an alternative of the previous
public create_plan_recurse(), and the attached patch implements this
idea, as is. (Do I understand his suggestion correctly?)
Below is the expectation of the custom-scan provider which takes
underlying Path/Plan nodes.
1. It adds a list of underlying Path nodes on custom_children of
the CustomPath node.
2. On the PlanCustomPath, it adds adds Plan nodes (initialized by
createplan.c, and passed as argument) onto lefttree, righttree
and/or custom_children of CustomScan node
3. On the BeginCustomScan, it calls ExecInitNode() to begin execution
of the underlying plan node. Then, if it has more than 2 children,
attach these PlanState objects on the custom_children list for
EXPLAIN output.
As long as extension follows the above interface contract, it can
have underlying child Path/Plan/PlanState without direct call of
create_plan_recurse() as previously argued.
I think it is enough reasonable solution for the problem.
How about people's thought?
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
Attachments:
custom-join-children.v2.patchapplication/octet-stream; name=custom-join-children.v2.patchDownload+84-7
On Mon, May 25, 2015 at 5:08 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
I updated the patch to fix up this problem towards the latest master
branch.
[ ... ]
Instead of this approach, Tom suggested to add a list of child Paths
on CustomPath node, then createplan.c calls create_plan_recurse() for
each entry of the list, without this function getting exported.
Tom, do you want to review this patch and figure out how to solve the
underlying problem? If not, I will take care of it. But I will be
unhappy if I put time and effort into this and then you insist on
changing everything afterwards, again.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
Tom, do you want to review this patch and figure out how to solve the
underlying problem? If not, I will take care of it. But I will be
unhappy if I put time and effort into this and then you insist on
changing everything afterwards, again.
[ sorry for slow response, been busy ] I will take a look.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Thursday, May 28, 2015 5:35 AM
To: Robert Haas
Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Shigeru Hanada;
pgsql-hackers@postgreSQL.org
Subject: Re: [HACKERS] Construction of Plan-node by CSP (RE:
Custom/Foreign-Join-APIs)Robert Haas <robertmhaas@gmail.com> writes:
Tom, do you want to review this patch and figure out how to solve the
underlying problem? If not, I will take care of it. But I will be
unhappy if I put time and effort into this and then you insist on
changing everything afterwards, again.[ sorry for slow response, been busy ] I will take a look.
Tom, how about your availability?
From my side, I adjust my extension (PG-Strom) to fit the infrastructure you proposed,
then confirmed it is workable even if custom-scan, that replaced relations join, takes
more than two Path nodes in the custom_children list of CustomPath, with no exportiong
create_plan_recurse().
Below is an example of custom-scan (GpuJoin) that involves four relations join.
Its code base is the latest master + custom-join-children.v2.patch; unchanged from
the last post.
postgres=# explain analyze select avg(x) from t0 natural join t1 natural join t2 natural join t3 group by cat;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=298513.77..298514.10 rows=26 width=12) (actual time=5622.028..5622.033 rows=26 loops=1)
Group Key: t0.cat
-> Custom Scan (GpuJoin) (cost=4702.00..249169.85 rows=9868784 width=12) (actual time=540.718..2075.566 rows=10000000 loops=1)
Bulkload: On (density: 100.00%)
Depth 1: Logic: GpuHashJoin, HashKeys: (cid), JoinQual: (cid = cid), nrows_ratio: 0.98936439
Depth 2: Logic: GpuHashJoin, HashKeys: (bid), JoinQual: (bid = bid), nrows_ratio: 0.99748135
Depth 3: Logic: GpuHashJoin, HashKeys: (aid), JoinQual: (aid = aid), nrows_ratio: 1.00000000
-> Custom Scan (BulkScan) on t0 (cost=0.00..242858.60 rows=10000060 width=24) (actual time=8.555..903.864 rows=10000000 loops=1)
-> Seq Scan on t3 (cost=0.00..734.00 rows=40000 width=4) (actual time=0.019..4.370 rows=40000 loops=1)
-> Seq Scan on t2 (cost=0.00..734.00 rows=40000 width=4) (actual time=0.004..4.182 rows=40000 loops=1)
-> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=4) (actual time=0.005..4.275 rows=40000 loops=1)
Planning time: 0.918 ms
Execution time: 6178.264 ms
(13 rows)
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
Attachments:
custom-join-children.v2.patchapplication/octet-stream; name=custom-join-children.v2.patchDownload+84-7
Robert Haas <robertmhaas@gmail.com> writes:
Tom, do you want to review this patch and figure out how to solve the
underlying problem? If not, I will take care of it. But I will be
unhappy if I put time and effort into this and then you insist on
changing everything afterwards, again.[ sorry for slow response, been busy ] I will take a look.
regards, tom lane
Tom, please don't forget the problem.
It is still problematic for custom-scan provider that tries to
implement own join logic, thus we still have to apply additional
patch (or copy&paste createplan.c to module's source).
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>