FDW: ForeignPlan and parameterized paths
Hello.
I've noticed that, when implementing an FDW, it is difficult to use a plan whose
best path is a parameterized path. This is because the parameterized clause is
not easily available at plan time.
This is what I understood from how it works:
- The clauses coming from the best path restrictinfo are not available in the
scan_clauses argument to the GetForeignPlan function.
- They are, however, directly available on the path, but at this point the
clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then
replaced by a Param node, using the replace_nestloop_params function.
It could be useful to make the "parameterized" version of the clause (in the
form InnerVar OPERATOR Param) available to the FDW at plan time.
Would this be possible?
Perhaps by replacing the clauses on the restrictinfo nodes from the path's param
info with the "parameterized" clauses, and then adding these to the scan clauses
passed to GetForeignPlan?
Regards,
--
Ronan Dunklau
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Ronan Dunklau <rdunklau@gmail.com> writes:
> I've noticed that, when implementing an FDW, it is difficult to use a plan whose
> best path is a parameterized path. This is because the parameterized clause is
> not easily available at plan time.
> This is what I understood from how it works:
> - The clauses coming from the best path restrictinfo are not available in the
> scan_clauses argument to the GetForeignPlan function.
> - They are, however, directly available on the path, but at this point the
> clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then
> replaced by a Param node, using the replace_nestloop_params function.
> It could be useful to make the "parameterized" version of the clause (in the
> form InnerVar OPERATOR Param) available to the FDW at plan time.
> Would this be possible?
I intentionally did the nestloop_params substitution after calling
GetForeignPlan, not before. It's not apparent to me why it would be
useful to do it before, because the FDW is going to have no idea what
those params represent. (Note that they represent values coming from
some other, probably local, relation; not from the foreign table.)
regards, tom lane
> I intentionally did the nestloop_params substitution after calling
> GetForeignPlan not before. It's not apparent to me why it would be
> useful to do it before, because the FDW is going to have no idea what
> those params represent. (Note that they represent values coming from
> some other, probably local, relation; not from the foreign table.)
Even if the FDW has no idea what they represent, it can identify a
clause of the form Var Operator Param, which allows it to store the param
reference (paramid) and retrieve the param value at execution time.
If the chosen best path is a parameterized path that was built by
the FDW, this allows the restriction to be pushed down.
If this isn't possible, the only way I have found to use those clauses
is at scan time.
Let's assume atable is a local relation, and aftable is a foreign
table, and the query looks like this:
select * from atable t1 inner join aftable t2 on t1.c1 = t2.c1
The FDW identifies the join clause on its column c1, and builds a
parameterized path on this column (maybe because this column is unique
and indexed on the remote side).
The planner chooses this path, building a nested loop rescanning the
foreign table with this parameter value reflecting the outer relation
value (maybe because the local relation's size is much smaller than
the remote relation's size).
In that case, it seems to be of particular importance to have access
to the clause, so that the nested loop can work as intended: avoiding
a full seqscan on the remote side.
Or is there another way to achieve the same goal?
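To make the cost difference in this scenario concrete, here is a minimal, self-contained sketch. It does not use any PostgreSQL API: ToyRemote, toy_scan_param, toy_scan_full and toy_nestloop are hypothetical names invented for illustration, and the "remote side" is just an in-memory array. It only counts how many rows the remote side would have to ship when the join clause on c1 is pushed down on each rescan, compared with when it is not.

```c
#include <stddef.h>

/* Hypothetical stand-in for the remote table aftable: only column c1,
 * plus a counter of rows shipped to the local server. */
typedef struct {
    const int *c1;        /* remote values of column c1 */
    size_t nrows;         /* number of remote rows */
    size_t rows_shipped;  /* rows transferred so far */
} ToyRemote;

/* Rescan with the qual "c1 = key" evaluated remotely:
 * only matching rows are shipped. */
static size_t toy_scan_param(ToyRemote *r, int key)
{
    size_t matches = 0;
    for (size_t i = 0; i < r->nrows; i++)
        if (r->c1[i] == key)
            matches++;
    r->rows_shipped += matches;
    return matches;
}

/* Rescan without the qual: every remote row is shipped and the
 * filter is applied locally. */
static size_t toy_scan_full(ToyRemote *r, int key)
{
    size_t matches = 0;
    for (size_t i = 0; i < r->nrows; i++)
        if (r->c1[i] == key)
            matches++;
    r->rows_shipped += r->nrows;
    return matches;
}

/* Nested loop: one rescan of the foreign side per outer (local) row.
 * Returns the number of joined rows. */
static size_t toy_nestloop(const int *outer, size_t nouter,
                           ToyRemote *r, int pushdown)
{
    size_t joined = 0;
    for (size_t i = 0; i < nouter; i++)
        joined += pushdown ? toy_scan_param(r, outer[i])
                           : toy_scan_full(r, outer[i]);
    return joined;
}
```

Both variants return the same join result; only rows_shipped differs, which is the reason the pushed-down clause matters when the local relation is small and the remote one is large.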
Regards,
--
Ronan Dunklau
Ronan Dunklau <rdunklau@gmail.com> writes:
>> I intentionally did the nestloop_params substitution after calling
>> GetForeignPlan not before. It's not apparent to me why it would be
>> useful to do it before, because the FDW is going to have no idea what
>> those params represent. (Note that they represent values coming from
>> some other, probably local, relation; not from the foreign table.)
> Even if the FDW has no idea what they represent, it can identify a
> clause of the form Var Operator Param, which allows it to store the param
> reference (paramid) and retrieve the param value at execution time.
I don't see any plausible reason for an FDW to special-case nestloop
params like that. What an FDW should be looking for is clauses of the
form Var-of-foreign-table Operator Expression-not-involving-foreign-table,
and a Param is just one case of Expression-not-involving-foreign-table.
(Compare the handling of indexscan clauses: indxpath.c doesn't much care
what's on the righthand side of an indexable clause, so long as there
is no Var of the indexed table there.)
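As a rough illustration of that test, here is a self-contained sketch over a toy expression tree. It deliberately does not use PostgreSQL's real node structs or planner helpers: ToyExpr, toy_mentions_rel and toy_is_pushable are hypothetical names, and a real FDW would work on the planner's own expression nodes instead. The point is only that the check looks at which relation the Vars belong to, so the same function accepts the clause whether its right-hand side is an outer Var or a Param.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags; NOT PostgreSQL's node tags. */
typedef enum { TOY_VAR, TOY_PARAM, TOY_CONST, TOY_OP } ToyTag;

typedef struct ToyExpr {
    ToyTag tag;
    int relid;                    /* TOY_VAR: relation the column belongs to */
    int attno;                    /* TOY_VAR: column number */
    const struct ToyExpr *left;   /* TOY_OP: left operand */
    const struct ToyExpr *right;  /* TOY_OP: right operand */
} ToyExpr;

/* True if any Var of relation `relid` appears anywhere in the expression. */
static bool toy_mentions_rel(const ToyExpr *e, int relid)
{
    if (e == NULL)
        return false;
    switch (e->tag) {
    case TOY_VAR:
        return e->relid == relid;
    case TOY_OP:
        return toy_mentions_rel(e->left, relid) ||
               toy_mentions_rel(e->right, relid);
    default:
        return false;  /* Param and Const never reference a relation */
    }
}

/* Accept "Var-of-foreign-table OP expr" where expr contains no Var of
 * the foreign table. What expr is made of (outer Var, Param, Const, or
 * a mix) is not inspected any further. */
static bool toy_is_pushable(const ToyExpr *clause, int foreign_relid)
{
    if (clause == NULL || clause->tag != TOY_OP)
        return false;
    if (clause->left == NULL || clause->right == NULL)
        return false;
    if (clause->left->tag != TOY_VAR || clause->left->relid != foreign_relid)
        return false;
    return !toy_mentions_rel(clause->right, foreign_relid);
}
```

Because the check never asks whether the right-hand side is a Param, one function of this shape could in principle serve both when building parameterized paths and when building the plan.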
Moreover, in order to do effective parameterized-path creation in the
first place, the FDW's GetForeignPaths function will already have had
to recognize these same clauses in their original form. If we do the
param substitution before calling GetForeignPlan, that will just mean
that the two functions can't share code anymore.
Or in short: the fact that the righthand-side expression gets replaced
(perhaps only partially) by a Param is an implementation detail of the
executor's expression evaluation methods. The FDW shouldn't care about
that, only about the result of the expression.
regards, tom lane