Problems with plan estimates in postgres_fdw

Started by Andrew Gierthover 7 years ago48 messageshackers
Jump to latest
#1Andrew Gierth
andrew@tao11.riddles.org.uk

This analysis comes from investigating a report from an IRC user. A
summary of the initial report is:

Using PG 9.6.9 and postgres_fdw, a query of the form "select * from
foreign_table order by col limit 1" is getting a local Sort plan, not
pushing the ORDER BY to the remote. Turning off use_remote_estimates
changes the plan to use a remote sort, with a 10000x speedup.

I don't think this can be called a bug, exactly, and I don't have an
immediate fix, so I'm putting this analysis up for the benefit of anyone
working on this in future.

The cause of the misplan seems to be this: postgres_fdw with
use_remote_estimates on does not attempt to obtain fast-start plans from
the remote. In this case what happens is this:

1. postgres_fdw gets the cost estimate from the plain remote fetch, by
doing "EXPLAIN select * from table". This produces a plan with a low
startup cost (just the constant overhead) and a high total cost (on
the order of 1.2e6 in this case).

2. postgres_fdw gets the cost estimate for the ordered fetch, by doing
"EXPLAIN select * from table order by col". Note that there is no
LIMIT nor any cursor_tuple_fraction in effect, so the plan returned
in this case is a seqscan+sort plan (in spite of the presence of an
index on "col"), with a very high (order of 8e6) startup and total
cost.

So when the local side tries to generate paths, it has the choice of
using a remote-ordered path with startup cost 8e6, or a local top-1
sort on top of an unordered remote path, which has a total cost on the
order of 1.5e6 in this case; cheaper than the remote sort because this
only needs to do top-1, while the remote is sorting millions of rows
and would probably spill to disk.

However, when it comes to actual execution, postgres_fdw opens a cursor
for the remote query, which means that cursor_tuple_fraction will come
into play. As far as I can tell, this is not set anywhere, so this means
that the plan that actually gets run on the remote is likely to have
_completely_ different costs from those returned by the EXPLAINs. In
particular, in this case the fast-start index-scan plan for the ORDER BY
remote query is clearly being chosen when use_remote_estimates is off
(since the query completes in 15ms rather than 150 seconds).

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

--
Andrew (irc:RhodiumToad)

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andrew Gierth (#1)
Re: Problems with plan estimates in postgres_fdw

Hello.

At Thu, 02 Aug 2018 01:06:41 +0100, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote in <87pnz1aby9.fsf@news-spur.riddles.org.uk>

This analysis comes from investigating a report from an IRC user. A
summary of the initial report is:

Using PG 9.6.9 and postgres_fdw, a query of the form "select * from
foreign_table order by col limit 1" is getting a local Sort plan, not
pushing the ORDER BY to the remote. Turning off use_remote_estimates
changes the plan to use a remote sort, with a 10000x speedup.

I don't think this can be called a bug, exactly, and I don't have an
immediate fix, so I'm putting this analysis up for the benefit of anyone
working on this in future.

I didn't see the concrete estimates, it seems that the cause is
too-small total cost of non-remote-sorted plan compared with the
startup cost of remote-sorted one. In other words, tuple cost by
the remoteness is estimated as too small. Perhaps setting
fdw_tuple_cost to , say 1 as an extreme value, will bring victory
to remote sort path for the query.

The cause of the misplan seems to be this: postgres_fdw with
use_remote_estimates on does not attempt to obtain fast-start plans from
the remote. In this case what happens is this:

1. postgres_fdw gets the cost estimate from the plain remote fetch, by
doing "EXPLAIN select * from table". This produces a plan with a low
startup cost (just the constant overhead) and a high total cost (on
the order of 1.2e6 in this case).

2. postgres_fdw gets the cost estimate for the ordered fetch, by doing
"EXPLAIN select * from table order by col". Note that there is no
LIMIT nor any cursor_tuple_fraction in effect, so the plan returned
in this case is a seqscan+sort plan (in spite of the presence of an
index on "col"), with a very high (order of 8e6) startup and total
cost.

So when the local side tries to generate paths, it has the choice of
using a remote-ordered path with startup cost 8e6, or a local top-1
sort on top of an unordered remote path, which has a total cost on the
order of 1.5e6 in this case; cheaper than the remote sort because this
only needs to do top-1, while the remote is sorting millions of rows
and would probably spill to disk.

A simple test at hand showed that (on a unix-domain connection):

=# explain (verbose on, analyze on) select * from ft1 order by a;

Foreign Scan on public.ft1 (cost=9847.82..17097.82 rows=100000 width=4)
(actual time=195.861..515.747 rows=100000 loops=1)

=# explain (verbose on, analyze on) select * from ft1;

Foreign Scan on public.ft1 (cost=100.00..8543.00 rows=100000 width=4)
(actual time=0.659..399.427 rows=100000 loops=1)

The cost is apaprently wrong. On my environment fdw_startup_cost
= 45 and fdw_tuple_cost = 0.2 gave me an even cost/actual time
ratio *for these queries*. (hard coded default is 100 and
0.01. Of course this disucussion is ignoring the accuracy of
local-execution estimate.)

=# explain (verbose on, analyze on) select * from ft1 order by a;

Foreign Scan on public.ft1 (cost=9792.82..31042.82 rows=100000 width=4)
(actual time=201.493..533.913 rows=100000 loops=1)

=# explain (verbose on, analyze on) select * from ft1;

Foreign Scan on public.ft1 (cost=45.00..22488.00 rows=100000 width=4)
(actual time=0.837..484.469 rows=100000 loops=1)

This gave me a remote-sorted plan for "select * from ft1 order by
a limit 1". (But also gave me a remote-sorted plan without a
LIMIT..)

However, when it comes to actual execution, postgres_fdw opens a cursor
for the remote query, which means that cursor_tuple_fraction will come
into play. As far as I can tell, this is not set anywhere, so this means
that the plan that actually gets run on the remote is likely to have
_completely_ different costs from those returned by the EXPLAINs. In
particular, in this case the fast-start index-scan plan for the ORDER BY
remote query is clearly being chosen when use_remote_estimates is off
(since the query completes in 15ms rather than 150 seconds).

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

Cursor fraction seems working since the foreign scan with remote
sort has a cost with different startup and total values. The
problem seems to be a too-small tuple cost.

So, we might have a room for improvement on
DEFAULT_FDW_STARTUP_COST, DEFAULT_FDW_TUPLE_COST and
DEFAULT_FDW_SORT_MULTIPLIER settings.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#1)
Re: Problems with plan estimates in postgres_fdw

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]

Yeah, that's basically not accounted for at all in the current design.

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start plans,
it might make sense to build two scan paths for a foreign table, one based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.

regards, tom lane

#4Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Tom Lane (#3)
Re: Problems with plan estimates in postgres_fdw

(2018/08/02 23:41), Tom Lane wrote:

Andrew Gierth<andrew@tao11.riddles.org.uk> writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]

Yeah, that's basically not accounted for at all in the current design.

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start plans,
it might make sense to build two scan paths for a foreign table, one based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.

Agreed, but ISTM that to address the original issue, it would be enough
to jsut add LIMIT (or ORDER BY LIMIT) pushdown to postgres_fdw based on
the upper-planner-pathification work.

I'm sorry that I'm really late for the party.

Best regards,
Etsuro Fujita

#5Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#4)
Re: Problems with plan estimates in postgres_fdw

(2018/10/05 19:15), Etsuro Fujita wrote:

(2018/08/02 23:41), Tom Lane wrote:

Andrew Gierth<andrew@tao11.riddles.org.uk> writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]

Yeah, that's basically not accounted for at all in the current design.

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start plans,
it might make sense to build two scan paths for a foreign table, one
based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.

Agreed, but ISTM that to address the original issue, it would be enough
to jsut add LIMIT (or ORDER BY LIMIT) pushdown to postgres_fdw based on
the upper-planner-pathification work.

Will work on it unless somebody else wants to.

Best regards,
Etsuro Fujita

#6Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#5)
Re: Problems with plan estimates in postgres_fdw

(2018/10/09 14:48), Etsuro Fujita wrote:

(2018/10/05 19:15), Etsuro Fujita wrote:

(2018/08/02 23:41), Tom Lane wrote:

Andrew Gierth<andrew@tao11.riddles.org.uk> writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]

Yeah, that's basically not accounted for at all in the current design.

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the
query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start
plans,
it might make sense to build two scan paths for a foreign table, one
based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.

Agreed, but ISTM that to address the original issue, it would be enough
to jsut add LIMIT (or ORDER BY LIMIT) pushdown to postgres_fdw based on
the upper-planner-pathification work.

Will work on it unless somebody else wants to.

Here is a set of WIP patches for pushing down ORDER BY LIMIT to the remote:

* 0001-postgres-fdw-upperrel-ordered-WIP.patch:
This patch performs the UPPERREL_ORDERED step remotely.

* 0002-postgres-fdw-upperrel-final-WIP.patch:
This patch performs the UPPERREL_FINAL step remotely. Currently, this
only supports for SELECT commands, and pushes down LIMIT/OFFSET to the
remote if possible. This also removes LockRows if there is a FOR
UPDATE/SHARE clause, which would be safe because postgres_fdw performs
early locking. I'd like to leave INSERT, UPDATE and DELETE cases for
future work. It is my long-term todo to rewrite PlanDirectModify using
the upper-planner-pathification work. :)

For some regression test cases with ORDER BY and/or LIMIT, I noticed
that these patches still cannot push down those clause to the remote. I
guess it would be needed to tweak the cost/size estimation added by
these patches, but I didn't look at that in detail yet. Maybe I'm
missing something, though.

Comments welcome!

Best regards,
Etsuro Fujita

Attachments:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-WIP.patchtext/x-patch; name=0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-WIP.patchDownload+403-109
0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-WIP.patchtext/x-patch; name=0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-WIP.patchDownload+461-398
#7Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#6)
Re: Problems with plan estimates in postgres_fdw

(2018/12/17 22:09), Etsuro Fujita wrote:

Here is a set of WIP patches for pushing down ORDER BY LIMIT to the remote:

* 0001-postgres-fdw-upperrel-ordered-WIP.patch:

Correction:
0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-WIP.patch

* 0002-postgres-fdw-upperrel-final-WIP.patch:

Correction: 0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-WIP.patch

Best regards,
Etsuro Fujita

#8Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#6)
Re: Problems with plan estimates in postgres_fdw

(2018/12/17 22:09), Etsuro Fujita wrote:

Here is a set of WIP patches for pushing down ORDER BY LIMIT to the remote:

For some regression test cases with ORDER BY and/or LIMIT, I noticed
that these patches still cannot push down those clause to the remote. I
guess it would be needed to tweak the cost/size estimation added by
these patches, but I didn't look at that in detail yet. Maybe I'm
missing something, though.

I looked into that. In the previous patch, I estimated costs for
performing grouping/aggregation with ORDER BY remotely, using the same
heuristic as scan/join cases if appropriate (i.e., I assumed that a
remote sort for grouping/aggregation also costs 20% extra), but that
seems too large for grouping/aggregation. So I reduced that to 5%
extra. The result showed that some test cases can be pushed down, but
some still cannot. I think we could push down the ORDER BY in all cases
by reducing the extra cost to a much smaller value, but I'm not sure
what value is reasonable.

Attached is an updated version of the patch. Other changes:

* Modified estimate_path_cost_size() further so that it accounts for
tlist eval cost as well
* Added more comments
* Simplified code a little bit

I'll add this to the upcoming CF.

Best regards,
Etsuro Fujita

Attachments:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely.patchtext/x-patch; name=0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely.patchDownload+474-125
0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely.patchtext/x-patch; name=0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely.patchDownload+470-398
#9Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#8)
Re: Problems with plan estimates in postgres_fdw

(2018/12/26 16:35), Etsuro Fujita wrote:

Attached is an updated version of the patch. Other changes:

While self-reviewing the patch I noticed a thinko in the patch 0001 for
pushing down the final sort: I estimated the costs for that using
estimate_path_cost_size that was modified so that it factored the
limit_tuples limit (if any) into the costs, but I think that was wrong;
that should not factor that because the remote query corresponding to
the pushdown step won't have LIMIT. So I fixed that. Also, a new data
structure I added to include/nodes/relation.h (ie, OrderedPathExtraData)
is no longer needed, so I removed that. Attached is a new version of
the patch.

Best regards,
Etsuro Fujita

Attachments:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v2.patchtext/x-patch; name=0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v2.patchDownload+452-123
0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v2.patchtext/x-patch; name=0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v2.patchDownload+470-398
#10Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#9)
Re: Problems with plan estimates in postgres_fdw

(2018/12/28 15:50), Etsuro Fujita wrote:

Attached is a new version of the
patch.

Here is an updated version of the patch set. Changes are:

* In the previous version, LIMIT without OFFSET was not performed
remotely as the costs was calculated the same as the costs of performing
it locally. (Actually, such LIMIT was performed remotely in a case in
the postgres_fdw regression test, but that was due to a bug :(.) I
think we should prefer performing such LIMIT remotely as that avoids
extra row fetches from the remote that performing it locally might
cause, so I tweaked the costs estimated in estimate_path_cost_size(), to
ensure that we'll prefer performing such LIMIT remotely. (I fixed the
mentioned bug as well.)

* I fixed another bug in adjusting the costs of pre-sorted
foreign-grouping paths.

* I tweaked comments, and rebased the patch set against the latest HEAD.

Best regards,
Etsuro Fujita

Attachments:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patchtext/x-patch; name=0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patchDownload+454-124
0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v3.patchtext/x-patch; name=0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v3.patchDownload+508-422
#11Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#10)
Re: Problems with plan estimates in postgres_fdw

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

(2018/12/28 15:50), Etsuro Fujita wrote:

Attached is a new version of the
patch.

Here is an updated version of the patch set. Changes are:

I've spent some time reviewing the patches.

First, I wonder if the information on LIMIT needs to be passed to the
FDW. Cannot the FDW routines use root->tuple_fraction? I think (although have
not verified experimentally) that the problem reported in [1]/messages/by-id/87pnz1aby9.fsf@news-spur.riddles.org.uk is not limited
to the LIMIT clause. I think the same problem can happen if client uses cursor
and sets cursor_tuple_fraction very low. Since the remote node is not aware of
this setting when running EXPLAIN, the low fraction can also make postgres_fdw
think that retrieval of the whole set is cheaper than it will appear to be
during the actual execution.

Some comments on coding:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch
-----------------------------------------------------------------

* I'm not sure if UPPERREL_ORDERD should to deal with LIMIT at all. Note that
grouping_planner() does not call create_limit_path() until it's done with
create_ordered_paths(), and create_ordered_paths() is where the FDW is
requested to add its paths to UPPERREL_ORDERED relation.

* add_foreign_ordered_paths()

Exprssion root->upper_targets[UPPERREL_FINAL] is used to access the target.

I think create_ordered_paths() should actually set the target so that
postgresGetForeignUpperPaths() can simply find it in
output_rel->reltarget. This is how postgres_fdw already retrieves the target
for grouped paths. Small patch is attached that makes this possible.

* regression tests: I think test(s) should be added for queries that have
ORDER BY clause but do not have GROUP BY (and also no LIMIT / OFFSET)
clause. I haven't noticed such tests.

0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v3.patch
---------------------------------------------------------------

* add_foreign_final_paths()

Similar problem I reported for add_foreign_ordered_paths() above.

* adjust_limit_rows_costs()

Doesn't this function address more generic problem than the use of
postgres_fdw? If so, then it should be submitted as a separate patch. BTW, the
function is only used in pathnode.c, so it should be declared static.

Both patches:
------------

One thing that makes me confused when I try to understand details is that the
new functions add_foreign_ordered_paths() and add_foreign_final_paths() both
pass "input_rel" for the "foreignrel" argument of estimate_path_cost_size():

estimate_path_cost_size(root, input_rel, ...)

whereas the existing add_foreign_grouping_paths() passes "grouped_rel":

estimate_path_cost_size(root, grouped_rel, ...)

Can you please make this consistent? If you do, you'll probably need to
reconsider your changes to estimate_path_cost_size().

And maybe related problem: why should FDW care about the effect of
apply_scanjoin_target_to_paths() like you claim to do here?

/*
* If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the
* final scan/join relation, the costs obtained from the cache
* wouldn't yet contain the eval cost for the final scan/join target,
* which would have been updated by apply_scanjoin_target_to_paths();
* add the eval cost now.
*/
if (fpextra && !IS_UPPER_REL(foreignrel))
{
/* The costs should have been obtained from the cache. */
Assert(fpinfo->rel_startup_cost >= 0 &&
fpinfo->rel_total_cost >= 0);

startup_cost += foreignrel->reltarget->cost.startup;
run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}

I think it should not, whether "foreignrel" means "input_rel" or "output_rel"
from the perspective of postgresGetForeignUpperPaths().

[1]: /messages/by-id/87pnz1aby9.fsf@news-spur.riddles.org.uk

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

Attachments:

targets.patchtext/x-diffDownload+13-0
#12Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#11)
Re: Problems with plan estimates in postgres_fdw

Hi Antonin,

(2019/02/08 2:04), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

Here is an updated version of the patch set. Changes are:

I've spent some time reviewing the patches.

Thanks for the review!

First, I wonder if the information on LIMIT needs to be passed to the
FDW. Cannot the FDW routines use root->tuple_fraction?

I think it's OK for the FDW to use the tuple_fraction, but I'm not sure
the tuple_fraction should be enough. For example, I don't think we can
re-compute from the tuple_fraction the LIMIT/OFFSET values.

I think (although have
not verified experimentally) that the problem reported in [1] is not limited
to the LIMIT clause. I think the same problem can happen if client uses cursor
and sets cursor_tuple_fraction very low. Since the remote node is not aware of
this setting when running EXPLAIN, the low fraction can also make postgres_fdw
think that retrieval of the whole set is cheaper than it will appear to be
during the actual execution.

I think it would be better to get a fast-start plan on the remote side
in such a situation, but I'm not sure we can do that as well with this
patch, because in the cursor case, the FDW couldn't know in advance
exactly how may rows will be fetched from the remote side using that
cursor, so the FDW couldn't push down LIMIT. So I'd like to leave that
for another patch.

Some comments on coding:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch
-----------------------------------------------------------------

* I'm not sure if UPPERREL_ORDERD should to deal with LIMIT at all. Note that
grouping_planner() does not call create_limit_path() until it's done with
create_ordered_paths(), and create_ordered_paths() is where the FDW is
requested to add its paths to UPPERREL_ORDERED relation.

Maybe I'm missing something, but the 0001 patch doesn't deal with LIMIT
at all. I added the parameter limit_tuples to PgFdwPathExtraData so
that we can pass limit_tuples=-1, which means no LIMIT, to cost_sort(),
though.

* add_foreign_ordered_paths()

Exprssion root->upper_targets[UPPERREL_FINAL] is used to access the target.

I think create_ordered_paths() should actually set the target so that
postgresGetForeignUpperPaths() can simply find it in
output_rel->reltarget. This is how postgres_fdw already retrieves the target
for grouped paths. Small patch is attached that makes this possible.

Seems like a good idea. Thanks for the patch! Will review.

* regression tests: I think test(s) should be added for queries that have
ORDER BY clause but do not have GROUP BY (and also no LIMIT / OFFSET)
clause. I haven't noticed such tests.

Will do.

0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v3.patch
---------------------------------------------------------------

* add_foreign_final_paths()

Similar problem I reported for add_foreign_ordered_paths() above.

* adjust_limit_rows_costs()

Doesn't this function address more generic problem than the use of
postgres_fdw? If so, then it should be submitted as a separate patch. BTW, the
function is only used in pathnode.c, so it should be declared static.

Actually, this is simple refactoring for create_limit_path() so that
that function can be shared with postgres_fdw. See
estimate_path_cost_size(). I'll separate that refactoring in the next
version of the patch set.

Both patches:
------------

One thing that makes me confused when I try to understand details is that the
new functions add_foreign_ordered_paths() and add_foreign_final_paths() both
pass "input_rel" for the "foreignrel" argument of estimate_path_cost_size():

estimate_path_cost_size(root, input_rel, ...)

whereas the existing add_foreign_grouping_paths() passes "grouped_rel":

estimate_path_cost_size(root, grouped_rel, ...)

Can you please make this consistent? If you do, you'll probably need to
reconsider your changes to estimate_path_cost_size().

This is intended and I think I should add more comments on that. Let me
explain. These patches modified estimate_path_cost_size() so that we
can estimate the costs of a foreign path for the UPPERREL_ORDERED or
UPPERREL_FINAL rel, by adding the costs of the upper-relation processing
(ie, ORDER BY and/or LIMIT/OFFSET, specified by PgFdwPathExtraData) to
the costs of implementing the *underlying* relation for the upper
relation (ie, scan, join or grouping rel, specified by the input_rel).
So those functions call estimate_path_cost_size() with the input_rel,
not the ordered_rel/final_rel, along with PgFdwPathExtraData.

And maybe related problem: why should FDW care about the effect of
apply_scanjoin_target_to_paths() like you claim to do here?

/*
* If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the
* final scan/join relation, the costs obtained from the cache
* wouldn't yet contain the eval cost for the final scan/join target,
* which would have been updated by apply_scanjoin_target_to_paths();
* add the eval cost now.
*/
if (fpextra&& !IS_UPPER_REL(foreignrel))
{
/* The costs should have been obtained from the cache. */
Assert(fpinfo->rel_startup_cost>= 0&&
fpinfo->rel_total_cost>= 0);

startup_cost += foreignrel->reltarget->cost.startup;
run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}

I think it should not, whether "foreignrel" means "input_rel" or "output_rel"
from the perspective of postgresGetForeignUpperPaths().

I added this before costing the sort operation below, because 1) the
cost of evaluating the scan/join target might not be zero (consider the
case where sort columns are not simple Vars, for example) and 2) the
cost of sorting takes into account the underlying path's startup/total
costs. Maybe I'm missing something, though.

Thanks again!

Best regards,
Etsuro Fujita

#13Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#12)
Re: Problems with plan estimates in postgres_fdw

(2019/02/08 21:35), Etsuro Fujita wrote:

(2019/02/08 2:04), Antonin Houska wrote:

* add_foreign_ordered_paths()

Exprssion root->upper_targets[UPPERREL_FINAL] is used to access the
target.

I think create_ordered_paths() should actually set the target so that
postgresGetForeignUpperPaths() can simply find it in
output_rel->reltarget. This is how postgres_fdw already retrieves the
target
for grouped paths. Small patch is attached that makes this possible.

Seems like a good idea. Thanks for the patch! Will review.

Here are my review comments:

root->upper_targets[UPPERREL_FINAL] = final_target;
+ root->upper_targets[UPPERREL_ORDERED] = final_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I think that's a good idea. I think we'll soon need the PathTarget for
UPPERREL_DISTINCT, so how about saving that as well like this?

                 root->upper_targets[UPPERREL_FINAL] = final_target;
+               root->upper_targets[UPPERREL_ORDERED] = final_target;
+               root->upper_targets[UPPERREL_DISTINCT] = sort_input_target;
                 root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
                 root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

Another is about this:

  	/*
+	 * Set reltarget so that it's consistent with the paths. Also it's more
+	 * convenient for FDW to find the target here.
+	 */
+	ordered_rel->reltarget = target;

I think this is correct if there are no SRFs in the targetlist, but
otherwise I'm not sure this is really correct because the relation's
reltarget would not match the target of paths added to the relation
after adjust_paths_for_srfs(). Having said that, my question here is:
do we really need this (and the same for UPPERREL_FINAL you added)? For
the purpose of the FDW convenience, the upper_targets[] change seems to
me to be enough.

Best regards,
Etsuro Fujita

#14Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#12)
Re: Problems with plan estimates in postgres_fdw

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

(2019/02/08 2:04), Antonin Houska wrote:

First, I wonder if the information on LIMIT needs to be passed to the
FDW. Cannot the FDW routines use root->tuple_fraction?

I think it's OK for the FDW to use the tuple_fraction, but I'm not sure the
tuple_fraction should be enough. For example, I don't think we can re-compute
from the tuple_fraction the LIMIT/OFFSET values.

I think (although have
not verified experimentally) that the problem reported in [1] is not limited
to the LIMIT clause. I think the same problem can happen if client uses cursor
and sets cursor_tuple_fraction very low. Since the remote node is not aware of
this setting when running EXPLAIN, the low fraction can also make postgres_fdw
think that retrieval of the whole set is cheaper than it will appear to be
during the actual execution.

I think it would be better to get a fast-start plan on the remote side in such
a situation, but I'm not sure we can do that as well with this patch, because
in the cursor case, the FDW couldn't know in advance exactly how may rows will
be fetched from the remote side using that cursor, so the FDW couldn't push
down LIMIT. So I'd like to leave that for another patch.

root->tuple_fraction (possibly derived from cursor_tuple_fraction) should be
enough to decide whether fast-startup plan should be considered. I just failed
to realize that your patch primarily aims at the support of UPPERREL_ORDERED
and UPPERREL_FINAL relations by postgres_fdw. Yes, another patch would be
needed to completely resolve the problem reported by Andrew Gierth upthread.

Some comments on coding:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch
-----------------------------------------------------------------

* I'm not sure if UPPERREL_ORDERD should to deal with LIMIT at all. Note that
grouping_planner() does not call create_limit_path() until it's done with
create_ordered_paths(), and create_ordered_paths() is where the FDW is
requested to add its paths to UPPERREL_ORDERED relation.

Maybe I'm missing something, but the 0001 patch doesn't deal with LIMIT at
all. I added the parameter limit_tuples to PgFdwPathExtraData so that we can
pass limit_tuples=-1, which means no LIMIT, to cost_sort(), though.

Yes, the limit_tuples field made me confused. But if cost_sort() is the only
reason, why not simply pass -1 in the 0001 patch, and use the limit_tuples
argument only in 0002? I see several other calls of cost_sort() in the core
where -1 is hard-coded.

Both patches:
------------

One thing that makes me confused when I try to understand details is that the
new functions add_foreign_ordered_paths() and add_foreign_final_paths() both
pass "input_rel" for the "foreignrel" argument of estimate_path_cost_size():

estimate_path_cost_size(root, input_rel, ...)

whereas the existing add_foreign_grouping_paths() passes "grouped_rel":

estimate_path_cost_size(root, grouped_rel, ...)

Can you please make this consistent? If you do, you'll probably need to
reconsider your changes to estimate_path_cost_size().

This is intended and I think I should add more comments on that. Let me
explain. These patches modified estimate_path_cost_size() so that we can
estimate the costs of a foreign path for the UPPERREL_ORDERED or
UPPERREL_FINAL rel, by adding the costs of the upper-relation processing (ie,
ORDER BY and/or LIMIT/OFFSET, specified by PgFdwPathExtraData) to the costs of
implementing the *underlying* relation for the upper relation (ie, scan, join
or grouping rel, specified by the input_rel). So those functions call
estimate_path_cost_size() with the input_rel, not the ordered_rel/final_rel,
along with PgFdwPathExtraData.

I think the same already happens for the UPPERREL_GROUP_AGG relation:
estimate_path_cost_size()

...
else if (IS_UPPER_REL(foreignrel))
{
...
ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;

Here "outerrel" actually means input relation from the grouping perspective.
The input relation (i.e. result of scan / join) estimates are retrieved from
"ofpinfo" and the costs of aggregation are added. Why should this be different
for other kinds of upper relations?

And maybe related problem: why should FDW care about the effect of
apply_scanjoin_target_to_paths() like you claim to do here?

/*
* If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the
* final scan/join relation, the costs obtained from the cache
* wouldn't yet contain the eval cost for the final scan/join target,
* which would have been updated by apply_scanjoin_target_to_paths();
* add the eval cost now.
*/
if (fpextra&& !IS_UPPER_REL(foreignrel))
{
/* The costs should have been obtained from the cache. */
Assert(fpinfo->rel_startup_cost>= 0&&
fpinfo->rel_total_cost>= 0);

startup_cost += foreignrel->reltarget->cost.startup;
run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}

I think it should not, whether "foreignrel" means "input_rel" or "output_rel"
from the perspective of postgresGetForeignUpperPaths().

I added this before costing the sort operation below, because 1) the cost of
evaluating the scan/join target might not be zero (consider the case where
sort columns are not simple Vars, for example) and 2) the cost of sorting
takes into account the underlying path's startup/total costs. Maybe I'm
missing something, though.

My understanding is that the "input_rel" argument of
postgresGetForeignUpperPaths() should already have been processed by
postgresGetForeignPaths() or postgresGetForeignJoinPaths() (or actually by
postgresGetForeignUpperPaths() called with different "stage" too). Therefore
it should already have the "fdw_private" field initialized. The estimates in
the corresponding PgFdwRelationInfo should already include the reltarget
costs, as set previously by estimate_path_cost_size().

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

#15Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#13)
Re: Problems with plan estimates in postgres_fdw

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

(2019/02/08 21:35), Etsuro Fujita wrote:

(2019/02/08 2:04), Antonin Houska wrote:

* add_foreign_ordered_paths()

Exprssion root->upper_targets[UPPERREL_FINAL] is used to access the
target.

I think create_ordered_paths() should actually set the target so that
postgresGetForeignUpperPaths() can simply find it in
output_rel->reltarget. This is how postgres_fdw already retrieves the
target
for grouped paths. Small patch is attached that makes this possible.

Seems like a good idea. Thanks for the patch! Will review.

Here are my review comments:

root->upper_targets[UPPERREL_FINAL] = final_target;
+ root->upper_targets[UPPERREL_ORDERED] = final_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I think that's a good idea. I think we'll soon need the PathTarget for
UPPERREL_DISTINCT, so how about saving that as well like this?

root->upper_targets[UPPERREL_FINAL] = final_target;
+               root->upper_targets[UPPERREL_ORDERED] = final_target;
+               root->upper_targets[UPPERREL_DISTINCT] = sort_input_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I see nothing wrong about this.

Another is about this:

/*
+	 * Set reltarget so that it's consistent with the paths. Also it's more
+	 * convenient for FDW to find the target here.
+	 */
+	ordered_rel->reltarget = target;

I think this is correct if there are no SRFs in the targetlist, but otherwise
I'm not sure this is really correct because the relation's reltarget would not
match the target of paths added to the relation after adjust_paths_for_srfs().

I wouldn't expect problem here. create_grouping_paths() also sets the
reltarget although adjust_paths_for_srfs() can be alled on grouped_rel
afterwards.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

#16Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#15)
Re: Problems with plan estimates in postgres_fdw

(2019/02/12 20:43), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

Here are my review comments:

root->upper_targets[UPPERREL_FINAL] = final_target;
+ root->upper_targets[UPPERREL_ORDERED] = final_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I think that's a good idea. I think we'll soon need the PathTarget for
UPPERREL_DISTINCT, so how about saving that as well like this?

root->upper_targets[UPPERREL_FINAL] = final_target;
+               root->upper_targets[UPPERREL_ORDERED] = final_target;
+               root->upper_targets[UPPERREL_DISTINCT] = sort_input_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I see nothing wrong about this.

Cool!

Another is about this:

/*
+	 * Set reltarget so that it's consistent with the paths. Also it's more
+	 * convenient for FDW to find the target here.
+	 */
+	ordered_rel->reltarget = target;

I think this is correct if there are no SRFs in the targetlist, but otherwise
I'm not sure this is really correct because the relation's reltarget would not
match the target of paths added to the relation after adjust_paths_for_srfs().

I wouldn't expect problem here. create_grouping_paths() also sets the
reltarget although adjust_paths_for_srfs() can be alled on grouped_rel
afterwards.

Yeah, but as I said in a previous email, I think the root->upper_targets
change would be enough at least currently for FDWs to consider
performing post-UPPERREL_GROUP_AGG steps remotely, as shown in my
patches, so I'm inclined to leave the retarget change for future work.

Best regards,
Etsuro Fujita

#17Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#14)
Re: Problems with plan estimates in postgres_fdw

(2019/02/12 18:03), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

(2019/02/08 2:04), Antonin Houska wrote:

Some comments on coding:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch
-----------------------------------------------------------------

* I'm not sure if UPPERREL_ORDERD should to deal with LIMIT at all. Note that
grouping_planner() does not call create_limit_path() until it's done with
create_ordered_paths(), and create_ordered_paths() is where the FDW is
requested to add its paths to UPPERREL_ORDERED relation.

Maybe I'm missing something, but the 0001 patch doesn't deal with LIMIT at
all. I added the parameter limit_tuples to PgFdwPathExtraData so that we can
pass limit_tuples=-1, which means no LIMIT, to cost_sort(), though.

Yes, the limit_tuples field made me confused. But if cost_sort() is the only
reason, why not simply pass -1 in the 0001 patch, and use the limit_tuples
argument only in 0002? I see several other calls of cost_sort() in the core
where -1 is hard-coded.

Yeah, that would make it easy to review the patch; will do.

Both patches:
------------

One thing that makes me confused when I try to understand details is that the
new functions add_foreign_ordered_paths() and add_foreign_final_paths() both
pass "input_rel" for the "foreignrel" argument of estimate_path_cost_size():

estimate_path_cost_size(root, input_rel, ...)

whereas the existing add_foreign_grouping_paths() passes "grouped_rel":

estimate_path_cost_size(root, grouped_rel, ...)

Can you please make this consistent? If you do, you'll probably need to
reconsider your changes to estimate_path_cost_size().

This is intended and I think I should add more comments on that. Let me
explain. These patches modified estimate_path_cost_size() so that we can
estimate the costs of a foreign path for the UPPERREL_ORDERED or
UPPERREL_FINAL rel, by adding the costs of the upper-relation processing (ie,
ORDER BY and/or LIMIT/OFFSET, specified by PgFdwPathExtraData) to the costs of
implementing the *underlying* relation for the upper relation (ie, scan, join
or grouping rel, specified by the input_rel). So those functions call
estimate_path_cost_size() with the input_rel, not the ordered_rel/final_rel,
along with PgFdwPathExtraData.

I think the same already happens for the UPPERREL_GROUP_AGG relation:
estimate_path_cost_size()

...
else if (IS_UPPER_REL(foreignrel))
{
...
ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;

Here "outerrel" actually means input relation from the grouping perspective.
The input relation (i.e. result of scan / join) estimates are retrieved from
"ofpinfo" and the costs of aggregation are added. Why should this be different
for other kinds of upper relations?

IIUC, I think there is an oversight in that case: the cost estimation
doesn't account for evaluating the final scan/join target updated by
apply_scanjoin_target_to_paths(), but I think that is another issue, so
I'll start a new thread.

And maybe related problem: why should FDW care about the effect of
apply_scanjoin_target_to_paths() like you claim to do here?

/*
* If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the
* final scan/join relation, the costs obtained from the cache
* wouldn't yet contain the eval cost for the final scan/join target,
* which would have been updated by apply_scanjoin_target_to_paths();
* add the eval cost now.
*/
if (fpextra&& !IS_UPPER_REL(foreignrel))
{
/* The costs should have been obtained from the cache. */
Assert(fpinfo->rel_startup_cost>= 0&&
fpinfo->rel_total_cost>= 0);

startup_cost += foreignrel->reltarget->cost.startup;
run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}

I think it should not, whether "foreignrel" means "input_rel" or "output_rel"
from the perspective of postgresGetForeignUpperPaths().

I added this before costing the sort operation below, because 1) the cost of
evaluating the scan/join target might not be zero (consider the case where
sort columns are not simple Vars, for example) and 2) the cost of sorting
takes into account the underlying path's startup/total costs. Maybe I'm
missing something, though.

My understanding is that the "input_rel" argument of
postgresGetForeignUpperPaths() should already have been processed by
postgresGetForeignPaths() or postgresGetForeignJoinPaths() (or actually by
postgresGetForeignUpperPaths() called with different "stage" too). Therefore
it should already have the "fdw_private" field initialized. The estimates in
the corresponding PgFdwRelationInfo should already include the reltarget
costs, as set previously by estimate_path_cost_size().

Right, but what I'm saying here is: the costs of evaluating the final
scan/join target updated by apply_scanjoin_target_to_paths() wouldn't be
yet included in the costs we calculated using local statistics when
called from postgresGetForeignPaths() or postgresGetForeignJoinPaths().
Let me explain using an example:

SELECT a+b FROM foreign_table LIMIT 10;

For this query, the reltarget for the foreign_table would be {a, b} when
called from postgresGetForeignPaths() (see build_base_rel_tlists()).
The costs of evaluating simple Vars are zeroes, so we wouldn't charge
any costs for tlist evaluation when estimating the costs of a basic
foreign table scan in that callback routine, but the final scan target,
with which the reltarget will be replaced later by
apply_scanjoin_target_to_paths(), would be {a+b}, so we need to adjust
the costs of the basic foreign table scan, cached in the
PgFdwRelationInfo for the input_rel (ie, the foreign_table), so that
eval costs for the replaced tlist are included when called from
postgresGetForeignUpperPaths() with the UPPERREL_FINAL stage, as the
costs of evaluating the expression 'a+b' wouldn't be zeroes.

Best regards,
Etsuro Fujita

#18Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#16)
Re: Problems with plan estimates in postgres_fdw

(2019/02/14 18:44), Etsuro Fujita wrote:

(2019/02/12 20:43), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

Here are my review comments:

root->upper_targets[UPPERREL_FINAL] = final_target;
+ root->upper_targets[UPPERREL_ORDERED] = final_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I think that's a good idea. I think we'll soon need the PathTarget for
UPPERREL_DISTINCT, so how about saving that as well like this?

root->upper_targets[UPPERREL_FINAL] = final_target;
+ root->upper_targets[UPPERREL_ORDERED] = final_target;
+ root->upper_targets[UPPERREL_DISTINCT] = sort_input_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;

I see nothing wrong about this.

Cool!

Another is about this:

/*
+ * Set reltarget so that it's consistent with the paths. Also it's more
+ * convenient for FDW to find the target here.
+ */
+ ordered_rel->reltarget = target;

I think this is correct if there are no SRFs in the targetlist, but
otherwise
I'm not sure this is really correct because the relation's reltarget
would not
match the target of paths added to the relation after
adjust_paths_for_srfs().

I wouldn't expect problem here. create_grouping_paths() also sets the
reltarget although adjust_paths_for_srfs() can be alled on grouped_rel
afterwards.

Yeah, but as I said in a previous email, I think the root->upper_targets
change would be enough at least currently for FDWs to consider
performing post-UPPERREL_GROUP_AGG steps remotely, as shown in my
patches, so I'm inclined to leave the retarget change for future work.

So, here is an updated patch. If there are no objections from you or
anyone else, I'll commit the patch as a preliminary one for what's
proposed in this thread.

Best regards,
Etsuro Fujita

Attachments:

0001-Save-PathTargets-for-distinct-ordered-relations-in-r.patchtext/x-patch; name=0001-Save-PathTargets-for-distinct-ordered-relations-in-r.patchDownload+2-1
#19Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#17)
Re: Problems with plan estimates in postgres_fdw

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

(2019/02/12 18:03), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

(2019/02/08 2:04), Antonin Houska wrote:

And maybe related problem: why should FDW care about the effect of
apply_scanjoin_target_to_paths() like you claim to do here?

/*
* If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the
* final scan/join relation, the costs obtained from the cache
* wouldn't yet contain the eval cost for the final scan/join target,
* which would have been updated by apply_scanjoin_target_to_paths();
* add the eval cost now.
*/
if (fpextra&& !IS_UPPER_REL(foreignrel))
{
/* The costs should have been obtained from the cache. */
Assert(fpinfo->rel_startup_cost>= 0&&
fpinfo->rel_total_cost>= 0);

startup_cost += foreignrel->reltarget->cost.startup;
run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}

I think it should not, whether "foreignrel" means "input_rel" or "output_rel"
from the perspective of postgresGetForeignUpperPaths().

I added this before costing the sort operation below, because 1) the cost of
evaluating the scan/join target might not be zero (consider the case where
sort columns are not simple Vars, for example) and 2) the cost of sorting
takes into account the underlying path's startup/total costs. Maybe I'm
missing something, though.

My understanding is that the "input_rel" argument of
postgresGetForeignUpperPaths() should already have been processed by
postgresGetForeignPaths() or postgresGetForeignJoinPaths() (or actually by
postgresGetForeignUpperPaths() called with different "stage" too). Therefore
it should already have the "fdw_private" field initialized. The estimates in
the corresponding PgFdwRelationInfo should already include the reltarget
costs, as set previously by estimate_path_cost_size().

Right, but what I'm saying here is: the costs of evaluating the final
scan/join target updated by apply_scanjoin_target_to_paths() wouldn't be yet
included in the costs we calculated using local statistics when called from
postgresGetForeignPaths() or postgresGetForeignJoinPaths(). Let me explain
using an example:

SELECT a+b FROM foreign_table LIMIT 10;

For this query, the reltarget for the foreign_table would be {a, b} when
called from postgresGetForeignPaths() (see build_base_rel_tlists()). The costs
of evaluating simple Vars are zeroes, so we wouldn't charge any costs for
tlist evaluation when estimating the costs of a basic foreign table scan in
that callback routine, but the final scan target, with which the reltarget
will be replaced later by apply_scanjoin_target_to_paths(), would be {a+b}, so
we need to adjust the costs of the basic foreign table scan, cached in the
PgFdwRelationInfo for the input_rel (ie, the foreign_table), so that eval
costs for the replaced tlist are included when called from
postgresGetForeignUpperPaths() with the UPPERREL_FINAL stage, as the costs of
evaluating the expression 'a+b' wouldn't be zeroes.

ok, I understand now. I assume that the patch

/messages/by-id/5C66A056.60007@lab.ntt.co.jp

obsoletes the code snippet we discussed above.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

#20Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#18)
Re: Problems with plan estimates in postgres_fdw

(2019/02/15 21:19), Etsuro Fujita wrote:

So, here is an updated patch. If there are no objections from you or
anyone else, I'll commit the patch as a preliminary one for what's
proposed in this thread.

Pushed, after fiddling with the commit message a bit.

Best regards,
Etsuro Fujita

#21Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#19)
#22Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#21)
#23Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#22)
#24Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#12)
#25Jeff Janes
jeff.janes@gmail.com
In reply to: Etsuro Fujita (#10)
#26Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Jeff Janes (#25)
#27Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#23)
#28Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#24)
#29Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#28)
#30Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#27)
#31Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#30)
#32Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#29)
#33Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#31)
#34Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#33)
#35Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#33)
#36Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#34)
#37Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#35)
#38Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#32)
#39Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#29)
#40Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#26)
#41Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#39)
#42Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#38)
#43Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#42)
#44Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#41)
#45Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#44)
#46Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#45)
#47Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#46)
#48Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#47)