Cost estimation in foreign data wrappers
Hello,
FDWs have a callback function that must also set startup and total cost
estimates for each path it adds. Assume an FDW adds only one path
(as file_fdw does). I am trying to understand what can go wrong if we do a
bad job of estimating these costs.
Since we have only one scan path here, the estimates make no difference in
choosing the best scan path.
From reading the code and running some experiments, I think bad estimates
can matter in two ways: (1) the cost of a nested loop is underestimated,
and (2) the inner table of a nested loop is not materialized.
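To make the two effects concrete, here is a minimal sketch of a toy cost
model. It is not the planner's actual formula: it ignores startup cost, and
the function names, the parameters, and the per-tuple rescan charge are all
assumptions made up for illustration.

```c
#include <assert.h>

/*
 * Toy model (an assumption, not PostgreSQL's real costing): the inner
 * scan is repeated from scratch once for every outer row.
 */
double toy_nestloop_cost(double outer_cost, double outer_rows,
                         double inner_scan_cost)
{
    assert(outer_rows >= 0.0);
    return outer_cost + outer_rows * inner_scan_cost;
}

/*
 * Toy model with a materialized inner side: the inner scan runs once,
 * and every later pass re-reads the stored tuples at a small
 * hypothetical per-tuple charge.
 */
double toy_nestloop_materialized_cost(double outer_cost, double outer_rows,
                                      double inner_scan_cost,
                                      double inner_rows,
                                      double rescan_cost_per_tuple)
{
    double rescans;

    assert(outer_rows >= 0.0 && inner_rows >= 0.0);
    rescans = outer_rows > 1.0 ? outer_rows - 1.0 : 0.0;
    return outer_cost + inner_scan_cost
        + rescans * inner_rows * rescan_cost_per_tuple;
}
```

With invented numbers (outer cost 10, 100 outer rows, 1000 inner rows, a
rescan charge of 0.0025 per tuple), an inner scan cost of 50 gives a plain
nested loop cost of 5010, so materializing looks much cheaper. If the FDW
reports 0.5 instead, the plain nested loop appears to cost only 60, which
is below the materialized variant, so in this toy model the join is both
underestimated and left unmaterialized.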
* Are there any other cases in which this can be significant?
* Assume we are not sure about the exact cost, but we know that it is in
the range [lower_bound, upper_bound], where upper_bound can be 10x
lower_bound. Which value is better to choose then: the lower bound, the
upper bound, or the average?
Thanks,
-- Hadi
Hadi Moshayedi <hadi@moshayedi.net> writes:
> FDWs have a callback function that must also set startup and total cost
> estimates for each path it adds. Assume an FDW adds only one path
> (as file_fdw does). I am trying to understand what can go wrong if we do a
> bad job of estimating these costs.
> Since we have only one scan path here, the estimates make no difference in
> choosing the best scan path.
Right. But if there's more than one table in the query, it might make a
difference in terms of what join plan gets chosen. I'd say that getting
an accurate rowcount estimate is usually far more important, though.
regards, tom lane
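
One way to see why the rowcount can matter more than the scan cost is a toy
calculation in which the foreign table is the outer side of a nested loop.
This is only a sketch under that assumption: the function name, its
parameters, and the numbers in the note below are hypothetical, and the
real planner's costing is more detailed.

```c
#include <assert.h>

/*
 * Toy join cost (an assumption, not the planner's formula): scan the
 * foreign table once, then probe the inner side once per foreign row.
 */
double toy_join_cost(double foreign_scan_cost, double foreign_rows,
                     double inner_cost_per_probe)
{
    assert(foreign_rows >= 0.0);
    return foreign_scan_cost + foreign_rows * inner_cost_per_probe;
}
```

With a scan cost of 100, 1000 rows, and 5 per probe, the toy cost is 5100.
A 10x error in the scan cost moves it to 6000, because that term is only
added once. A 10x error in the row estimate moves it to 50100, because the
row count multiplies the work done on the other side of the join.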