postgres_fdw: another oddity in costing aggregate pushdown paths

Started by Etsuro Fujitaabout 7 years ago5 messageshackers
Jump to latest
#1Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp

As mentioned in the near thread, I think there is another oversight in
the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC.
When costing an aggregate pushdown path using local statistics, we
re-use the estimated costs of implementing the underlying scan/join
relation, cached in the relation's PgFdwRelationInfo (ie,
rel_startup_cost and rel_total_cost). Since these costs wouldn't yet
contain the costs of evaluating the final scan/join target, as tlist
replacement by apply_scanjoin_target_to_paths() is performed afterwards.
So I think we need to adjust these costs so that the tlist eval costs
are included, but ISTM that estimate_path_cost_size() forgot to do so.
Attached is a patch for fixing this issue.

Best regards,
Etsuro Fujita

Attachments:

another-fix-postgres-fdw-grouping-path-cost.patchtext/x-patch; name=another-fix-postgres-fdw-grouping-path-cost.patchDownload+13-8
#2Antonin Houska
ah@cybertec.at
In reply to: Etsuro Fujita (#1)
Re: postgres_fdw: another oddity in costing aggregate pushdown paths

Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

As mentioned in the near thread, I think there is another oversight in
the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC.
When costing an aggregate pushdown path using local statistics, we
re-use the estimated costs of implementing the underlying scan/join
relation, cached in the relation's PgFdwRelationInfo (ie,
rel_startup_cost and rel_total_cost). Since these costs wouldn't yet
contain the costs of evaluating the final scan/join target, as tlist
replacement by apply_scanjoin_target_to_paths() is performed afterwards.
So I think we need to adjust these costs so that the tlist eval costs
are included, but ISTM that estimate_path_cost_size() forgot to do so.
Attached is a patch for fixing this issue.

I think the following comment in apply_scanjoin_target_to_paths() should
mention that FDWs rely on the new value of reltarget.

/*
* Update the reltarget. This may not be strictly necessary in all cases,
* but it is at least necessary when create_append_path() gets called
* below directly or indirectly, since that function uses the reltarget as
* the pathtarget for the resulting path. It seems like a good idea to do
* it unconditionally.
*/
rel->reltarget = llast_node(PathTarget, scanjoin_targets);

--
Antonin Houska
https://www.cybertec-postgresql.com

#3Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Antonin Houska (#2)
Re: postgres_fdw: another oddity in costing aggregate pushdown paths

(2019/02/22 23:10), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

As mentioned in the near thread, I think there is another oversight in
the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC.
When costing an aggregate pushdown path using local statistics, we
re-use the estimated costs of implementing the underlying scan/join
relation, cached in the relation's PgFdwRelationInfo (ie,
rel_startup_cost and rel_total_cost). Since these costs wouldn't yet
contain the costs of evaluating the final scan/join target, as tlist
replacement by apply_scanjoin_target_to_paths() is performed afterwards.
So I think we need to adjust these costs so that the tlist eval costs
are included, but ISTM that estimate_path_cost_size() forgot to do so.
Attached is a patch for fixing this issue.

I think the following comment in apply_scanjoin_target_to_paths() should
mention that FDWs rely on the new value of reltarget.

/*
* Update the reltarget. This may not be strictly necessary in all cases,
* but it is at least necessary when create_append_path() gets called
* below directly or indirectly, since that function uses the reltarget as
* the pathtarget for the resulting path. It seems like a good idea to do
* it unconditionally.
*/
rel->reltarget = llast_node(PathTarget, scanjoin_targets);

Agreed. How about mentioning that like the attached? In addition, I
added another assertion to estimate_path_cost_size() in that patch.

Thanks for the review!

Best regards,
Etsuro Fujita

Attachments:

another-fix-postgres-fdw-grouping-path-cost-v2.patchtext/x-patch; name=another-fix-postgres-fdw-grouping-path-cost-v2.patchDownload+24-16
#4Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#3)
Re: postgres_fdw: another oddity in costing aggregate pushdown paths

(2019/02/25 19:59), Etsuro Fujita wrote:

(2019/02/22 23:10), Antonin Houska wrote:

Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote:

As mentioned in the near thread, I think there is another oversight in
the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC.
When costing an aggregate pushdown path using local statistics, we
re-use the estimated costs of implementing the underlying scan/join
relation, cached in the relation's PgFdwRelationInfo (ie,
rel_startup_cost and rel_total_cost). Since these costs wouldn't yet
contain the costs of evaluating the final scan/join target, as tlist
replacement by apply_scanjoin_target_to_paths() is performed afterwards.
So I think we need to adjust these costs so that the tlist eval costs
are included, but ISTM that estimate_path_cost_size() forgot to do so.
Attached is a patch for fixing this issue.

I think the following comment in apply_scanjoin_target_to_paths() should
mention that FDWs rely on the new value of reltarget.

/*
* Update the reltarget. This may not be strictly necessary in all cases,
* but it is at least necessary when create_append_path() gets called
* below directly or indirectly, since that function uses the reltarget as
* the pathtarget for the resulting path. It seems like a good idea to do
* it unconditionally.
*/
rel->reltarget = llast_node(PathTarget, scanjoin_targets);

Agreed. How about mentioning that like the attached? In addition, I
added another assertion to estimate_path_cost_size() in that patch.

This doesn't get applied cleanly after commit 1d33858406. Here is a
rebased version of the patch. I also modified the comments a little
bit. If there are no objections from Antonin or anyone else, I'll
commit the patch.

Best regards,
Etsuro Fujita

Attachments:

another-fix-postgres-fdw-grouping-path-cost-v3.patchtext/x-patch; name=another-fix-postgres-fdw-grouping-path-cost-v3.patchDownload+20-8
#5Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#4)
Re: postgres_fdw: another oddity in costing aggregate pushdown paths

On Wed, May 8, 2019 at 12:45 PM Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:

This doesn't get applied cleanly after commit 1d33858406. Here is a
rebased version of the patch. I also modified the comments a little
bit. If there are no objections from Antonin or anyone else, I'll
commit the patch.

Pushed. Thanks for reviewing, Antonin!

Best regards,
Etsuro Fujita