Cost overestimation of foreign JOIN

Started by Andrey Lepikhov · about 5 years ago · 5 messages
#1 Andrey Lepikhov
a.lepikhov@postgrespro.ru
1 attachment(s)

Hi,

While testing the Asynchronous Append feature with TPC-H queries, I found
that the JOIN push-down technique is rarely used.
On my servers, fdw_tuple_cost = 0.2 and fdw_startup_cost = 100.
Exploring the code, I found the following in postgres_fdw's
estimate_path_cost_size(), lines 2908-2909:
run_cost += nrows * join_cost.per_tuple;
nrows = clamp_row_est(nrows * fpinfo->joinclause_sel);

Above:
nrows = fpinfo_i->rows * fpinfo_o->rows;

Maybe lines 2908 and 2909 should be swapped (see attachment)?

In my case of two big partitioned tables and a small join result, this
strongly influenced the choice of the JOIN push-down strategy.

--
regards,
Andrey Lepikhov
Postgres Professional

Attachments:

estimation.diff (text/plain)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index b6c72e1d1e..3047300c4b 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2905,8 +2905,8 @@ estimate_path_cost_size(PlannerInfo *root,
 			 */
 			run_cost = fpinfo_i->rel_total_cost - fpinfo_i->rel_startup_cost;
 			run_cost += fpinfo_o->rel_total_cost - fpinfo_o->rel_startup_cost;
-			run_cost += nrows * join_cost.per_tuple;
 			nrows = clamp_row_est(nrows * fpinfo->joinclause_sel);
+			run_cost += nrows * join_cost.per_tuple;
 			run_cost += nrows * remote_conds_cost.per_tuple;
 			run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
 
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Lepikhov (#1)
Re: Cost overestimation of foreign JOIN

Andrey Lepikhov <a.lepikhov@postgrespro.ru> writes:

Maybe lines 2908 and 2909 should be swapped (see attachment)?

No; as explained in the comment immediately above here, we're assuming
that the join conditions will be applied on the cross product of the
input relations.

Now admittedly, that's a worst-case assumption, since it amounts to
expecting that the remote server will do the join in the dumbest
possible nested-loop way. If the remote can use a merge or hash
join, for example, the cost is likely to be a lot less. But it is
not the job of this code path to outguess the remote planner. It's
certainly not appropriate to invent an unprincipled cost estimate
as a substitute for trying to guess that.

If you're unhappy with the planning results you get for this,
why don't you have use_remote_estimate turned on?

regards, tom lane

#3 Andrey Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tom Lane (#2)
Re: Cost overestimation of foreign JOIN

On 30.11.2020 22:38, Tom Lane wrote:

Andrey Lepikhov <a.lepikhov@postgrespro.ru> writes:

Maybe lines 2908 and 2909 should be swapped (see attachment)?

No; as explained in the comment immediately above here, we're assuming
that the join conditions will be applied on the cross product of the
input relations.

Thank you. Now it is clear to me.

Now admittedly, that's a worst-case assumption, since it amounts to
expecting that the remote server will do the join in the dumbest
possible nested-loop way. If the remote can use a merge or hash
join, for example, the cost is likely to be a lot less.

My goal is scaling Postgres across a set of identical servers running the
same Postgres instances. If one server uses a hash-join node for the join,
I think it is most likely that the other servers will also use a hash-join
node for it (maybe you remember, I also use the statistics-copying
technique to provide up-to-date statistics on partitions). Tests show good
results with this approach. But maybe this is my special case.

But it is
not the job of this code path to outguess the remote planner. It's
certainly not appropriate to invent an unprincipled cost estimate
as a substitute for trying to guess that.

Agreed.

If you're unhappy with the planning results you get for this,
why don't you have use_remote_estimate turned on?

I have a mixed workload. Large queries can tolerate the additional
estimate queries. But for the many simple SELECTs that touch a small
portion of the data, the latency has increased significantly. And I don't
know how to switch the use_remote_estimate setting in such a case.

--
regards,
Andrey Lepikhov
Postgres Professional

#4 Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Andrey Lepikhov (#3)
Re: Cost overestimation of foreign JOIN

On Mon, Nov 30, 2020 at 11:56 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:

On 30.11.2020 22:38, Tom Lane wrote:

Andrey Lepikhov <a.lepikhov@postgrespro.ru> writes:

Maybe lines 2908 and 2909 should be swapped (see attachment)?

No; as explained in the comment immediately above here, we're assuming
that the join conditions will be applied on the cross product of the
input relations.

Thank you. Now it is clear to me.

Now admittedly, that's a worst-case assumption, since it amounts to
expecting that the remote server will do the join in the dumbest
possible nested-loop way. If the remote can use a merge or hash
join, for example, the cost is likely to be a lot less.

My goal is scaling Postgres across a set of identical servers running the
same Postgres instances. If one server uses a hash-join node for the join,
I think it is most likely that the other servers will also use a hash-join
node for it (maybe you remember, I also use the statistics-copying
technique to provide up-to-date statistics on partitions). Tests show good
results with this approach. But maybe this is my special case.

But it is
not the job of this code path to outguess the remote planner. It's
certainly not appropriate to invent an unprincipled cost estimate
as a substitute for trying to guess that.

Agreed.

If you're unhappy with the planning results you get for this,
why don't you have use_remote_estimate turned on?

I have a mixed workload. Large queries can tolerate the additional
estimate queries. But for the many simple SELECTs that touch a small
portion of the data, the latency has increased significantly. And I don't
know how to switch the use_remote_estimate setting in such a case.

You may disable use_remote_estimate for a given table or server. So if
the tables participating in the short queries are different from those in
the large queries, you could set use_remote_estimate at the table level
to turn it off for the first set. Otherwise, we need an FDW-level GUC
which can be turned on/off for a given session or query.
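For example (the server and table names below are hypothetical), the
option can be set at either level:

```sql
-- Enable remote estimates for the whole foreign server ...
ALTER SERVER pgfdw_node OPTIONS (ADD use_remote_estimate 'true');

-- ... but turn them off for a table used by the short queries.
ALTER FOREIGN TABLE short_query_tab OPTIONS (ADD use_remote_estimate 'false');
```

A table-level setting overrides the server-level one, so remote estimates
stay enabled for everything else.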

Generally, use_remote_estimate isn't scalable, and there have been
discussions about eliminating the need for it. But no concrete proposal
has come up yet.

--
Best Wishes,
Ashutosh Bapat

#5 Andrey V. Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Ashutosh Bapat (#4)
Re: Cost overestimation of foreign JOIN

On 12/1/20 6:17 PM, Ashutosh Bapat wrote:

On Mon, Nov 30, 2020 at 11:56 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:

On 30.11.2020 22:38, Tom Lane wrote:

Andrey Lepikhov <a.lepikhov@postgrespro.ru> writes:
If you're unhappy with the planning results you get for this,
why don't you have use_remote_estimate turned on?

I have a mixed workload. Large queries can tolerate the additional
estimate queries. But for the many simple SELECTs that touch a small
portion of the data, the latency has increased significantly. And I don't
know how to switch the use_remote_estimate setting in such a case.

You may disable use_remote_estimate for a given table or server. So if
the tables participating in the short queries are different from those in
the large queries, you could set use_remote_estimate at the table level
to turn it off for the first set. Otherwise, we need an FDW-level GUC
which can be turned on/off for a given session or query.

Currently I have implemented another technique:
- By default, use_remote_estimate is off.
- In estimate_path_cost_size(), an estimation criterion is checked; if it
holds, we force remote estimation for this JOIN.
This approach solves the push-down problem in my case (a TPC-H test with
6 servers/instances), but it is not as scalable as I want.

Generally, use_remote_estimate isn't scalable, and there have been
discussions about eliminating the need for it. But no concrete proposal
has come up yet.

Above, I suggested using the results of the cost calculation for the
local JOIN, assuming that with the postgres_fdw wrapper the foreign
server will very likely use the same type of join (or an even better one,
if it has some index, for example).
If this approach is of interest, I can investigate it.

--
regards,
Andrey Lepikhov
Postgres Professional