Problems with plan estimates in postgres_fdw

Started by Andrew Gierthalmost 8 years ago48 messageshackers

andrew@tao11.riddles.org.uk

almost 8 years ago

This analysis comes from investigating a report from an IRC user. A
summary of the initial report is:

Using PG 9.6.9 and postgres_fdw, a query of the form "select * from
foreign_table order by col limit 1" is getting a local Sort plan, not
pushing the ORDER BY to the remote. Turning off use_remote_estimates
changes the plan to use a remote sort, with a 10000x speedup.

I don't think this can be called a bug, exactly, and I don't have an
immediate fix, so I'm putting this analysis up for the benefit of anyone
working on this in future.

The cause of the misplan seems to be this: postgres_fdw with
use_remote_estimates on does not attempt to obtain fast-start plans from
the remote. In this case what happens is this:

1. postgres_fdw gets the cost estimate from the plain remote fetch, by
doing "EXPLAIN select * from table". This produces a plan with a low
startup cost (just the constant overhead) and a high total cost (on
the order of 1.2e6 in this case).

2. postgres_fdw gets the cost estimate for the ordered fetch, by doing
"EXPLAIN select * from table order by col". Note that there is no
LIMIT nor any cursor_tuple_fraction in effect, so the plan returned
in this case is a seqscan+sort plan (in spite of the presence of an
index on "col"), with a very high (order of 8e6) startup and total
cost.

So when the local side tries to generate paths, it has the choice of
using a remote-ordered path with startup cost 8e6, or a local top-1
sort on top of an unordered remote path, which has a total cost on the
order of 1.5e6 in this case; cheaper than the remote sort because this
only needs to do top-1, while the remote is sorting millions of rows
and would probably spill to disk.

However, when it comes to actual execution, postgres_fdw opens a cursor
for the remote query, which means that cursor_tuple_fraction will come
into play. As far as I can tell, this is not set anywhere, so this means
that the plan that actually gets run on the remote is likely to have
_completely_ different costs from those returned by the EXPLAINs. In
particular, in this case the fast-start index-scan plan for the ORDER BY
remote query is clearly being chosen when use_remote_estimates is off
(since the query completes in 15ms rather than 150 seconds).

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

--
Andrew (irc:RhodiumToad)

Kyotaro Horiguchi

horikyota.ntt@gmail.com

almost 8 years ago

In reply to: Andrew Gierth (#1)

Re: Problems with plan estimates in postgres_fdw

Hello.

At Thu, 02 Aug 2018 01:06:41 +0100, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote in <87pnz1aby9.fsf@news-spur.riddles.org.uk>

This analysis comes from investigating a report from an IRC user. A
summary of the initial report is:

Using PG 9.6.9 and postgres_fdw, a query of the form "select * from
foreign_table order by col limit 1" is getting a local Sort plan, not
pushing the ORDER BY to the remote. Turning off use_remote_estimates
changes the plan to use a remote sort, with a 10000x speedup.

I don't think this can be called a bug, exactly, and I don't have an
immediate fix, so I'm putting this analysis up for the benefit of anyone
working on this in future.

I didn't see the concrete estimates, it seems that the cause is
too-small total cost of non-remote-sorted plan compared with the
startup cost of remote-sorted one. In other words, tuple cost by
the remoteness is estimated as too small. Perhaps setting
fdw_tuple_cost to , say 1 as an extreme value, will bring victory
to remote sort path for the query.

The cause of the misplan seems to be this: postgres_fdw with
use_remote_estimates on does not attempt to obtain fast-start plans from
the remote. In this case what happens is this:

1. postgres_fdw gets the cost estimate from the plain remote fetch, by
doing "EXPLAIN select * from table". This produces a plan with a low
startup cost (just the constant overhead) and a high total cost (on
the order of 1.2e6 in this case).

2. postgres_fdw gets the cost estimate for the ordered fetch, by doing
"EXPLAIN select * from table order by col". Note that there is no
LIMIT nor any cursor_tuple_fraction in effect, so the plan returned
in this case is a seqscan+sort plan (in spite of the presence of an
index on "col"), with a very high (order of 8e6) startup and total
cost.

So when the local side tries to generate paths, it has the choice of
using a remote-ordered path with startup cost 8e6, or a local top-1
sort on top of an unordered remote path, which has a total cost on the
order of 1.5e6 in this case; cheaper than the remote sort because this
only needs to do top-1, while the remote is sorting millions of rows
and would probably spill to disk.

A simple test at hand showed that (on a unix-domain connection):

=# explain (verbose on, analyze on) select * from ft1 order by a;

Foreign Scan on public.ft1 (cost=9847.82..17097.82 rows=100000 width=4)
(actual time=195.861..515.747 rows=100000 loops=1)

=# explain (verbose on, analyze on) select * from ft1;

Foreign Scan on public.ft1 (cost=100.00..8543.00 rows=100000 width=4)
(actual time=0.659..399.427 rows=100000 loops=1)

The cost is apaprently wrong. On my environment fdw_startup_cost
= 45 and fdw_tuple_cost = 0.2 gave me an even cost/actual time
ratio *for these queries*. (hard coded default is 100 and
0.01. Of course this disucussion is ignoring the accuracy of
local-execution estimate.)

=# explain (verbose on, analyze on) select * from ft1 order by a;

Foreign Scan on public.ft1 (cost=9792.82..31042.82 rows=100000 width=4)
(actual time=201.493..533.913 rows=100000 loops=1)

=# explain (verbose on, analyze on) select * from ft1;

Foreign Scan on public.ft1 (cost=45.00..22488.00 rows=100000 width=4)
(actual time=0.837..484.469 rows=100000 loops=1)

This gave me a remote-sorted plan for "select * from ft1 order by
a limit 1". (But also gave me a remote-sorted plan without a
LIMIT..)

However, when it comes to actual execution, postgres_fdw opens a cursor
for the remote query, which means that cursor_tuple_fraction will come
into play. As far as I can tell, this is not set anywhere, so this means
that the plan that actually gets run on the remote is likely to have
_completely_ different costs from those returned by the EXPLAINs. In
particular, in this case the fast-start index-scan plan for the ORDER BY
remote query is clearly being chosen when use_remote_estimates is off
(since the query completes in 15ms rather than 150 seconds).

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

Cursor fraction seems working since the foreign scan with remote
sort has a cost with different startup and total values. The
problem seems to be a too-small tuple cost.

So, we might have a room for improvement on
DEFAULT_FDW_STARTUP_COST, DEFAULT_FDW_TUPLE_COST and
DEFAULT_FDW_SORT_MULTIPLIER settings.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Andrew Gierth (#1)

Re: Problems with plan estimates in postgres_fdw

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]

Yeah, that's basically not accounted for at all in the current design.

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start plans,
it might make sense to build two scan paths for a foreign table, one based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.

regards, tom lane

Problems with plan estimates in postgres_fdw

Attachments:

Attachments:

Attachments:

Attachments:

Attachments:

Attachments: