Query plan degradation 8.2 --> 8.3

Started by Josh Berkusover 18 years ago7 messages
#1Josh Berkus
josh@agliodbs.com

All,

I now have a simple test case which shows significant performance
degradation on 8.3devel for a specific query, apparenly due to an
unnecessary call to Top-N sort. I've tried to forward the test case to
the lists but the package is 3.5m, so I'm putting it on pgFoundry instead:

If you run the example query (badsql.sql), you get the following for 8.2,
which is a Good Plan:

Limit (cost=0.00..24.04 rows=20 width=64) (actual time=0.091..0.129
rows=20 loops=1)
-> Index Scan using abc_idx_t_s_symb_dts on abc (cost=0.00..908.55
rows=756 width=64) (actual time=0.089..0.123 rows=20 loops=1)
Index Cond: (((t_s_symb)::text = 'SYMPRA'::text) AND (t_dts >=
'2006-02-15 00:00:00'::timestamp without time zone) AND (t_dts <=
'2007-06-28 00:00:00'::timestamp without time zone))
Filter: ((t_st_id = 'CMPT'::bpchar) AND (t_ca_id <=
43000050000::bigint))
Total runtime: 0.222 ms

However, running the same against 8.3-snapshot-5-28 gives you:

Limit (cost=631.85..631.90 rows=20 width=55) (actual time=2.325..2.336
rows=20 loops=1)
-> Sort (cost=631.85..633.61 rows=704 width=55) (actual
time=2.323..2.326 rows=20 loops=1)
Sort Key: t_dts
Sort Method: top-N heapsort Memory: 27kB
-> Bitmap Heap Scan on abc (cost=25.23..613.12 rows=704
width=55) (actual time=0.537..1.477 rows=843 loops=1)
Recheck Cond: (((t_s_symb)::text = 'SYMPRA'::text) AND
(t_dts >= '2006-02-15 00:00:00'::timestamp without time zone) AND (t_dts
<= '2007-06-28 00:00:00'::timestamp without time zone))
Filter: ((t_ca_id <= 43000050000::bigint) AND (t_st_id =
'CMPT'::bpchar))
-> Bitmap Index Scan on abc_idx_t_s_symb_dts
(cost=0.00..25.06 rows=704 width=0) (actual time=0.506..0.506 rows=843
loops=1)
Index Cond: (((t_s_symb)::text = 'SYMPRA'::text) AND
(t_dts >= '2006-02-15 00:00:00'::timestamp without time zone) AND (t_dts
<= '2007-06-28 00:00:00'::timestamp without time zone))
Total runtime: 2.460 ms

.... and if you disable bitmap:

Limit (cost=812.35..812.40 rows=20 width=55) (actual time=2.363..2.372
rows=20 loops=1)
-> Sort (cost=812.35..814.11 rows=704 width=55) (actual
time=2.360..2.363 rows=20 loops=1)
Sort Key: t_dts
Sort Method: top-N heapsort Memory: 27kB
-> Index Scan using abc_idx_t_s_symb_dts on abc
(cost=0.00..793.62 rows=704 width=55) (actual time=0.080..1.567 rows=843
loops=1)
Index Cond: (((t_s_symb)::text = 'SYMPRA'::text) AND (t_dts

= '2006-02-15 00:00:00'::timestamp without time zone) AND (t_dts <=

'2007-06-28 00:00:00'::timestamp without time zone))
Filter: ((t_ca_id <= 43000050000::bigint) AND (t_st_id =
'CMPT'::bpchar))
Total runtime: 2.475 ms

The problem appears to be that top-N heapsort is being called even when
it's not needed, such as immediately after an indexscan.

Is my assessment correct?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

#2Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Query plan degradation 8.2 --> 8.3

On Wednesday 30 May 2007 15:51, Josh Berkus wrote:

I now have a simple test case which shows significant performance
degradation on 8.3devel for a specific query, apparenly due to an
unnecessary call to Top-N sort.  I've tried to forward the test case to
the lists but the package is 3.5m, so I'm putting it on pgFoundry
instead:

...ooops, how about a link for that:
http://pgfoundry.org/docman/view.php/1000041/767/pg83_sortbug.tar.Z

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#1)
Re: Query plan degradation 8.2 --> 8.3

Josh Berkus <josh@agliodbs.com> writes:

I now have a simple test case which shows significant performance
degradation on 8.3devel for a specific query, apparenly due to an
unnecessary call to Top-N sort.

It does the right thing if t_s_symb is declared as text instead of
varchar. When it's varchar, even setting enable_sort off won't make
it pick the right plan, which suggests that it fails to recognize that
the index can match the query's ORDER BY. I'm guessing I overlooked
a binary-compatibility case when I rejiggered the handling of PathKeys
in connection with the NULLS FIRST/LAST stuff. No time to look deeper
right now.

regards, tom lane

#4Gregory Stark
stark@enterprisedb.com
In reply to: Josh Berkus (#2)
Re: Query plan degradation 8.2 --> 8.3

"Josh Berkus" <josh@agliodbs.com> writes:

On Wednesday 30 May 2007 15:51, Josh Berkus wrote:

I now have a simple test case which shows significant performance
degradation on 8.3devel for a specific query, apparenly due to an
unnecessary call to Top-N sort.  I've tried to forward the test case to
the lists but the package is 3.5m, so I'm putting it on pgFoundry
instead:

How recently did you check out your 8.3 tree?

When I run it I get a bitmap index scan which I think might mean you're
suffering from the same problem Tom found and fixed a few days ago. The
planner is finding the bitmap index scan with the sort is the best possible
plan but then discarding that option later leaving it with a suboptimal
choice.

The exact manifestation is somewhat different from what other people saw. iirc
they saw sequential scans when there was an index scan available. But I
suspect it's the same thing going on.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#5Josh Berkus
josh@agliodbs.com
In reply to: Gregory Stark (#4)
Re: Query plan degradation 8.2 --> 8.3

Greg,

How recently did you check out your 8.3 tree?

It's the snapshot from 5/28, which means it was pulled from CVS on 5/27.
So, recent.

When I run it I get a bitmap index scan which I think might mean you're
suffering from the same problem Tom found and fixed a few days ago. The
planner is finding the bitmap index scan with the sort is the best
possible plan but then discarding that option later leaving it with a
suboptimal choice.

Apparently. I'll do another build and check.

It does the right thing if t_s_symb is declared as text instead of
varchar. When it's varchar, even setting enable_sort off won't make
it pick the right plan, which suggests that it fails to recognize that
the index can match the query's ORDER BY. I'm guessing I overlooked
a binary-compatibility case when I rejiggered the handling of PathKeys
in connection with the NULLS FIRST/LAST stuff. No time to look deeper
right now.

Yeah, that looks like the case. We'll move it to TEXT for the tests right
now, but I'll make sure we don't forget this bug during beta. Thanks!

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#5)
Re: Query plan degradation 8.2 --> 8.3

Josh Berkus <josh@agliodbs.com> writes:

It does the right thing if t_s_symb is declared as text instead of
varchar. When it's varchar, even setting enable_sort off won't make
it pick the right plan, which suggests that it fails to recognize that
the index can match the query's ORDER BY. I'm guessing I overlooked
a binary-compatibility case when I rejiggered the handling of PathKeys
in connection with the NULLS FIRST/LAST stuff. No time to look deeper
right now.

Yeah, that looks like the case. We'll move it to TEXT for the tests right
now, but I'll make sure we don't forget this bug during beta. Thanks!

I've applied a patch that fixes this case, but I'm not yet 100%
convinced that there are no other cases where it'll prevent matching
things that should match. Please test.

regards, tom lane

#7Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#6)
Re: Query plan degradation 8.2 --> 8.3

Tom,

I've applied a patch that fixes this case, but I'm not yet 100%
convinced that there are no other cases where it'll prevent matching
things that should match. Please test.

Will do. We're having trouble building from CVS on the TPCE test rig, so
it'll wait for tommorrow's snapshot.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco