Planner picking topsey turvey plan?

Started by Glyn Astillover 17 years ago6 messagesgeneral
Jump to latest
#1Glyn Astill
glynastill@yahoo.co.uk

Hi people,

Does anyone know how I can change what I'm doing to get pgsql to pick a better plan?

I'll explain what I've done below but please forgive me if I interpret the plans wrong as I try to describe, I've split it into 4 points to try and ease the mess of pasting in the plans..

1) I've created a view "orders" that joins two tables "credit" and "mult_ord" together as below:

CREATE VIEW orders AS
SELECT b.mult_ref, a.show, MIN(a.transno) AS "lead_transno", COUNT(a.transno) AS "parts", SUM(a.tickets) AS "items", SUM(a.value) AS "value"
FROM (credit a LEFT OUTER JOIN mult_ord b ON a.transno = b.transno)
GROUP BY b.mult_ref, a.show;

2) And an explain on that view comes out as below, it's using the correct index for the field show on "credit" which doesn't look too bad to me:

DB=# explain select a.artist, a.date, b.mult_ref, b.items, b.parts from (show a inner join orders b on a.code = b.show)
where b.show = 357600;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Nested Loop (cost=15050.79..15099.68 rows=1013 width=70)
-> Index Scan using show_index01 on show a (cost=0.00..8.37 rows=1 width=26)
Index Cond: (code = 357600::numeric)
-> HashAggregate (cost=15050.79..15071.05 rows=1013 width=39)
-> Nested Loop Left Join (cost=0.00..15035.60 rows=1013 width=39)
-> Index Scan using credit_index04 on credit a (cost=0.00..4027.30 rows=1013 width=31)
Index Cond: (show = 357600::numeric)
-> Index Scan using mult_ord_index02 on mult_ord b (cost=0.00..10.85 rows=1 width=17)
Index Cond: (a.transno = b.transno)
(9 rows)

3) Then I have a table called "show" that is indexed on the artist field, and a plan for listing the shows for an artist is as below, again this doesn't look too bad to me, as it's using the index on artist.

DB=# explain select * from show where artist = 'ALKALINE TRIO';
QUERY PLAN
-----------------------------------------------------------------------------
Bitmap Heap Scan on show (cost=9.59..582.41 rows=153 width=348)
Recheck Cond: ((artist)::text = 'ALKALINE TRIO'::text)
-> Bitmap Index Scan on show_index07 (cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text = 'ALKALINE TRIO'::text)
(4 rows)

4) So.. I guess I can join "show" -> "orders", expecting an index scan on "show" for the artist, then an index scan on "orders" for each show.

However it seems the planner has other ideas, it just looks backwards to me:

DB=# explain select a.artist, a.date, b.mult_ref, b.items, b.parts from (show a inner join orders b on a.code = b.show)
where artist = 'ALKALINE TRIO';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Hash Join (cost=1576872.96..1786175.37 rows=1689 width=70)
Hash Cond: (a.show = a.code)
-> GroupAggregate (cost=1576288.64..1729424.39 rows=4083620 width=39)
-> Sort (cost=1576288.64..1586497.69 rows=4083620 width=39)
Sort Key: b.mult_ref, a.show
-> Hash Left Join (cost=321406.05..792886.22 rows=4083620 width=39)
Hash Cond: (a.transno = b.transno)
-> Seq Scan on credit a (cost=0.00..267337.20 rows=4083620 width=31)
-> Hash (cost=160588.80..160588.80 rows=8759380 width=17)
-> Seq Scan on mult_ord b (cost=0.00..160588.80 rows=8759380 width=17)
-> Hash (cost=582.41..582.41 rows=153 width=26)
-> Bitmap Heap Scan on show a (cost=9.59..582.41 rows=153 width=26)
Recheck Cond: ((artist)::text = 'ALKALINE TRIO'::text)
-> Bitmap Index Scan on show_index07 (cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text = 'ALKALINE TRIO'::text)
(15 rows)

Any idea if I can get around this?

#2Glyn Astill
glynastill@yahoo.co.uk
In reply to: Glyn Astill (#1)
Re: [GENERAL] Planner picking topsey turvey plan?

Anyone?

--- On Fri, 5/12/08, Glyn Astill <glynastill@yahoo.co.uk> wrote:
Show quoted text

From: Glyn Astill <glynastill@yahoo.co.uk>
Subject: [GENERAL] Planner picking topsey turvey plan?
To: pgsql-general@postgresql.org
Date: Friday, 5 December, 2008, 2:23 PM
Hi people,

Does anyone know how I can change what I'm doing to get
pgsql to pick a better plan?

I'll explain what I've done below but please
forgive me if I interpret the plans wrong as I try to
describe, I've split it into 4 points to try and ease
the mess of pasting in the plans..

1) I've created a view "orders" that joins
two tables "credit" and "mult_ord"
together as below:

CREATE VIEW orders AS
SELECT b.mult_ref, a.show, MIN(a.transno) AS
"lead_transno", COUNT(a.transno) AS
"parts", SUM(a.tickets) AS "items",
SUM(a.value) AS "value"
FROM (credit a LEFT OUTER JOIN mult_ord b ON a.transno =
b.transno)
GROUP BY b.mult_ref, a.show;

2) And an explain on that view comes out as below, it's
using the correct index for the field show on
"credit" which doesn't look too bad to me:

DB=# explain select a.artist, a.date, b.mult_ref, b.items,
b.parts from (show a inner join orders b on a.code = b.show)
where b.show = 357600;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Nested Loop (cost=15050.79..15099.68 rows=1013 width=70)
-> Index Scan using show_index01 on show a
(cost=0.00..8.37 rows=1 width=26)
Index Cond: (code = 357600::numeric)
-> HashAggregate (cost=15050.79..15071.05 rows=1013
width=39)
-> Nested Loop Left Join (cost=0.00..15035.60
rows=1013 width=39)
-> Index Scan using credit_index04 on
credit a (cost=0.00..4027.30 rows=1013 width=31)
Index Cond: (show = 357600::numeric)
-> Index Scan using mult_ord_index02 on
mult_ord b (cost=0.00..10.85 rows=1 width=17)
Index Cond: (a.transno = b.transno)
(9 rows)

3) Then I have a table called "show" that is
indexed on the artist field, and a plan for listing the
shows for an artist is as below, again this doesn't look
too bad to me, as it's using the index on artist.

DB=# explain select * from show where artist =
'ALKALINE TRIO';
QUERY PLAN
-----------------------------------------------------------------------------
Bitmap Heap Scan on show (cost=9.59..582.41 rows=153
width=348)
Recheck Cond: ((artist)::text = 'ALKALINE
TRIO'::text)
-> Bitmap Index Scan on show_index07
(cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text = 'ALKALINE
TRIO'::text)
(4 rows)

4) So.. I guess I can join "show" ->
"orders", expecting an index scan on
"show" for the artist, then an index scan on
"orders" for each show.

However it seems the planner has other ideas, it just looks
backwards to me:

DB=# explain select a.artist, a.date, b.mult_ref, b.items,
b.parts from (show a inner join orders b on a.code = b.show)
where artist = 'ALKALINE TRIO';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Hash Join (cost=1576872.96..1786175.37 rows=1689
width=70)
Hash Cond: (a.show = a.code)
-> GroupAggregate (cost=1576288.64..1729424.39
rows=4083620 width=39)
-> Sort (cost=1576288.64..1586497.69
rows=4083620 width=39)
Sort Key: b.mult_ref, a.show
-> Hash Left Join
(cost=321406.05..792886.22 rows=4083620 width=39)
Hash Cond: (a.transno = b.transno)
-> Seq Scan on credit a
(cost=0.00..267337.20 rows=4083620 width=31)
-> Hash
(cost=160588.80..160588.80 rows=8759380 width=17)
-> Seq Scan on mult_ord b
(cost=0.00..160588.80 rows=8759380 width=17)
-> Hash (cost=582.41..582.41 rows=153 width=26)
-> Bitmap Heap Scan on show a
(cost=9.59..582.41 rows=153 width=26)
Recheck Cond: ((artist)::text =
'ALKALINE TRIO'::text)
-> Bitmap Index Scan on show_index07
(cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text =
'ALKALINE TRIO'::text)
(15 rows)

Any idea if I can get around this?

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Scott Marlowe
scott.marlowe@gmail.com
In reply to: Glyn Astill (#2)
Re: [GENERAL] Planner picking topsey turvey plan?

what does explain analyze yourqueryhere say?

On Sat, Dec 6, 2008 at 1:33 PM, Glyn Astill <glynastill@yahoo.co.uk> wrote:

Anyone?

--- On Fri, 5/12/08, Glyn Astill <glynastill@yahoo.co.uk> wrote:

From: Glyn Astill <glynastill@yahoo.co.uk>
Subject: [GENERAL] Planner picking topsey turvey plan?
To: pgsql-general@postgresql.org
Date: Friday, 5 December, 2008, 2:23 PM
Hi people,

Does anyone know how I can change what I'm doing to get
pgsql to pick a better plan?

I'll explain what I've done below but please
forgive me if I interpret the plans wrong as I try to
describe, I've split it into 4 points to try and ease
the mess of pasting in the plans..

1) I've created a view "orders" that joins
two tables "credit" and "mult_ord"
together as below:

CREATE VIEW orders AS
SELECT b.mult_ref, a.show, MIN(a.transno) AS
"lead_transno", COUNT(a.transno) AS
"parts", SUM(a.tickets) AS "items",
SUM(a.value) AS "value"
FROM (credit a LEFT OUTER JOIN mult_ord b ON a.transno =
b.transno)
GROUP BY b.mult_ref, a.show;

2) And an explain on that view comes out as below, it's
using the correct index for the field show on
"credit" which doesn't look too bad to me:

DB=# explain select a.artist, a.date, b.mult_ref, b.items,
b.parts from (show a inner join orders b on a.code = b.show)
where b.show = 357600;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Nested Loop (cost=15050.79..15099.68 rows=1013 width=70)
-> Index Scan using show_index01 on show a
(cost=0.00..8.37 rows=1 width=26)
Index Cond: (code = 357600::numeric)
-> HashAggregate (cost=15050.79..15071.05 rows=1013
width=39)
-> Nested Loop Left Join (cost=0.00..15035.60
rows=1013 width=39)
-> Index Scan using credit_index04 on
credit a (cost=0.00..4027.30 rows=1013 width=31)
Index Cond: (show = 357600::numeric)
-> Index Scan using mult_ord_index02 on
mult_ord b (cost=0.00..10.85 rows=1 width=17)
Index Cond: (a.transno = b.transno)
(9 rows)

3) Then I have a table called "show" that is
indexed on the artist field, and a plan for listing the
shows for an artist is as below, again this doesn't look
too bad to me, as it's using the index on artist.

DB=# explain select * from show where artist =
'ALKALINE TRIO';
QUERY PLAN
-----------------------------------------------------------------------------
Bitmap Heap Scan on show (cost=9.59..582.41 rows=153
width=348)
Recheck Cond: ((artist)::text = 'ALKALINE
TRIO'::text)
-> Bitmap Index Scan on show_index07
(cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text = 'ALKALINE
TRIO'::text)
(4 rows)

4) So.. I guess I can join "show" ->
"orders", expecting an index scan on
"show" for the artist, then an index scan on
"orders" for each show.

However it seems the planner has other ideas, it just looks
backwards to me:

DB=# explain select a.artist, a.date, b.mult_ref, b.items,
b.parts from (show a inner join orders b on a.code = b.show)
where artist = 'ALKALINE TRIO';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Hash Join (cost=1576872.96..1786175.37 rows=1689
width=70)
Hash Cond: (a.show = a.code)
-> GroupAggregate (cost=1576288.64..1729424.39
rows=4083620 width=39)
-> Sort (cost=1576288.64..1586497.69
rows=4083620 width=39)
Sort Key: b.mult_ref, a.show
-> Hash Left Join
(cost=321406.05..792886.22 rows=4083620 width=39)
Hash Cond: (a.transno = b.transno)
-> Seq Scan on credit a
(cost=0.00..267337.20 rows=4083620 width=31)
-> Hash
(cost=160588.80..160588.80 rows=8759380 width=17)
-> Seq Scan on mult_ord b
(cost=0.00..160588.80 rows=8759380 width=17)
-> Hash (cost=582.41..582.41 rows=153 width=26)
-> Bitmap Heap Scan on show a
(cost=9.59..582.41 rows=153 width=26)
Recheck Cond: ((artist)::text =
'ALKALINE TRIO'::text)
-> Bitmap Index Scan on show_index07
(cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text =
'ALKALINE TRIO'::text)
(15 rows)

Any idea if I can get around this?

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

--
When fascism comes to America, it will be draped in a flag and
carrying a cross - Sinclair Lewis

#4Glyn Astill
glynastill@yahoo.co.uk
In reply to: Scott Marlowe (#3)
Re: [GENERAL] Planner picking topsey turvey plan?

Explain analyze below,

DB=# explain analyze select a.artist, a.date, b.mult_ref, b.items, b.parts from (show a inner join orders b on a.code = b.show) where artist = 'ALKALINE TRIO'; QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1583955.94..1794350.36 rows=1552 width=70) (actual time=231496.678..243243.711 rows=892 loops=1)
Hash Cond: (a.show = a.code)
-> GroupAggregate (cost=1583418.91..1737354.68 rows=4104954 width=40) (actual time=223204.620..241813.067 rows=2856379 loops=1)
-> Sort (cost=1583418.91..1593681.29 rows=4104954 width=40) (actual time=223204.567..231296.896 rows=4104964 loops=1)
Sort Key: b.mult_ref, a.show
Sort Method: external merge Disk: 224328kB
-> Hash Left Join (cost=321999.38..795776.58 rows=4104954 width=40) (actual time=14850.320..165804.778 rows=4104964 loops=1)
Hash Cond: (a.transno = b.transno)
-> Seq Scan on credit a (cost=0.00..268740.54 rows=4104954 width=31) (actual time=7.563..76901.901 rows=4104954 loops=1)
-> Hash (cost=160885.28..160885.28 rows=8775528 width=18) (actual time=14831.810..14831.810 rows=8775528 loops=1)
-> Seq Scan on mult_ord b (cost=0.00..160885.28 rows=8775528 width=18) (actual time=4.716..4952.254 rows=8775528 loops=1)
-> Hash (cost=535.28..535.28 rows=140 width=26) (actual time=228.599..228.599 rows=54 loops=1)
-> Bitmap Heap Scan on show a (cost=9.49..535.28 rows=140 width=26) (actual time=77.723..228.488 rows=54 loops=1)
Recheck Cond: ((artist)::text = 'ALKALINE TRIO'::text)
-> Bitmap Index Scan on show_index07 (cost=0.00..9.46 rows=140 width=0) (actual time=62.228..62.228 rows=54 loops=1)
Index Cond: ((artist)::text = 'ALKALINE TRIO'::text)
Total runtime: 243367.640 ms

--- On Sat, 6/12/08, Scott Marlowe <scott.marlowe@gmail.com> wrote:
Show quoted text

From: Scott Marlowe <scott.marlowe@gmail.com>
Subject: Re: [ADMIN] [GENERAL] Planner picking topsey turvey plan?
To: glynastill@yahoo.co.uk
Cc: pgsql-general@postgresql.org, pgsql-admin@postgresql.org
Date: Saturday, 6 December, 2008, 8:35 PM
what does explain analyze yourqueryhere say?

On Sat, Dec 6, 2008 at 1:33 PM, Glyn Astill
<glynastill@yahoo.co.uk> wrote:

Anyone?

--- On Fri, 5/12/08, Glyn Astill

<glynastill@yahoo.co.uk> wrote:

From: Glyn Astill <glynastill@yahoo.co.uk>
Subject: [GENERAL] Planner picking topsey turvey

plan?

To: pgsql-general@postgresql.org
Date: Friday, 5 December, 2008, 2:23 PM
Hi people,

Does anyone know how I can change what I'm

doing to get

pgsql to pick a better plan?

I'll explain what I've done below but

please

forgive me if I interpret the plans wrong as I try

to

describe, I've split it into 4 points to try

and ease

the mess of pasting in the plans..

1) I've created a view "orders" that

joins

two tables "credit" and

"mult_ord"

together as below:

CREATE VIEW orders AS
SELECT b.mult_ref, a.show, MIN(a.transno) AS
"lead_transno", COUNT(a.transno) AS
"parts", SUM(a.tickets) AS

"items",

SUM(a.value) AS "value"
FROM (credit a LEFT OUTER JOIN mult_ord b ON

a.transno =

b.transno)
GROUP BY b.mult_ref, a.show;

2) And an explain on that view comes out as below,

it's

using the correct index for the field show on
"credit" which doesn't look too bad

to me:

DB=# explain select a.artist, a.date, b.mult_ref,

b.items,

b.parts from (show a inner join orders b on a.code

= b.show)

where b.show = 357600;

QUERY PLAN

--------------------------------------------------------------------------------------------------------

Nested Loop (cost=15050.79..15099.68 rows=1013

width=70)

-> Index Scan using show_index01 on show a
(cost=0.00..8.37 rows=1 width=26)
Index Cond: (code = 357600::numeric)
-> HashAggregate (cost=15050.79..15071.05

rows=1013

width=39)
-> Nested Loop Left Join

(cost=0.00..15035.60

rows=1013 width=39)
-> Index Scan using

credit_index04 on

credit a (cost=0.00..4027.30 rows=1013 width=31)
Index Cond: (show =

357600::numeric)

-> Index Scan using

mult_ord_index02 on

mult_ord b (cost=0.00..10.85 rows=1 width=17)
Index Cond: (a.transno =

b.transno)

(9 rows)

3) Then I have a table called "show"

that is

indexed on the artist field, and a plan for

listing the

shows for an artist is as below, again this

doesn't look

too bad to me, as it's using the index on

artist.

DB=# explain select * from show where artist =
'ALKALINE TRIO';
QUERY PLAN

-----------------------------------------------------------------------------

Bitmap Heap Scan on show (cost=9.59..582.41

rows=153

width=348)
Recheck Cond: ((artist)::text = 'ALKALINE
TRIO'::text)
-> Bitmap Index Scan on show_index07
(cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text =

'ALKALINE

TRIO'::text)
(4 rows)

4) So.. I guess I can join "show" ->
"orders", expecting an index scan on
"show" for the artist, then an index

scan on

"orders" for each show.

However it seems the planner has other ideas, it

just looks

backwards to me:

DB=# explain select a.artist, a.date, b.mult_ref,

b.items,

b.parts from (show a inner join orders b on a.code

= b.show)

where artist = 'ALKALINE TRIO';
QUERY

PLAN

----------------------------------------------------------------------------------------------------

Hash Join (cost=1576872.96..1786175.37 rows=1689
width=70)
Hash Cond: (a.show = a.code)
-> GroupAggregate

(cost=1576288.64..1729424.39

rows=4083620 width=39)
-> Sort (cost=1576288.64..1586497.69
rows=4083620 width=39)
Sort Key: b.mult_ref, a.show
-> Hash Left Join
(cost=321406.05..792886.22 rows=4083620 width=39)
Hash Cond: (a.transno =

b.transno)

-> Seq Scan on credit a
(cost=0.00..267337.20 rows=4083620 width=31)
-> Hash
(cost=160588.80..160588.80 rows=8759380 width=17)
-> Seq Scan on

mult_ord b

(cost=0.00..160588.80 rows=8759380 width=17)
-> Hash (cost=582.41..582.41 rows=153

width=26)

-> Bitmap Heap Scan on show a
(cost=9.59..582.41 rows=153 width=26)
Recheck Cond: ((artist)::text =
'ALKALINE TRIO'::text)
-> Bitmap Index Scan on

show_index07

(cost=0.00..9.56 rows=153 width=0)
Index Cond: ((artist)::text =
'ALKALINE TRIO'::text)
(15 rows)

Any idea if I can get around this?

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-admin mailing list

(pgsql-admin@postgresql.org)

To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

--
When fascism comes to America, it will be draped in a flag
and
carrying a cross - Sinclair Lewis

--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Glyn Astill (#1)
Re: Planner picking topsey turvey plan?

Glyn Astill <glynastill@yahoo.co.uk> writes:

Does anyone know how I can change what I'm doing to get pgsql to pick a better plan?

You've provided no evidence that this is a bad plan.

In particular, the plan you seem to think would be better would involve
an estimated 153 iterations of the cost-15071 hash aggregation, which
simple arithmetic shows is more expensive than the plan it did choose.

regards, tom lane

#6Glyn Astill
glynastill@yahoo.co.uk
In reply to: Tom Lane (#5)
Re: Planner picking topsey turvey plan?

From: Tom Lane <tgl@sss.pgh.pa.us>

You've provided no evidence that this is a bad plan.

Looks like I didn't take the time to understand properly what the explains were showing.

In particular, the plan you seem to think would be better
would involve
an estimated 153 iterations of the cost-15071 hash
aggregation, which
simple arithmetic shows is more expensive than the plan it
did choose.

I'd totally missed that all the cost was in the view that I'd created.

Thanks tom