Index not being used

Started by Shane Wegneralmost 22 years ago3 messagesgeneral
Jump to latest
#1Shane Wegner
shane-keyword-pgsql.a1e0d9@cm.nu

Hello,

I am trying to speed up a select which is taking around a
second to execute. It's a very common query though so the
faster I can get it the better. The query which I have
included below used a lot of seq scans so I created a
multicolumn index on books covering the values in the
select, ran analyze and it isn't being used. If someone
wouldn't mind taking a look and letting me know what I'm
doing wrong, I'd appreciate it.

select query:
select books.id as book_id,title,isbn,
publisher, publishers.id as publisher_id,
place, places.id as place_id,
illustrator, illustrators.id as illustrator_id,
edition, editions.id as edition_id,
type, types.id as type_id,
category, categories.id as category_id,
year,
binding, binding.id as binding_id,
weight,
books.price as price,discount,description,comments,books.status,
ctime,mtime
from books
left join publishers on publisher_id=publishers.id
left join places on place_id=places.id
left join illustrators on illustrator_id=illustrators.id
left join editions on edition_id=editions.id
left join types on type_id=types.id
left join categories on category_id=categories.id
left join binding on binding_id=binding.id
,orders_and_books where order_id = 753 and book_id = books.id

The order_id will vary.
The index I created reads:
create index books_idx1 on books(publisher_id,place_id,illustrator_id,
edition_id,type_id,category_id,binding_id,id);

The other ids in the joining tables are all serial values
and are primary keys so are indexed automatically.

explain analyze output:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=29.96..1282.10 rows=2 width=999) (actual time=329.336..609.402 rows=1 loops=1)
Hash Cond: ("outer".id = "inner".book_id)
-> Hash Left Join (cost=24.65..1248.12 rows=3819 width=999) (actual time=6.344..602.450 rows=3819 loops=1)
Hash Cond: ("outer".binding_id = "inner".id)
-> Hash Left Join (cost=23.59..1189.90 rows=3819 width=982) (actual time=6.098..511.002 rows=3819 loops=1)
Hash Cond: ("outer".category_id = "inner".id)
-> Hash Left Join (cost=22.31..1138.06 rows=3819 width=962) (actual time=5.922..428.875 rows=3819 loops=1)
Hash Cond: ("outer".type_id = "inner".id)
-> Hash Left Join (cost=21.30..1117.92 rows=3819 width=931) (actual time=5.857..359.763 rows=3819 loops=1)
Hash Cond: ("outer".edition_id = "inner".id)
-> Hash Left Join (cost=20.10..1097.21 rows=3819 width=916) (actual time=5.703..292.123 rows=3819 loops=1)
Hash Cond: ("outer".illustrator_id = "inner".id)
-> Hash Left Join (cost=18.09..1075.19 rows=3819 width=880) (actual time=5.190..224.422 rows=3819 loops=1)
Hash Cond: ("outer".place_id = "inner".id)
-> Hash Left Join (cost=13.01..1015.36 rows=3819 width=855) (actual time=3.721..148.384 rows=3819 loops=1)
Hash Cond: ("outer".publisher_id = "inner".id)
-> Seq Scan on books (cost=0.00..946.19 rows=3819 width=828) (actual time=0.034..70.883 rows=3819 loops=1)
-> Hash (cost=11.41..11.41 rows=641 width=31) (actual time=3.611..3.611 rows=0 loops=1)
-> Seq Scan on publishers (cost=0.00..11.41 rows=641 width=31) (actual time=0.010..2.268 rows=641 loops=1)
-> Hash (cost=4.46..4.46 rows=246 width=29) (actual time=1.417..1.417 rows=0 loops=1)
-> Seq Scan on places (cost=0.00..4.46 rows=246 width=29) (actual time=0.008..0.895 rows=246 loops=1)
-> Hash (cost=1.81..1.81 rows=81 width=40) (actual time=0.468..0.468 rows=0 loops=1)
-> Seq Scan on illustrators (cost=0.00..1.81 rows=81 width=40) (actual time=0.008..0.291 rows=81 loops=1)
-> Hash (cost=1.16..1.16 rows=16 width=19) (actual time=0.106..0.106 rows=0 loops=1)
-> Seq Scan on editions (cost=0.00..1.16 rows=16 width=19) (actual time=0.008..0.062 rows=16 loops=1)
-> Hash (cost=1.01..1.01 rows=1 width=35) (actual time=0.015..0.015 rows=0 loops=1)
-> Seq Scan on types (cost=0.00..1.01 rows=1 width=35) (actual time=0.008..0.010 rows=1 loops=1)
-> Hash (cost=1.22..1.22 rows=22 width=24) (actual time=0.126..0.126 rows=0 loops=1)
-> Seq Scan on categories (cost=0.00..1.22 rows=22 width=24) (actual time=0.009..0.082 rows=22 loops=1)
-> Hash (cost=1.05..1.05 rows=5 width=21) (actual time=0.043..0.043 rows=0 loops=1)
-> Seq Scan on binding (cost=0.00..1.05 rows=5 width=21) (actual time=0.014..0.031 rows=5 loops=1)
-> Hash (cost=5.30..5.30 rows=2 width=4) (actual time=0.070..0.070 rows=0 loops=1)
-> Index Scan using orders_and_books_pkey on orders_and_books (cost=0.00..5.30 rows=2 width=4) (actual time=0.057..0.062 rows=1 loops=1)
Index Cond: (order_id = 753)
Total runtime: 610.245 ms
(35 rows)

Thanks,
Shane

--
Shane Wegner
http://www.cm.nu/~shane/

#2Manfred Koizar
mkoi-pg@aon.at
In reply to: Shane Wegner (#1)
Re: Index not being used

On Tue, 15 Jun 2004 13:06:40 -0700, Shane Wegner
<shane-keyword-pgsql.a1e0d9@cm.nu> wrote:

The index I created reads:
create index books_idx1 on books(publisher_id,place_id,illustrator_id,
edition_id,type_id,category_id,binding_id,id);

This index is useless, drop it. Is there an index on books(id)?

The other ids in the joining tables are all serial values
and are primary keys so are indexed automatically.

explain analyze output: [lots of seq scans and hash joins]

Try

EXPLAIN ANALYSE
SELECT *
FROM orders_and_books AS o
INNER JOIN books AS b ON o.book_id = b.id
WHERE o.order_id = 753;

This should give a nested loop using primary key index scans on both
tables. Then add

LEFT JOIN publishers ON publisher_id=publishers.id
LEFT JOIN places ON place_id=places.id
...

one by one until the plan changes to hash joins again and show us the
results.

Servus
Manfred

#3Shane Wegner
shane-keyword-pgsql.a1e0d9@cm.nu
In reply to: Manfred Koizar (#2)
Re: Index not being used

On Thu, Jun 17, 2004 at 06:38:08AM +0200, Manfred Koizar wrote:

This index is useless, drop it. Is there an index on books(id)?

Yes it's a primary key.

Try
EXPLAIN ANALYSE
SELECT *
FROM orders_and_books AS o
INNER JOIN books AS b ON o.book_id = b.id
WHERE o.order_id = 753;

This should give a nested loop using primary key index scans on both
tables. Then add

LEFT JOIN publishers ON publisher_id=publishers.id
LEFT JOIN places ON place_id=places.id
...

one by one until the plan changes to hash joins again and show us the
results.

The inner join really speeds up the script. From it's
current 11 second runtime to 0.3 seconds. I'm curious for
future optimization efforts as to why this is the case.
That is why an inner join as is shown in your query yeilds
such a more efficient query than a regular join with a
where clause. Looks like I have some reading to do.

Thanks for your help with this. Very dramatic improvement.

S

--
Shane Wegner
http://www.cm.nu/~shane/