Slow left outer join

Started by Alexander Stauboover 19 years ago4 messagesgeneral
Jump to latest
#1Alexander Staubo
alex@purefiction.net

I have thee primary tables: comments, events and users. There's a
join table events_comments tying comments to events. There are B-tree
indexes on every column involved, and the tables are freshly vacuumed
and analyzed;

A simple select to retrieve all comments for a given event, with an
outer join to retrieve creators:

select comments.*, users.*
from comments
left outer join users on
users.id = comments.creator_id
inner join events_comments on
comments.id = events_comments.comment_id
and events_comments.event_id = 9244

...uses the following execution plan:

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Hash Join (cost=1138.11..29763.30 rows=9 width=805) (actual
time=1002.348..1493.016 rows=3 loops=1)
Hash Cond: ("outer".id = "inner".comment_id)
-> Hash Left Join (cost=1119.99..28858.50 rows=177318
width=805) (actual time=28.919..1440.155 rows=177448 loops=1)
Hash Cond: ("outer".creator_id = "inner".id)
-> Seq Scan on comments (cost=0.00..9230.18 rows=177318
width=325) (actual time=0.005..117.746 rows=177448 loops=1)
-> Hash (cost=531.19..531.19 rows=9119 width=480) (actual
time=28.883..28.883 rows=9119 loops=1)
-> Seq Scan on users (cost=0.00..531.19 rows=9119
width=480) (actual time=0.003..6.555 rows=9119 loops=1)
-> Hash (cost=18.10..18.10 rows=9 width=4) (actual
time=0.050..0.050 rows=3 loops=1)
-> Index Scan using events_comments_event_id_index on
events_comments (cost=0.00..18.10 rows=9 width=4) (actual
time=0.028..0.038 rows=3 loops=1)
Index Cond: (event_id = 9244)
Total runtime: 1493.565 ms

In this case there are three comments. Every comment has a creator
user here, so an inner join will produce the same results, but the
execution plan is considerably more efficient:

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Nested Loop (cost=0.00..104.52 rows=8 width=805) (actual
time=0.059..0.117 rows=3 loops=1)
-> Nested Loop (cost=0.00..50.75 rows=9 width=325) (actual
time=0.045..0.078 rows=3 loops=1)
-> Index Scan using events_comments_event_id_index on
events_comments (cost=0.00..18.10 rows=9 width=4) (actual
time=0.026..0.033 rows=3 loops=1)
Index Cond: (event_id = 9244)
-> Index Scan using comments_pkey on comments
(cost=0.00..3.62 rows=1 width=325) (actual time=0.011..0.012 rows=1
loops=3)
Index Cond: (comments.id = "outer".comment_id)
-> Index Scan using users_pkey on users (cost=0.00..5.96 rows=1
width=480) (actual time=0.009..0.010 rows=1 loops=3)
Index Cond: (users.id = "outer".creator_id)
Total runtime: 0.354 ms

The outer join seems unnecessarily slow. Is there anything I can do
to speed it up?

PostgreSQL 8.1.3 on OS X (MacPorts).

Alexander.

#2Richard Broersma Jr
rabroersma@yahoo.com
In reply to: Alexander Staubo (#1)
Re: Slow left outer join

The outer join seems unnecessarily slow. Is there anything I can do
to speed it up?

PostgreSQL 8.1.3 on OS X (MacPorts).

This is just a guess on my part, but would it help if you preformed the inner join first to filter
out the most of the unwated rows that then preform the outer join with what is left over?

But just double check that results are as you expect.

Regards,

Richard Broersma Jr.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Staubo (#1)
Re: Slow left outer join

Alexander Staubo <alex@purefiction.net> writes:

select comments.*, users.*
from comments
left outer join users on
users.id = comments.creator_id
inner join events_comments on
comments.id = events_comments.comment_id
and events_comments.event_id = 9244

The outer join seems unnecessarily slow.

Try flipping the order of the two joins. PG 8.2 will be smart enough to
do that for itself, but no existing release understands when it's safe
to rearrange outer-join order.

regards, tom lane

#4Alexander Staubo
alex@purefiction.net
In reply to: Tom Lane (#3)
Re: Slow left outer join

On Nov 27, 2006, at 02:45 , Tom Lane wrote:

Alexander Staubo <alex@purefiction.net> writes:

select comments.*, users.*
from comments
left outer join users on
users.id = comments.creator_id
inner join events_comments on
comments.id = events_comments.comment_id
and events_comments.event_id = 9244

The outer join seems unnecessarily slow.

Try flipping the order of the two joins. PG 8.2 will be smart
enough to
do that for itself, but no existing release understands when it's safe
to rearrange outer-join order.

That fixes it. Thanks.

Alexander.