Understanding INNER JOIN versus IN subquery

Started by Robert Jamesover 16 years ago3 messagesgeneral
Jump to latest
#1Robert James
srobertjames@gmail.com

I have two queries which should be equivalent. The Planner plans them
differently, although they are both about the same time. Can someone
explain why?
select word from dict
where
word in
(select substr('moon', 0, generate_series(3,length('moon'))))

select * from dict
inner join (select substr('moon', 0, generate_series(3,length('moon')))) as
m
on dict.word = m.substr

Is one preferred?

#2Bruce Momjian
bruce@momjian.us
In reply to: Robert James (#1)
Re: Understanding INNER JOIN versus IN subquery

On Mon, Jul 20, 2009 at 2:37 PM, Robert James<srobertjames@gmail.com> wrote:

I have two queries which should be equivalent.  The Planner plans them
differently, although they are both about the same time.  Can someone
explain why?

Uhm, please post the two plans and the server version. I know you've
posted them before but that was in another thread that I don't have
handy.

The short answer is that the server doesn't know that the subquery is
definitely going to produce distinct results so it doesn't know the
inner join won't produce duplicates. So it can't tell that these
queries might be equivalent.

--
greg
http://mit.edu/~gsstark/resume.pdf

#3Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Robert James (#1)
Re: Understanding INNER JOIN versus IN subquery

On Mon, Jul 20, 2009 at 2:37 PM, Robert James<srobertjames@gmail.com> wrote:

I have two queries which should be equivalent.  The Planner plans them
differently, although they are both about the same time.  Can someone
explain why?
select word from dict
where
 word in
 (select substr('moon', 0, generate_series(3,length('moon'))))

select * from dict
inner join (select substr('moon', 0, generate_series(3,length('moon')))) as
m
on dict.word = m.substr

Is one preferred?

it is hard to say from your example. But in my general expierence, I
noticed similar thing many times.
Needless to say, I prefer personally to choose JOINs instead of IN(),
because those tend to be faster on postgresql.