Understanding INNER JOIN versus IN subquery

Started by Robert Jamesabout 17 years ago3 messagesgeneral

srobertjames@gmail.com

about 17 years ago

I have two queries which should be equivalent. The Planner plans them
differently, although they are both about the same time. Can someone
explain why?
select word from dict
where
word in
(select substr('moon', 0, generate_series(3,length('moon'))))

select * from dict
inner join (select substr('moon', 0, generate_series(3,length('moon')))) as
m
on dict.word = m.substr

Is one preferred?

Bruce Momjian

bruce@momjian.us

about 17 years ago

In reply to: Robert James (#1)

Re: Understanding INNER JOIN versus IN subquery

On Mon, Jul 20, 2009 at 2:37 PM, Robert James<srobertjames@gmail.com> wrote:

I have two queries which should be equivalent. The Planner plans them
differently, although they are both about the same time. Can someone
explain why?

Uhm, please post the two plans and the server version. I know you've
posted them before but that was in another thread that I don't have
handy.

The short answer is that the server doesn't know that the subquery is
definitely going to produce distinct results so it doesn't know the
inner join won't produce duplicates. So it can't tell that these
queries might be equivalent.

--
greg
http://mit.edu/~gsstark/resume.pdf

Grzegorz Jaśkiewicz

gryzman@gmail.com

about 17 years ago

In reply to: Robert James (#1)

Re: Understanding INNER JOIN versus IN subquery

On Mon, Jul 20, 2009 at 2:37 PM, Robert James<srobertjames@gmail.com> wrote:

I have two queries which should be equivalent. The Planner plans them
differently, although they are both about the same time. Can someone
explain why?
select word from dict
where
word in
(select substr('moon', 0, generate_series(3,length('moon'))))

select * from dict
inner join (select substr('moon', 0, generate_series(3,length('moon')))) as
m
on dict.word = m.substr

Is one preferred?

it is hard to say from your example. But in my general expierence, I
noticed similar thing many times.
Needless to say, I prefer personally to choose JOINs instead of IN(),
because those tend to be faster on postgresql.