Using the IN clauise

Started by Terry Fielderover 23 years ago4 messagesgeneral
Jump to latest
#1Terry Fielder
terry@greatgulfhomes.com

I thought I read somewhere that it was better to avoid using the IN clause
(at least when that means doing a subselect) for efficiency reasons, but I
cannot find it on the website now.

Does anyone know where that is?

And can someone confirm the following:

This:
SELECT t1.f1
FROM t1
WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1 AND t2.f2 =
'v1')

is generally slower to run then:
SELECT t1.f1
FROM t1, t2
WHERE t1.f1 = t2.f1
AND t2.f3 = 'v1'

Thanks

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

#2Terry Fielder
terry@ashtonwoodshomes.com
In reply to: Terry Fielder (#1)
Re: Using the IN clauise

My post didn't go through the first time, so here goes again:

Show quoted text

I thought I read somewhere that it was better to avoid using
the IN clause (at least when that means doing a subselect)
for efficiency reasons, but I cannot find it on the website now.

Does anyone know where that is?

And can someone confirm the following:

This:
SELECT t1.f1
FROM t1
WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1
AND t2.f2 = 'v1')

is generally slower to run then:
SELECT t1.f1
FROM t1, t2
WHERE t1.f1 = t2.f1
AND t2.f3 = 'v1'

Thanks

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

#3Robert Treat
xzilla@users.sourceforge.net
In reply to: Terry Fielder (#2)
Re: Using the IN clauise

On Fri, 2002-10-25 at 14:40, terry@ashtonwoodshomes.com wrote:

My post didn't go through the first time, so here goes again:

I thought I read somewhere that it was better to avoid using
the IN clause (at least when that means doing a subselect)
for efficiency reasons, but I cannot find it on the website now.

Does anyone know where that is?

Do a search on the interactive docs for IN or EXISTS, I think it's
mentioned there. Either way it is generally recommended to use the
EXISTS grammer rather than the IN for performance reasons. When in
doubt, explain analyze is your friend.

And can someone confirm the following:

This:
SELECT t1.f1
FROM t1
WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1
AND t2.f2 = 'v1')

is generally slower to run then:
SELECT t1.f1
FROM t1, t2
WHERE t1.f1 = t2.f1
AND t2.f3 = 'v1'

I would believe it to be true, though depending on your table / data
structure I can't say 100% percent true. Try running explain analyze on
both queries and see what you return.

Thanks

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

Robert Treat

#4Terry Fielder
terry@greatgulfhomes.com
In reply to: Robert Treat (#3)
Re: Using the IN clauise

I could not find any doc on this, so:

In case anyone is interested, I did a test, and here are my results:
I created the tables like this:
<cfloop from="1" to="1000000" index="variables.idx">
<cfquery name="test_insert" datasource="PostgreSQL">
INSERT INTO test_in_1 (f1, v1) VALUES ('#variables.idx#',
'#Chr(RandRange(65, 90))#')
</cfquery>
<cfquery name="test_insert" datasource="PostgreSQL">
INSERT INTO test_in_2 (f1, v2) VALUES ('#variables.idx#',
'#Chr(RandRange(65, 90))#')
</cfquery>
</cfloop>

And when I ran the queries (twice to ensure that the first one wasn't worse
because of disk being cached for the second query to run on)

test_in (Records=501, Time=6279ms)
SQL =
SELECT test_in_1.v1
FROM test_in_1
WHERE test_in_1.f1 IN (SELECT test_in_2.f1 FROM test_in_2 WHERE
test_in_2.v2 = 'A');

test_in (Records=501, Time=113ms)
SQL =
SELECT test_in_1.v1
FROM test_in_1, test_in_2
WHERE test_in_1.f1 = test_in_2.f1
AND test_in_2.v2 = 'A'

test_in (Records=501, Time=5739ms)
SQL =
SELECT test_in_1.v1
FROM test_in_1
WHERE test_in_1.f1 IN (SELECT test_in_2.f1 FROM test_in_2 WHERE
test_in_2.v2 = 'A');

test_in (Records=501, Time=70ms)
SQL =
SELECT test_in_1.v1
FROM test_in_1, test_in_2
WHERE test_in_1.f1 = test_in_2.f1
AND test_in_2.v2 = 'A'

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

Show quoted text

-----Original Message-----
From: Robert Treat [mailto:xzilla@users.sourceforge.net]
Sent: Friday, October 25, 2002 2:52 PM
To: terry@ashtonwoodshomes.com
Cc: Postgres (E-mail)
Subject: Re: [GENERAL] Using the IN clauise

On Fri, 2002-10-25 at 14:40, terry@ashtonwoodshomes.com wrote:

My post didn't go through the first time, so here goes again:

I thought I read somewhere that it was better to avoid using
the IN clause (at least when that means doing a subselect)
for efficiency reasons, but I cannot find it on the website now.

Does anyone know where that is?

Do a search on the interactive docs for IN or EXISTS, I think it's
mentioned there. Either way it is generally recommended to use the
EXISTS grammer rather than the IN for performance reasons. When in
doubt, explain analyze is your friend.

And can someone confirm the following:

This:
SELECT t1.f1
FROM t1
WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1
AND t2.f2 = 'v1')

is generally slower to run then:
SELECT t1.f1
FROM t1, t2
WHERE t1.f1 = t2.f1
AND t2.f3 = 'v1'

I would believe it to be true, though depending on your table / data
structure I can't say 100% percent true. Try running explain
analyze on
both queries and see what you return.

Thanks

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com

Robert Treat