IN optimization in 7.2 ?????

Started by hubert depesz lubaczewskiabout 24 years ago6 messagesgeneral
Jump to latest

hi
was there optimization for IN (SELECT ... ) usage in 7.2?
we just got query which runs 7 times as fast with in than with exists !
the query is like select field from table where id in (select ... where
fieldx in (... IN (...)));

depesz

--
hubert depesz lubaczewski http://www.depesz.pl/
------------------------------------------------------------------------
... vows are spoken to be broken ... [enjoy the silence]
... words are meaningless and forgettable ... [depeche mode]

#2Bruce Momjian
bruce@momjian.us
In reply to: hubert depesz lubaczewski (#1)
Re: IN optimization in 7.2 ?????

hubert depesz lubaczewski wrote:

hi
was there optimization for IN (SELECT ... ) usage in 7.2?
we just got query which runs 7 times as fast with in than with exists !
the query is like select field from table where id in (select ... where
fieldx in (... IN (...)));

Not that I know of.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
In reply to: Bruce Momjian (#2)
Re: IN optimization in 7.2 ?????

On Mon, Feb 18, 2002 at 09:52:19AM -0500, Bruce Momjian wrote:

Not that I know of.

strange. what could lead to this results then?
i used to think that IN (SELECT ...) is the slowest possible way at all.

depesz

p.s. of course both select's use indices, and table is vacuumed

--
hubert depesz lubaczewski http://www.depesz.pl/
------------------------------------------------------------------------
... vows are spoken to be broken ... [enjoy the silence]
... words are meaningless and forgettable ... [depeche mode]

#4Bruce Momjian
bruce@momjian.us
In reply to: hubert depesz lubaczewski (#3)
Re: IN optimization in 7.2 ?????

hubert depesz lubaczewski wrote:

On Mon, Feb 18, 2002 at 09:52:19AM -0500, Bruce Momjian wrote:

Not that I know of.

strange. what could lead to this results then?
i used to think that IN (SELECT ...) is the slowest possible way at all.

depesz

p.s. of course both select's use indices, and table is vacuumed

I have always wondered this too. Seems IN evaluates the entire query
while EXISTS evaluates it for each row, or at least that is how I
understand it, so saying EXISTS is always faster may be wrong.
Comments?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Mario Weilguni
mweilguni@sime.com
In reply to: Bruce Momjian (#2)
Re: IN optimization in 7.2 ?????

hi,

reading this I remembered that I had a performance problem with "IN" too,
but not with a subselect but a list of values.

I had (computed) queries like:
....
and xyz in (1,3,7,234......)
....
with up to 20 such numbers, and found out the queries to be much faster when
I wrote it this way:
....
and xyz >= 1
and xyz <= 234
and xyz in (1,3,7,234......)
....
where 1 is the minimum and 234 is the maximum of all values. The query plan
was much much better in the latter case. Don't know if something minor is
worth optimizing, but I just want let you know.

Best regards,
Mario Weilguni
----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: <depesz@depesz.pl>
Cc: <pgsql-general@postgresql.org>
Sent: Monday, February 18, 2002 3:52 PM
Subject: Re: [GENERAL] IN optimization in 7.2 ?????

Show quoted text

hubert depesz lubaczewski wrote:

hi
was there optimization for IN (SELECT ... ) usage in 7.2?
we just got query which runs 7 times as fast with in than with exists !
the query is like select field from table where id in (select ... where
fieldx in (... IN (...)));

Not that I know of.

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#6Shane Wright
me@shanewright.co.uk
In reply to: Bruce Momjian (#4)
Re: IN optimization in 7.2 ?????

IIRC, the subselect is executed only once, but using IN (...) can be slow for
large result sets because the searching in IN is just a sequential scan of
that result set.

Not sure why it'd be faster to crop of the first and last ones though..

--
Shane

Show quoted text

On Monday 18 Feb 2002 3:42 pm, Bruce Momjian wrote:

hubert depesz lubaczewski wrote:

On Mon, Feb 18, 2002 at 09:52:19AM -0500, Bruce Momjian wrote:

Not that I know of.

strange. what could lead to this results then?
i used to think that IN (SELECT ...) is the slowest possible way at all.

depesz

p.s. of course both select's use indices, and table is vacuumed

I have always wondered this too. Seems IN evaluates the entire query
while EXISTS evaluates it for each row, or at least that is how I
understand it, so saying EXISTS is always faster may be wrong.
Comments?