OR or IN ?

Started by Luca Ferrariover 17 years ago10 messagesgeneral
Jump to latest
#1Luca Ferrari
fluca1978@infinito.it

Hi all,
I've got a query with a long (>50) list of ORs, like the following:

SELECT colB, colC FROM table WHERE colA=X OR colA=Y OR colA=Z OR ....

Is there any difference in how postgresql manages the above query and the
following one?

SELECT colB, colC FROM table WHERE colA IN (X,Y,Z,...)

Which is the suggested index to use on colA to get better performances?

Thanks,
Luca

#2A. Kretschmer
andreas.kretschmer@schollglas.com
In reply to: Luca Ferrari (#1)
Re: OR or IN ?

am Tue, dem 14.10.2008, um 8:33:21 +0200 mailte Luca Ferrari folgendes:

Hi all,
I've got a query with a long (>50) list of ORs, like the following:

SELECT colB, colC FROM table WHERE colA=X OR colA=Y OR colA=Z OR ....

Is there any difference in how postgresql manages the above query and the
following one?

SELECT colB, colC FROM table WHERE colA IN (X,Y,Z,...)

Depends on the version: (same table foo)

8.1:
test=*# explain select * from foo where a in (1,2,3);
QUERY PLAN
-----------------------------------------------------
Seq Scan on foo (cost=0.00..47.45 rows=32 width=4)
Filter: ((a = 1) OR (a = 2) OR (a = 3))

8.4devel:
test=# explain select * from foo where a in (1,2,3);
QUERY PLAN
-----------------------------------------------------
Seq Scan on foo (cost=0.00..43.00 rows=36 width=4)
Filter: (a = ANY ('{1,2,3}'::integer[]))

As you can see, 8.1 rewrite the query to many ORs.

Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net

#3Harald Fuchs
hari.fuchs@gmail.com
In reply to: Luca Ferrari (#1)
Re: OR or IN ?

In article <20081014064831.GB22137@a-kretschmer.de>,
"A. Kretschmer" <andreas.kretschmer@schollglas.com> writes:

am Tue, dem 14.10.2008, um 8:33:21 +0200 mailte Luca Ferrari folgendes:

Hi all,
I've got a query with a long (>50) list of ORs, like the following:

SELECT colB, colC FROM table WHERE colA=X OR colA=Y OR colA=Z OR ....

Is there any difference in how postgresql manages the above query and the
following one?

SELECT colB, colC FROM table WHERE colA IN (X,Y,Z,...)

Depends on the version: (same table foo)

8.1:
test=*# explain select * from foo where a in (1,2,3);
QUERY PLAN
-----------------------------------------------------
Seq Scan on foo (cost=0.00..47.45 rows=32 width=4)
Filter: ((a = 1) OR (a = 2) OR (a = 3))

8.4devel:
test=# explain select * from foo where a in (1,2,3);
QUERY PLAN
-----------------------------------------------------
Seq Scan on foo (cost=0.00..43.00 rows=36 width=4)
Filter: (a = ANY ('{1,2,3}'::integer[]))

As you can see, 8.1 rewrite the query to many ORs.

I think that "OR or IN" is the wrong question. Where do those >50
values come from? If they come from a DB operation, just include this
operation in your query. Otherwise, you should ask yourself how many
values you might get: >50, >500, >5000? There's a point where it's
probably more efficient to COPY all those values into an indexed
temporary table, ANALYZE it, and JOIN it into your query.

#4Scott Ribe
scott_ribe@killerbytes.com
In reply to: Luca Ferrari (#1)
Re: OR or IN ?

Older versions of PG were inefficient with larger numbers of elements in an
IN query, and in fact would error out with something about lack of stack
space if you used very many (a few hundred IIRC).

8.x something was supposed to have improved that. Using 8.3 recently, after
an "oopsie" with some development data, I inadvertently confirmed that it
works and performance is not too bad with >34,000 items in an IN clause ;-)

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice

#5Luca Ferrari
fluca1978@infinito.it
In reply to: Scott Ribe (#4)
Re: OR or IN ?

On Friday 17 October 2008 Scott Ribe's cat, walking on the keyboard, wrote:

Older versions of PG were inefficient with larger numbers of elements in an
IN query, and in fact would error out with something about lack of stack
space if you used very many (a few hundred IIRC).

8.x something was supposed to have improved that. Using 8.3 recently, after
an "oopsie" with some development data, I inadvertently confirmed that it
works and performance is not too bad with >34,000 items in an IN clause ;-)

Interesting, since my queries sometimes expand to a few thousands of ORs (or
INs), and I had in fact stack problems and I had to expand it to around 20
MB.....

Luca

#6Andrus
kobruleht2@hot.ee
In reply to: Scott Ribe (#4)
Re: OR or IN ?

8.x something was supposed to have improved that.

Not in 8.0
8.0 (and maybe newer releases also) throw dreaded "Stack lenght limit
exceeded"

message if IN contains large number or items.

Andrus.

#7Scott Marlowe
scott.marlowe@gmail.com
In reply to: Andrus (#6)
Re: OR or IN ?

On Fri, Oct 17, 2008 at 8:24 AM, Andrus <kobruleht2@hot.ee> wrote:

8.x something was supposed to have improved that.

Not in 8.0
8.0 (and maybe newer releases also) throw dreaded "Stack lenght limit
exceeded"

message if IN contains large number or items.

Have y'all checked them there release notes? I'm sure they tell the
tale. (sorry too much True Blood...)

#8Andrus
kobruleht2@hot.ee
In reply to: Scott Marlowe (#7)
Re: OR or IN ?

8.0 (and maybe newer releases also) throw dreaded "Stack lenght limit
exceeded"

message if IN contains large number or items.

Have y'all checked them there release notes? I'm sure they tell the
tale. (sorry too much True Blood...)

No. I need to support 8.0.
After those crashes I changed all IN clauses to

docid = ANY ({1,2,3,4,5,7,8,96,5445,.., 34534534})

this takes the same number of characters per id as IN clause.

I hope that this does not have any bad effects.

Andrus.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrus (#8)
Re: OR or IN ?

"Andrus" <kobruleht2@hot.ee> writes:

No. I need to support 8.0.
After those crashes I changed all IN clauses to
docid = ANY ({1,2,3,4,5,7,8,96,5445,.., 34534534})

I'm afraid that hardly counts as "supporting" 8.0 (or 8.1 for that
matter). The syntax may work but it can't be indexed.

regards, tom lane

#10Andrus
kobruleht2@hot.ee
In reply to: Tom Lane (#9)
Re: OR or IN ?

No. I need to support 8.0.
After those crashes I changed all IN clauses to
docid = ANY ({1,2,3,4,5,7,8,96,5445,.., 34534534})

I'm afraid that hardly counts as "supporting" 8.0 (or 8.1 for that
matter). The syntax may work but it can't be indexed.

I'm sorry. It seems that Richard Huxton suggestion in this mailing list

SELECT * FROM generate_series(1,10) s
WHERE s = ANY (string_to_array('1,3,5,7', ',')::int[]);

works in 8.0
I hope that this works as well as new array syntax.

Andrus.