Does PostgreSQL support EXISTS?

Started by Raymond Chuialmost 25 years ago26 messagesgeneral
Jump to latest
#1Raymond Chui
raymond.chui@noaa.gov

The Subject says its all.

--Raymond

#2Nils Zonneveld
nils@mbit.nl
In reply to: Raymond Chui (#1)
Re: Does PostgreSQL support EXISTS?

Raymond Chui wrote:

The Subject says its all.

Yes 'exists' works (though I never understood the advantage to the 'in' operator).

Nils

#3Martijn van Oosterhout
kleptog@svana.org
In reply to: Nils Zonneveld (#2)
Re: Re: Does PostgreSQL support EXISTS?

On Wed, Jun 06, 2001 at 10:44:04PM +0200, Nils Zonneveld wrote:

Raymond Chui wrote:

The Subject says its all.

Yes 'exists' works (though I never understood the advantage to the 'in' operator).

On postgres at least, exists is faster than in.

They are equivalent though.

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)
select x from a, b where a.v = b.v

are all the same. Postgres doesn't quite know that yet though.
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/

#4KuroiNeko
evpopkov@carrier.kiev.ua
In reply to: Raymond Chui (#1)
Re: Re: Does PostgreSQL support EXISTS?

Yes 'exists' works (though I never understood the advantage to the 'in'

operator).

Each has its advantages and weak points. Eg, in some DBMS, number of
entries of `in ( A1, ... AN )' has a pretty tight limit. Also, in PGSQL 7.0
having even fewer entries may cause `Tuple too big' error. Solution is to
replace explicit enumeration with comparisons. And so on.
`exists' is more `relational pure' on one hand, but is probably more
resource consuming.

--

������������������������������������

#5Michael Meskes
meskes@postgresql.org
In reply to: Martijn van Oosterhout (#3)
Re: Re: Does PostgreSQL support EXISTS?

On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

#6Martijn van Oosterhout
kleptog@svana.org
In reply to: Michael Meskes (#5)
Re: Re: Does PostgreSQL support EXISTS?

On Wed, Jun 13, 2001 at 09:21:02AM +0200, Michael Meskes wrote:

On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

Huh? Since they do the same thing should they (in theory) run in the same
time.

Now, I can imagine most DBMSs would optimes the latter better than the
forward, but that doesn't change the theory. In theory someone should be
able to program postgres' rule rewrite to rewrite the former to the latter
and then they would execute identically.
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/

#7Erick Papadakis
erick_papadakis@yahoo.com
In reply to: Martijn van Oosterhout (#3)
RE: Re: Does PostgreSQL support EXISTS?

| On postgres at least, exists is faster than in.
|
| They are equivalent though.
|
| select x from a where v in (select v from b)
| select x from a where exists (select 1 from b where a.v = b.v)
| select x from a, b where a.v = b.v
|
| are all the same. Postgres doesn't quite know that yet though.

If you do an EXPLAIN on Oracle with the first 2 queries above, you will
notice that the second one is faster in Oracle too (i.e., EXISTS is faster
than IN).

Cheers
Shanx

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.252 / Virus Database: 125 - Release Date: 09-May-01

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#3)
Re: Re: Does PostgreSQL support EXISTS?

Martijn van Oosterhout <kleptog@svana.org> writes:

On postgres at least, exists is faster than in.
They are equivalent though.

Not really.

For one thing, IN can return a NULL (don't know) result; EXISTS cannot.

regression=# select * from foo;
f1
----
1

(2 rows)

regression=# select 2 in (select f1 from foo);
?column?
----------

(1 row)

regression=# select exists (select 1 from foo where f1 = 2);
?column?
----------
f
(1 row)

Yes, this behavior is spec-compliant. Think: we don't know what the NULL
represents, therefore we don't know whether 2 is in the column or not,
therefore IN should return NULL.

regards, tom lane

#9Bruce Momjian
bruce@momjian.us
In reply to: Michael Meskes (#5)
Re: Re: Does PostgreSQL support EXISTS?

On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

That surprises me because the subquery is a correlated subquery which
are usually slower on other databases that normal subqueries.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Bruce Momjian
bruce@momjian.us
In reply to: Erick Papadakis (#7)
Re: Re: Does PostgreSQL support EXISTS?

| On postgres at least, exists is faster than in.
|
| They are equivalent though.
|
| select x from a where v in (select v from b)
| select x from a where exists (select 1 from b where a.v = b.v)
| select x from a, b where a.v = b.v
|
| are all the same. Postgres doesn't quite know that yet though.

If you do an EXPLAIN on Oracle with the first 2 queries above, you will
notice that the second one is faster in Oracle too (i.e., EXISTS is faster
than IN).

So it is confirmed. Interesting.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#11Michael Meskes
meskes@postgresql.org
In reply to: Bruce Momjian (#9)
Re: Re: Does PostgreSQL support EXISTS?

On Wed, Jun 13, 2001 at 10:03:24AM -0400, Bruce Momjian wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

That surprises me because the subquery is a correlated subquery which
are usually slower on other databases that normal subqueries.

To be honest I didn't notice that. :-)

I was just talking about the difference with IN (where you have to compute
the complete result set) and EXISTS where you just look for one match.

Michael

--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

#12Bruce Momjian
bruce@momjian.us
In reply to: Michael Meskes (#11)
Re: Re: Does PostgreSQL support EXISTS?

On Wed, Jun 13, 2001 at 10:03:24AM -0400, Bruce Momjian wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

That surprises me because the subquery is a correlated subquery which
are usually slower on other databases that normal subqueries.

To be honest I didn't notice that. :-)

I was just talking about the difference with IN (where you have to compute
the complete result set) and EXISTS where you just look for one match.

When you use IN and a subquery you _sometimes_ have to execute the
subquery for every row of the outer query. Ouch! It can be optimized
to run the subquery non-correlated and join the correlated values to the
outer query for specific rows. The trick is knowing when the subquery
is going to be run many times and when the subquery is going to be run
for only a few rows so the optimization is not used.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

That surprises me because the subquery is a correlated subquery which
are usually slower on other databases that normal subqueries.

However, the second form is easily able to make use of an index on b.v,
whereas the first form is impossible to optimize unless you are able to
rewrite it into some weird form of JOIN.

BTW, I just realized that the "weird form of JOIN" would have to be
much stranger than I previously thought. The result of IN depends not
only on whether the subselect's output has any matches to the current
test value, but also on whether the subselect's output has any NULLs.
So it's not simply a matter of doing a join with a special rule about
producing no more than one output tuple per outer-query tuple. How
would you check for the NULLs?

regards, tom lane

#14Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#13)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

That surprises me because the subquery is a correlated subquery which
are usually slower on other databases that normal subqueries.

However, the second form is easily able to make use of an index on b.v,
whereas the first form is impossible to optimize unless you are able to
rewrite it into some weird form of JOIN.

Assuming the index is there, yes, but again, do we use the index or run
the query. This has the same dynamics as index/heap scan and is
probably more complicated to figure out accurately.

BTW, I just realized that the "weird form of JOIN" would have to be
much stranger than I previously thought. The result of IN depends not
only on whether the subselect's output has any matches to the current
test value, but also on whether the subselect's output has any NULLs.
So it's not simply a matter of doing a join with a special rule about
producing no more than one output tuple per outer-query tuple. How
would you check for the NULLs?

I thought NOT IN was the only one that was concerned about any NULL?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#14)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I thought NOT IN was the only one that was concerned about any NULL?

No, they both are: in the presence of NULLs, IN can return TRUE or NULL,
NOT IN can return FALSE or NULL.

The reason the FAQ is always about NOT NULL is that WHERE treats NULL as
FALSE, so the average newbie writing an IN doesn't even realize he's
getting a NULL rather than a FALSE. With NOT NULL, he can't ignore it.

regards, tom lane

#16Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#15)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I thought NOT IN was the only one that was concerned about any NULL?

No, they both are: in the presence of NULLs, IN can return TRUE or NULL,
NOT IN can return FALSE or NULL.

The reason the FAQ is always about NOT NULL is that WHERE treats NULL as
FALSE, so the average newbie writing an IN doesn't even realize he's
getting a NULL rather than a FALSE. With NOT NULL, he can't ignore it.

Got it. How does an IN subquery returning NULL behave differently from
one returning FALSE? I can't think of a test that would be affected.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#16)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Got it. How does an IN subquery returning NULL behave differently from
one returning FALSE? I can't think of a test that would be affected.

After we fix IS TRUE and friends to respond to nulls correctly (Conway's
promised to do that, IIRC) it'll be possible to write

(foo IN (SELECT ...)) IS NOT FALSE

and get the "intuitive" behavior. But right now that doesn't work.

Hm. Maybe we could recognize that construct as a whole, and translate
it to an optimizable join? It'd become the usual locution, I imagine.

regards, tom lane

#18Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#17)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Got it. How does an IN subquery returning NULL behave differently from
one returning FALSE? I can't think of a test that would be affected.

After we fix IS TRUE and friends to respond to nulls correctly (Conway's
promised to do that, IIRC) it'll be possible to write

(foo IN (SELECT ...)) IS NOT FALSE

and get the "intuitive" behavior. But right now that doesn't work.

OK, so I wasn't missing anything in our current code. I can see how
this capability would change things.

Hm. Maybe we could recognize that construct as a whole, and translate
it to an optimizable join? It'd become the usual locution, I imagine.

Are we anywhere with optimizing IN to EXISTS? I didn't think there was
any work being done in that area.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#18)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Are we anywhere with optimizing IN to EXISTS? I didn't think there was
any work being done in that area.

I don't think it's that hard to do within the outer-join mechanism,
except for this little issue about NULLs. The planner already makes
querytree alterations of comparable complexity, to deal with views and
subselects...

regards, tom lane

#20Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#19)
Re: Re: Does PostgreSQL support EXISTS?

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Are we anywhere with optimizing IN to EXISTS? I didn't think there was
any work being done in that area.

I don't think it's that hard to do within the outer-join mechanism,
except for this little issue about NULLs. The planner already makes
querytree alterations of comparable complexity, to deal with views and
subselects...

Doing that would elimate the need for query sources, at least for a
while.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#22)
#24Dave Cramer
dave@ebox.com
In reply to: Bruce Momjian (#16)
#25Joe Conway
mail@joeconway.com
In reply to: Bruce Momjian (#16)
#26Alex Pilosov
alex@pilosoft.com
In reply to: Dave Cramer (#24)