Negating the list of selected rows of a join

Started by Manuel Lemosabout 27 years ago6 messagesgeneral
Jump to latest
#1Manuel Lemos
mlemos@acm.org

Hello,

I want to list the rows of a table with a text field whose values do not
exist in a similar field of another table. Basically what I want to get
is negated results of a join.

Lets say the tables table_a and table_b have the field name.

table_a table_b
name age name
----- --- -----
Peter 27 Paul
Paul 42
Mary 20

If I asked for a join like this:

SELECT table_a.name,table_a.age FROM table_a,table_b WHERE table_a.name=table_b.name

I would get:

name age
----- ---
Paul 42

But I want the opposite. I tried a non-equi join like this:

SELECT table_a.name,table_a.age FROM table_a,table_b WHERE table_a.name<>table_b.name

and I got:

name age
----- ---
Peter 27
Mary 20

It worked except for the case when table_b is empty. In this case the
nothing was returned. Is this the expected behaviour or is it a bug in
PostgreSQL?

How can I make a query that works the way I want all the time, even for the
case when table_b is empty?

Regards,
Manuel Lemos

E-mail: mlemos@acm.org
URL: http://www.e-na.net/the_author.html
PGP key: finger://mlemos@zeus.ci.ua.pt
--

#2Clark Evans
clark.evans@manhattanproject.com
In reply to: Manuel Lemos (#1)
Re: [GENERAL] Negating the list of selected rows of a join

Manuel Lemos wrote:

How can I make a query that works the way I want all the time, even for the
case when table_b is empty?

SELECT table_a.name, table_a.age
FROM table_a
WHERE NOT EXISTS (
SELECT 'x'
FROM table_b
WHERE table_b.name = table_a.name
);

Hope this will do the trick.

:) Clark Evans

#3Ulf Mehlig
umehlig@uni-bremen.de
In reply to: Manuel Lemos (#1)
Re: [GENERAL] Negating the list of selected rows of a join

Manuel Lemos <mlemos@acm.org> wrote:

I want to list the rows of a table with a text field whose values do not
exist in a similar field of another table. Basically what I want to get
is negated results of a join. [...]
It worked except for the case when table_b is empty. In this case the
nothing was returned. Is this the expected behaviour or is it a bug in
PostgreSQL?

If you list two (or more) tables in the 'from' clause of a select
(that is, if you do a 'join'), a result table is built, in which each
row of the first table is combined with each row from (all) the other
table(s). To clarify, do simply

SELECT table_a.name,table_b.name FROM table_a,table_b;

on your table. When one of the tables has no rows, all the rows from
the other(s) are combined with *nothing*; this gives nothing!
('combined' may be the wrong word; it's like a multiplication, and
people speak of a 'Cartesian product' of the tables)

The 'where' clause can restrict the rows of the result table to
something useful, e.g., you can restrict to 'table_a.name =
table_b.name'. A feature that probably will help you is the
construction of a so-called 'sub-select' in the where clause:

SELECT name FROM table_a
WHERE name NOT IN (SELECT name FROM table_b);

Hope it helps!
Ulf

--
======================================================================
Ulf Mehlig <umehlig@zmt.uni-bremen.de>
Center for Tropical Marine Ecology/ZMT, Bremen, Germany
----------------------------------------------------------------------

#4Ulf Mehlig
umehlig@uni-bremen.de
In reply to: Clark Evans (#2)
Re: [GENERAL] Negating the list of selected rows of a join

Clark Evans <clark.evans@manhattanproject.com>:

Manuel Lemos wrote:

How can I make a query that works the way I want all the time, even for the
case when table_b is empty?

SELECT table_a.name, table_a.age
FROM table_a
WHERE NOT EXISTS (
SELECT 'x'
FROM table_b
WHERE table_b.name = table_a.name
);

Hope this will do the trick.

Maybe not -- doesn't that mean, that the query won't return a single
row in case there is *any* pair of equal names in both tables?!

Have a nice (sun)day,
Ulf

--
======================================================================
Ulf Mehlig <umehlig@zmt.uni-bremen.de>
Center for Tropical Marine Ecology/ZMT, Bremen, Germany
----------------------------------------------------------------------

#5Clark Evans
clark.evans@manhattanproject.com
In reply to: Manuel Lemos (#1)
Re: [GENERAL] Negating the list of selected rows of a join

Manuel Lemos wrote:

I want to list the rows of a table with a text field whose values do not
exist in a similar field of another table. Basically what I want to get
is negated results of a join.

Ulf Mehlig wrote:

SELECT name FROM table_a
WHERE name NOT IN (SELECT name FROM table_b);

Clark Evans wrote:

SELECT table_a.name, table_a.age
FROM table_a
WHERE NOT EXISTS (
SELECT 'x'
FROM table_b
WHERE table_b.name = table_a.name
);

I'm not sure about how well PostgreSQL handles
these two. I'd try them both with your data set.
If table_b is small (less than a few thousand rows)
then Ulf's approach would work best. However,
if table_b is large (more than a thousand)
then I think the other approach may work better
if table_b.name is indexed.

Clark

#6Ulf Mehlig
umehlig@uni-bremen.de
In reply to: Manuel Lemos (#1)
Re: [GENERAL] Negating the list of selected rows of a join

Clark Evans <clark.evans@manhattanproject.com>/Ulf Mehlig
<umehlig@pandora3.uni-bremen.de> wrote:

Hope this will do the trick.

Maybe not -- doesn't that mean, that the query won't return a
single row in case there is *any* pair of equal names in both
tables?!

No. It's a correlated sub-query. It's actually much more efficient
with large tables than the other approach (which has to put
the entire result set of table_b on the heap before it can
process table_a). Your approach is, however, much better (by a
large factor) if table_a is very large and table_b is very small,
since you don't have to hit the index for table_b on every
row of table_a...

Yes, it *will* do the trick!! Sorry, Clark, I misinterpreted your
sub-query ... have to read more carefully ...

Thanks for your correction!
Ulf

--
======================================================================
Ulf Mehlig <umehlig@zmt.uni-bremen.de>
Center for Tropical Marine Ecology/ZMT, Bremen, Germany
----------------------------------------------------------------------