Negating the list of selected rows of a join
Hello,
I want to list the rows of a table with a text field whose values do not
exist in a similar field of another table. Basically what I want to get
is negated results of a join.
Lets say the tables table_a and table_b have the field name.
table_a table_b
name age name
----- --- -----
Peter 27 Paul
Paul 42
Mary 20
If I asked for a join like this:
SELECT table_a.name,table_a.age FROM table_a,table_b WHERE table_a.name=table_b.name
I would get:
name age
----- ---
Paul 42
But I want the opposite. I tried a non-equi join like this:
SELECT table_a.name,table_a.age FROM table_a,table_b WHERE table_a.name<>table_b.name
and I got:
name age
----- ---
Peter 27
Mary 20
It worked except for the case when table_b is empty. In this case the
nothing was returned. Is this the expected behaviour or is it a bug in
PostgreSQL?
How can I make a query that works the way I want all the time, even for the
case when table_b is empty?
Regards,
Manuel Lemos
E-mail: mlemos@acm.org
URL: http://www.e-na.net/the_author.html
PGP key: finger://mlemos@zeus.ci.ua.pt
--
Manuel Lemos wrote:
How can I make a query that works the way I want all the time, even for the
case when table_b is empty?
SELECT table_a.name, table_a.age
FROM table_a
WHERE NOT EXISTS (
SELECT 'x'
FROM table_b
WHERE table_b.name = table_a.name
);
Hope this will do the trick.
:) Clark Evans
Manuel Lemos <mlemos@acm.org> wrote:
I want to list the rows of a table with a text field whose values do not
exist in a similar field of another table. Basically what I want to get
is negated results of a join. [...]
It worked except for the case when table_b is empty. In this case the
nothing was returned. Is this the expected behaviour or is it a bug in
PostgreSQL?
If you list two (or more) tables in the 'from' clause of a select
(that is, if you do a 'join'), a result table is built, in which each
row of the first table is combined with each row from (all) the other
table(s). To clarify, do simply
SELECT table_a.name,table_b.name FROM table_a,table_b;
on your table. When one of the tables has no rows, all the rows from
the other(s) are combined with *nothing*; this gives nothing!
('combined' may be the wrong word; it's like a multiplication, and
people speak of a 'Cartesian product' of the tables)
The 'where' clause can restrict the rows of the result table to
something useful, e.g., you can restrict to 'table_a.name =
table_b.name'. A feature that probably will help you is the
construction of a so-called 'sub-select' in the where clause:
SELECT name FROM table_a
WHERE name NOT IN (SELECT name FROM table_b);
Hope it helps!
Ulf
--
======================================================================
Ulf Mehlig <umehlig@zmt.uni-bremen.de>
Center for Tropical Marine Ecology/ZMT, Bremen, Germany
----------------------------------------------------------------------
Clark Evans <clark.evans@manhattanproject.com>:
Manuel Lemos wrote:
How can I make a query that works the way I want all the time, even for the
case when table_b is empty?SELECT table_a.name, table_a.age
FROM table_a
WHERE NOT EXISTS (
SELECT 'x'
FROM table_b
WHERE table_b.name = table_a.name
);Hope this will do the trick.
Maybe not -- doesn't that mean, that the query won't return a single
row in case there is *any* pair of equal names in both tables?!
Have a nice (sun)day,
Ulf
--
======================================================================
Ulf Mehlig <umehlig@zmt.uni-bremen.de>
Center for Tropical Marine Ecology/ZMT, Bremen, Germany
----------------------------------------------------------------------
Manuel Lemos wrote:
I want to list the rows of a table with a text field whose values do not
exist in a similar field of another table. Basically what I want to get
is negated results of a join.
Ulf Mehlig wrote:
SELECT name FROM table_a
WHERE name NOT IN (SELECT name FROM table_b);
Clark Evans wrote:
SELECT table_a.name, table_a.age
FROM table_a
WHERE NOT EXISTS (
SELECT 'x'
FROM table_b
WHERE table_b.name = table_a.name
);
I'm not sure about how well PostgreSQL handles
these two. I'd try them both with your data set.
If table_b is small (less than a few thousand rows)
then Ulf's approach would work best. However,
if table_b is large (more than a thousand)
then I think the other approach may work better
if table_b.name is indexed.
Clark
Clark Evans <clark.evans@manhattanproject.com>/Ulf Mehlig
<umehlig@pandora3.uni-bremen.de> wrote:
Hope this will do the trick.
Maybe not -- doesn't that mean, that the query won't return a
single row in case there is *any* pair of equal names in both
tables?!No. It's a correlated sub-query. It's actually much more efficient
with large tables than the other approach (which has to put
the entire result set of table_b on the heap before it can
process table_a). Your approach is, however, much better (by a
large factor) if table_a is very large and table_b is very small,
since you don't have to hit the index for table_b on every
row of table_a...
Yes, it *will* do the trick!! Sorry, Clark, I misinterpreted your
sub-query ... have to read more carefully ...
Thanks for your correction!
Ulf
--
======================================================================
Ulf Mehlig <umehlig@zmt.uni-bremen.de>
Center for Tropical Marine Ecology/ZMT, Bremen, Germany
----------------------------------------------------------------------
Import Notes
Reply to msg id not found: umehlig@pandora3.uni-bremen.de'smessageof14Mar1999091720+0100