Tricky join question
Hi all
I have a join problem that seems to be too difficult for me to solve:
I have:
table person
id integer,
name varchar(32)
data:
1,"Jack"
2,"Jill"
3,"Bob"
table course
id integer,
name varchar(32)
data:
1,"SQL Beginner"
2,"SQL Advanced"
table person_course
person_id number,
course_id number
data:
(currently empty)
Now, I would like to know for every person the courses they have taken.
In mysql, the following statement:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
where p.id = 2 order by 1;
will get me the following result:
+---+-----------------------------------------+----------+
| id| name | person_id|
+---+-----------------------------------------+----------+
| 1 | SQL Beginner | |
| 2 | SQL Advanced | |
+---+-----------------------------------------+----------+
Can I get Postgres to give me the same result somehow? The above
statement will return nothing at all under postgres.
Bye
Tim
am Fri, dem 22.12.2006, um 12:12:06 +0100 mailte Tim Tassonis folgendes:
Hi all
I have a join problem that seems to be too difficult for me to solve:
I have:
table person
id integer,
name varchar(32)data:
1,"Jack"
2,"Jill"
3,"Bob"table course
id integer,
name varchar(32)data:
1,"SQL Beginner"
2,"SQL Advanced"table person_course
person_id number,
course_id numberdata:
First, you should use referential integrity:
test=# create table person(id int primary key, name text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "person_pkey" for table "person"
CREATE TABLE
test=# create table course(id int primary key, name text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "course_pkey" for table "course"
CREATE TABLE
test=# create table person_course(person_id int references person, course_id int references course );
CREATE TABLE
(currently empty)
Okay, i insert some data:
test=# insert into person_course values (1,1);
INSERT 0 1
test=# insert into person_course values (3,1);
INSERT 0 1
test=# insert into person_course values (3,2);
INSERT 0 1
Now, I would like to know for every person the courses they have taken.
Similar to your result:
test=# select b.id, b.name, c.id from person_course a left join course b on a.course_id=b.id left join person c on a.person_id=c.id;
id | name | id
----+--------------+----
1 | SQL Beginner | 1
1 | SQL Beginner | 3
2 | SQL Advanced | 3
(3 rows)
In my opinion better:
test=# select c.id, c.name, b.name from person_course a left join course b on a.course_id=b.id left join person c on a.person_id=c.id;
id | name | name
----+------+--------------
1 | Jack | SQL Beginner
3 | Bob | SQL Beginner
3 | Bob | SQL Advanced
(3 rows)
Please, read more about referential integrity.
Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Hi Andreas
First, you should use referential integrity:
I do, that is not the point. It was a simplified data model. Of course I
have primary keys and stuff, but they don't affect join behaviour at all.
test=# create table person(id int primary key, name text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "person_pkey" for table "person"
CREATE TABLE
test=# create table course(id int primary key, name text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "course_pkey" for table "course"
CREATE TABLE
test=# create table person_course(person_id int references person, course_id int references course );
CREATE TABLE
Well, you shouldn't, thats not my problem.
Now, I would like to know for every person the courses they have taken.
Similar to your result:
test=# select b.id, b.name, c.id from person_course a left join course b on a.course_id=b.id left join person c on a.person_id=c.id;
id | name | id
----+--------------+----
1 | SQL Beginner | 1
1 | SQL Beginner | 3
2 | SQL Advanced | 3
(3 rows)
This is absolutely not what I want. I want a row for every person and
every course, regardless whether the person has taken the course or not.
If the person has not taken the course, I want a null value in the
person id column:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
where p.id = 2 order by 1;
+---+-----------------------------------------+----------+
| id| name | person_id|
+---+-----------------------------------------+----------+
| 1 | SQL Beginner | |
| 2 | SQL Advanced | |
+---+-----------------------------------------+----------+
Note here that I restrict my select to the person with the ID 2. Since
this person has not taken any course, the person_id is null. If I leave
the restriction on the person, I get person times courses rows, the
person_id only filled when a person has actually taken a course.
With the rows you added person_course and without restrictin to a
specific person, the result of your query should be:
+---+-----------------------------------------+------------+
| id| name | person_id |
+---+-----------------------------------------+------------+
| 1 | SQL Beginner | 1 |
| 1 | SQL Beginner | |
| 1 | SQL Beginner | 3 |
| 2 | SQL Advanced | |
| 2 | SQL Advanced | |
| 2 | SQL Advanced | 3 |
+---+-----------------------------------------+------------+
In mysql, you get this with the following clause:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
order by 1;
Please, read more about referential integrity.
Thanks, but I already have read a lot about it 14 years ago.
Bye
Tim
Hi,
Without restriction you're getting:
On Fri, Dec 22, 2006 at 02:55:56PM +0100, Tim Tassonis wrote:
+---+-----------------------------------------+------------+ | id| name | person_id | +---+-----------------------------------------+------------+ | 1 | SQL Beginner | 1 | | 1 | SQL Beginner | | | 1 | SQL Beginner | 3 | | 2 | SQL Advanced | | | 2 | SQL Advanced | | | 2 | SQL Advanced | 3 | +---+-----------------------------------------+------------+
There are no rows in the table with person_id=2, so PostgreSQL is
returning the correct result (no rows). Seems like a bug in MySQL.
In mysql, you get this with the following clause:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
order by 1;
I think what you want is to apply to restriction on person earlier,
maybe:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on (p.id = pc.person_id and p.id = 2)
right outer join course as c on pc.course_id = c.id
order by 1;
Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
From each according to his ability. To each according to his ability to litigate.
Tim Tassonis wrote:
Hi Andreas
First, you should use referential integrity:
I do, that is not the point. It was a simplified data model. Of course I
have primary keys and stuff, but they don't affect join behaviour at all.test=# create table person(id int primary key, name text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"person_pkey" for table "person"
CREATE TABLE
test=# create table course(id int primary key, name text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"course_pkey" for table "course"
CREATE TABLE
test=# create table person_course(person_id int references person,
course_id int references course );
CREATE TABLEWell, you shouldn't, thats not my problem.
Now, I would like to know for every person the courses they have taken.
Similar to your result:
test=# select b.id, b.name, c.id from person_course a left join course
b on a.course_id=b.id left join person c on a.person_id=c.id;
id | name | id
----+--------------+----
1 | SQL Beginner | 1
1 | SQL Beginner | 3
2 | SQL Advanced | 3
(3 rows)This is absolutely not what I want. I want a row for every person and
every course, regardless whether the person has taken the course or not.
If the person has not taken the course, I want a null value in the
person id column:SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
where p.id = 2 order by 1;+---+-----------------------------------------+----------+ | id| name | person_id| +---+-----------------------------------------+----------+ | 1 | SQL Beginner | | | 2 | SQL Advanced | | +---+-----------------------------------------+----------+Note here that I restrict my select to the person with the ID 2. Since
this person has not taken any course, the person_id is null. If I leave
the restriction on the person, I get person times courses rows, the
person_id only filled when a person has actually taken a course.With the rows you added person_course and without restrictin to a
specific person, the result of your query should be:+---+-----------------------------------------+------------+ | id| name | person_id | +---+-----------------------------------------+------------+ | 1 | SQL Beginner | 1 | | 1 | SQL Beginner | | | 1 | SQL Beginner | 3 | | 2 | SQL Advanced | | | 2 | SQL Advanced | | | 2 | SQL Advanced | 3 | +---+-----------------------------------------+------------+In mysql, you get this with the following clause:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
order by 1;Please, read more about referential integrity.
Thanks, but I already have read a lot about it 14 years ago.
Bye
Tim
In all your long years of experience, perhaps you haven't come across this?
http://catb.org/~esr/faqs/smart-questions.html
If you're going to ask a question here the least you could do is meet us
half-way.
b
Tim Tassonis <timtas@cubic.ch> writes:
In mysql, the following statement:
SELECT c.id, c.name, pc.person_id
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
where p.id = 2 order by 1;
will get me the following result:
+---+-----------------------------------------+----------+ | id| name | person_id| +---+-----------------------------------------+----------+ | 1 | SQL Beginner | | | 2 | SQL Advanced | | +---+-----------------------------------------+----------+
Really? It would be unbelievably broken if so, but a quick experiment
with mysql 5.0.27 says they return an empty set same as us.
You *would* get that answer without the WHERE clause, but neither of
those rows meet the WHERE. Look at the complete join output:
regression=# SELECT *
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
right outer join course as c on pc.course_id = c.id
;
id | name | person_id | course_id | id | name
----+------+-----------+-----------+----+--------------
| | | | 1 | SQL Beginner
| | | | 2 | SQL Advanced
(2 rows)
The person-left-join-person_course join produces rows, but none of them
can match course during the right join, so they don't get through.
I think what you want might be a full join for the second step:
regression=# SELECT *
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
full outer join course as c on pc.course_id = c.id
;
id | name | person_id | course_id | id | name
----+------+-----------+-----------+----+--------------
1 | Jack | | | |
2 | Jill | | | |
3 | Bob | | | |
| | | | 1 | SQL Beginner
| | | | 2 | SQL Advanced
(5 rows)
regression=# SELECT *
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
full outer join course as c on pc.course_id = c.id
where p.id = 2 order by 1;
id | name | person_id | course_id | id | name
----+------+-----------+-----------+----+------
2 | Jill | | | |
(1 row)
regression=# insert into person_course values(2,2);
INSERT 0 1
regression=# SELECT *
FROM person as p
left outer join person_course as pc on p.id = pc.person_id
full outer join course as c on pc.course_id = c.id
where p.id = 2 order by 1;
id | name | person_id | course_id | id | name
----+------+-----------+-----------+----+--------------
2 | Jill | 2 | 2 | 2 | SQL Advanced
(1 row)
BTW, I tried to duplicate this in mysql and was surprised to find that
5.0.27 doesn't seem to support full join at all :-(
regards, tom lane
Tim Tassonis <timtas@cubic.ch> schrieb:
This is absolutely not what I want. I want a row for every person and every
course, regardless whether the person has taken the course or not. If the
person has not taken the course, I want a null value in the person id
column:
test=# select c.id, c.name, pc.person_id
from course c
cross join person p
left outer join person_course pc on (p.id,c.id)=(pc.person_id,pc.course_id);
id | name | person_id
----+--------------+-----------
1 | SQL Beginner | 1
1 | SQL Beginner |
1 | SQL Beginner | 3
2 | SQL Advanced |
2 | SQL Advanced |
2 | SQL Advanced | 3
(6 rows)
But i think, this is a little bit stupid, because row 4 and 5 are the
same.
Perhaps this would be better:
test=# select c.id, c.name, p.id as person, pc.person_id
from course c
cross join person p
left outer join person_course pc on (p.id,c.id)=(pc.person_id,pc.course_id);
id | name | person | person_id
----+--------------+--------+-----------
1 | SQL Beginner | 1 | 1
1 | SQL Beginner | 2 |
1 | SQL Beginner | 3 | 3
2 | SQL Advanced | 1 |
2 | SQL Advanced | 2 |
2 | SQL Advanced | 3 | 3
(6 rows)
Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknow)
Kaufbach, Saxony, Germany, Europe. N 51.05082�, E 13.56889�
Thanks to you all for your replies. I was able to solve my problem after
some more reading in the manual:
select c.id, c.name, pc.person_id
from person as p
cross join course as c
left outer join person_course as pc
on (p.id = pc.person_id and c.id = pc.course_id)
where p.id = 2;
A few remarks to the answers:
I seem to have been misunderstood in a way that people understood that I
implied that mysql is right and postgres is wrong. This was in no way my
opinion. I just reported what results I got under mysql. I very much
prefer Postgres over mysql and never questioned postgres' correctness.
To brian:
Please, read more about referential integrity.
Thanks, but I already have read a lot about it 14 years ago.
Bye
TimIn all your long years of experience, perhaps you haven't come across this?
http://catb.org/~esr/faqs/smart-questions.html
If you're going to ask a question here the least you could do is meet us half-way.
I think I asked quite politely, did not blame anybody else and just
described my problem. I agree, I felt a bit insulted when being told to
read about referential integrity, because that clearly had nothing to do
with my question and I do know about it. I'm coming from Oracle and not
from mysql.
And thanks Andreas, I just saw your latest post which has the same
solution as I got in it.
Bye
Tim