bug in JOIN or COUNT or ... ?
Sorry, worst Subject I've ever come up with, but this is one of those "I
haven't got a clue how to describe" emails ...
Simple query:
SELECT distinct s.gid, s.created, i.title
FROM status s LEFT JOIN images i ON (s.gid = i.gid AND i.active), personal_data pd, relationship_wanted rw
WHERE s.active AND s.status != 0
AND s.gid = 17111
AND (s.gid = pd.gid AND pd.gender = 0)
AND (s.gid = rw.gid AND rw.gender = 0 );
Produces:
gid | created | title
-------+------------------------+--------
17111 | 2000-10-19 15:20:46-04 | image1
17111 | 2000-10-19 15:20:46-04 | image2
17111 | 2000-10-19 15:20:46-04 | image3
(3 rows)
Great, what I expect ...
But:
SELECT distinct s.gid, s.created, count(i.title) AS images
FROM status s LEFT JOIN images i ON (s.gid = i.gid AND i.active), personal_data pd, relationship_wanted rw
WHERE s.active AND s.status != 0
AND s.gid = 17111
AND (s.gid = pd.gid AND pd.gender = 0)
AND (s.gid = rw.gid AND rw.gender = 0 )
GROUP BY s.gid, s.created;
Produces:
/tmp/psql.edit.70.62491: 7 lines, 353 characters.
gid | created | images
-------+------------------------+--------
17111 | 2000-10-19 15:20:46-04 | 15
(1 row)
So why is it counting 12 more images then are actually found/exist:
testdb=# select title from images where gid = 17111;
title
--------
image1
image3
image2
(3 rows)
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
So why is it counting 12 more images then are actually found/exist:
Hm. Could we see the EXPLAIN output for both of those?
regards, tom lane
On Sat, 12 May 2001, Tom Lane wrote:
The Hermit Hacker <scrappy@hub.org> writes:
So why is it counting 12 more images then are actually found/exist:
Hm. Could we see the EXPLAIN output for both of those?
without count:
NOTICE: QUERY PLAN:
Unique (cost=8.66..8.67 rows=1 width=37)
-> Sort (cost=8.66..8.66 rows=1 width=37)
-> Nested Loop (cost=0.00..8.65 rows=1 width=37)
-> Nested Loop (cost=0.00..6.52 rows=1 width=33)
-> Nested Loop (cost=0.00..4.26 rows=1 width=29)
-> Index Scan using status_gid on status s (cost=0.00..2.23 rows=1 width=12)
-> Index Scan using images_gid on images i (cost=0.00..2.02 rows=1 width=17)
-> Index Scan using personal_data_gid on personal_data pd (cost=0.00..2.25 rows=1 width=4)
-> Index Scan using relationship_wanted_gid on relationship_wanted rw (cost=0.00..2.11 rows=1 width=4)
EXPLAIN
with count:
NOTICE: QUERY PLAN:
Unique (cost=8.68..8.69 rows=1 width=37)
-> Sort (cost=8.68..8.68 rows=1 width=37)
-> Aggregate (cost=8.66..8.67 rows=1 width=37)
-> Group (cost=8.66..8.67 rows=1 width=37)
-> Sort (cost=8.66..8.66 rows=1 width=37)
-> Nested Loop (cost=0.00..8.65 rows=1 width=37)
-> Nested Loop (cost=0.00..6.52 rows=1 width=33)
-> Nested Loop (cost=0.00..4.26 rows=1 width=29)
-> Index Scan using status_gid on status s (cost=0.00..2.23 rows=1 width=12)
-> Index Scan using images_gid on images i (cost=0.00..2.02 rows=1 width=17)
-> Index Scan using personal_data_gid on personal_data pd (cost=0.00..2.25 rows=1 width=4)
-> Index Scan using relationship_wanted_gid on relationship_wanted rw (cost=0.00..2.11 rows=1 width=4)
EXPLAIN
regards, tom lane
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Ah, I see it: your join against relationship_wanted isn't unique.
globalmatch=# select count(*) from personal_data pd
globalmatch-# where pd.gid = 17111 AND pd.gender = 0;
count
-------
1
(1 row)
globalmatch=# select count(*) from relationship_wanted rw
globalmatch-# where rw.gid = 17111 AND rw.gender = 0;
count
-------
5
(1 row)
globalmatch=#
So that inflates the number of rows coming out of the join by 5.
regards, tom lane
Import Notes
Reply to msg id not found: Pine.BSF.4.33.0105122225230.629-100000@mobile.hub.orgReference msg id not found: Pine.BSF.4.33.0105122225230.629-100000@mobile.hub.org | Resolved by subject fallback
On Sat, 12 May 2001, Tom Lane wrote:
Ah, I see it: your join against relationship_wanted isn't unique.
globalmatch=# select count(*) from personal_data pd
globalmatch-# where pd.gid = 17111 AND pd.gender = 0;
count
-------
1
(1 row)globalmatch=# select count(*) from relationship_wanted rw
globalmatch-# where rw.gid = 17111 AND rw.gender = 0;
count
-------
5
(1 row)globalmatch=#
So that inflates the number of rows coming out of the join by 5.
Okay, then I'm lost ... why wouldn't that show up without the COUNT()? I
doubt doubt your analysis, I just want to understand why ...
The Hermit Hacker <scrappy@hub.org> writes:
So that inflates the number of rows coming out of the join by 5.
Okay, then I'm lost ... why wouldn't that show up without the COUNT()? I
doubt doubt your analysis, I just want to understand why ...
You had DISTINCT on your query, which hid the duplicated rows from you.
But that happens *after* aggregate processing, so it doesn't hide the
dups from COUNT().
regards, tom lane
On Sat, 12 May 2001, Tom Lane wrote:
The Hermit Hacker <scrappy@hub.org> writes:
So that inflates the number of rows coming out of the join by 5.
Okay, then I'm lost ... why wouldn't that show up without the COUNT()? I
doubt doubt your analysis, I just want to understand why ...You had DISTINCT on your query, which hid the duplicated rows from you.
But that happens *after* aggregate processing, so it doesn't hide the
dups from COUNT().
Ahhhh, okay, that makes sense ... thanks for taking the time to check it
for me ... and explaining what I was missing ...