Full outer join question.

Started by Paul McGarryalmost 21 years ago6 messagesgeneral
Jump to latest
#1Paul McGarry
paul.mcgarry@gmail.com

Hi there everyone,

I'm having trouble getting the rows I want from a full outer join with
a where clause. Here is a simplified version of my tables:

======
create table lefty (
day date,
goodamount numeric(10,2),
grp integer
);

insert into lefty values ('2005-06-01',5.00,1);
insert into lefty values ('2005-06-02',10.00,1);
insert into lefty values ('2005-06-01',2.50,2);

create table righty (
day date,
badamount numeric(10,2),
grp integer
);

insert into righty values ('2005-06-01',-5.00,3);
insert into righty values ('2005-06-02',-10.00,1);
insert into righty values ('2005-06-02',-2.50,1);
======

The base query I want to do is:
SELECT grp, count(goodamount), sum(goodamount), count(badamount),
sum(badamount) FROM lefty FULL OUTER JOIN righty USING (grp) GROUP BY
grp;

This returns what I expect:
======
grp | count | sum | count | sum
-----+-------+-------+-------+--------
3 | 0 | | 1 | -5.00
2 | 1 | 2.50 | 0 |
1 | 4 | 30.00 | 4 | -25.00
======

However, in the real world I don't want to do a query for the entire
tables, but for a particular date period, ie add restraints on
lefty.day and righty.day.

When I do that I lose all the rows whose grp isn't in both tables. For example:
======
SELECT grp, count(goodamount), sum(goodamount), count(badamount),
sum(badamount) FROM lefty FULL OUTER JOIN righty USING (grp)
WHERE lefty.day >= '2005-06-01' AND righty.day>='2005-06-01'
GROUP BY grp;
grp | count | sum | count | sum
-----+-------+-------+-------+--------
1 | 4 | 30.00 | 4 | -25.00
======

Any ideas?

Paul

#2Paul McGarry
paul.mcgarry@gmail.com
In reply to: Paul McGarry (#1)
Joins with aggregate data

Hi there everyone,

I'm having trouble getting the data I want from my tables.
Here is a simplified version of my tables:

======
create table lefty (
day date,
good numeric(10,2),
grp integer
);

insert into lefty values ('2005-06-01',5.00,1);
insert into lefty values ('2005-06-02',10.00,1);
insert into lefty values ('2005-06-01',2.50,2);

create table righty (
day date,
bad numeric(10,2),
grp integer
);

insert into righty values ('2005-06-01',-5.00,3);
insert into righty values ('2005-06-02',-10.00,1);
insert into righty values ('2005-06-02',-2.50,1);
======

I basically want a query which will give me:
======
grp | count(good) | sum(good) | count(bad) | sum(bad)
-----+-------------+-------------+------------+----------
3 | 0 | | 1 | -5.00
2 | 1 | 2.50 | 0 |
1 | 2 | 15.00 | 2 | -12.50
======
(possibly with zeros rather than nulls but doesn't matter)

At first I thought:
======
SELECT grp, count(goodamount), sum(goodamount), count(badamount), sum(badamount)
FROM lefty FULL OUTER JOIN righty USING (grp)
GROUP BY grp;
======
might do it but the join happens before the aggregation and the grp 1
results match each other two ways in the join and thus get counted
twice:
======
grp | count | sum | count | sum
-----+-------+-------+-------+--------
3 | 0 | | 1 | -5.00
2 | 1 | 2.50 | 0 |
1 | 4 | 30.00 | 4 | -25.00
======

Can someone point me in the right direction?

Paul

#3Michael Fuhr
mike@fuhr.org
In reply to: Paul McGarry (#2)
Re: Joins with aggregate data

On Fri, Jul 08, 2005 at 11:45:59AM +1000, Paul McGarry wrote:

I basically want a query which will give me:
======
grp | count(good) | sum(good) | count(bad) | sum(bad)
-----+-------------+-------------+------------+----------
3 | 0 | | 1 | -5.00
2 | 1 | 2.50 | 0 |
1 | 2 | 15.00 | 2 | -12.50
======
(possibly with zeros rather than nulls but doesn't matter)

How about doing the aggregates in separate subqueries and then doing
the outer join? Something like this:

SELECT coalesce(g.grp, b.grp) AS grp,
coalesce(g.count, 0) AS countgood,
coalesce(g.sum, 0) AS sumgood,
coalesce(b.count, 0) AS countbad,
coalesce(b.sum, 0) AS sumbad
FROM
(SELECT grp, count(good), sum(good) FROM lefty GROUP BY grp) AS g
FULL OUTER JOIN
(SELECT grp, count(bad), sum(bad) FROM righty GROUP BY grp) AS b USING (grp);

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

#4Paul McGarry
paul.mcgarry@gmail.com
In reply to: Michael Fuhr (#3)
Re: Joins with aggregate data

Thank you Michael, your suggestion works a charm (though I didn't
bother coalescing the two grp because I think the USING takes care of
that anyway.

Paul

On 7/8/05, Michael Fuhr <mike@fuhr.org> wrote:

Show quoted text

SELECT coalesce(g.grp, b.grp) AS grp,
coalesce(g.count, 0) AS countgood,
coalesce(g.sum, 0) AS sumgood,
coalesce(b.count, 0) AS countbad,
coalesce(b.sum, 0) AS sumbad
FROM
(SELECT grp, count(good), sum(good) FROM lefty GROUP BY grp) AS g
FULL OUTER JOIN
(SELECT grp, count(bad), sum(bad) FROM righty GROUP BY grp) AS b USING (grp);

#5Michael Fuhr
mike@fuhr.org
In reply to: Paul McGarry (#4)
Re: Joins with aggregate data

On Fri, Jul 08, 2005 at 01:37:54PM +1000, Paul McGarry wrote:

Thank you Michael, your suggestion works a charm (though I didn't
bother coalescing the two grp because I think the USING takes care of
that anyway.

Oops, yeah -- I had started with "ON b.grp = g.grp" but changed it
to "USING (grp)" and neglected to get rid of that coalesce.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Paul McGarry (#1)
Re: Full outer join question.

Am Freitag, 8. Juli 2005 03:22 schrieb Paul McGarry:

When I do that I lose all the rows whose grp isn't in both tables. For
example: ======
SELECT grp, count(goodamount), sum(goodamount), count(badamount),
sum(badamount) FROM lefty FULL OUTER JOIN righty USING (grp)
WHERE lefty.day >= '2005-06-01' AND righty.day>='2005-06-01'
GROUP BY grp;

You need to write ... WHERE (lefty.day >= '2005-06-01' OR lefty.day IS NULL)
AND etc.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/