sum multiple tables gives wrong answer?

Started by Michael Dieneralmost 16 years ago8 messagesgeneral
Jump to latest
#1Michael Diener
m.diener@gomogi.com

Hi,

I'm new to the list and have the following situation happening "PostgreSQL
8.4.1, compiled by Visual C++ build 1400, 32-bit" running windows XP sp3

I have an SQL problem that I thought was easy to do but gives me always the
wrong answer.

2 Tables with a column called "flaeche" "double precision", in English
"area" and I want to sum up the values for flaeche in each table to give me
the total area for flaeche in each table.

Correct answer comes with this sql

select sum(flaeche)/10000 as "greens HA" from green;

result:

greenHA

1.25358085

Wrong Answer with this query

select

sum(green.flaeche)/10000 as "greens HA",

sum (fairway.flaeche)/10000 as "fairway HA"

from green, fairway;

result:

green HA fairway HA

48.8896531 508.94143659

Fairway correct answer is 14.96886578 HA

Green correct answer is 1.25358085 HA

What is going on ??

Cheers

michael

Michael Diener

_________________________________________________________________

GOMOGI Mobile Geographics

LAKESIDE PARK B01

9020 KLAGENFURT

T: ++043 (0) 676 520 3600

E: m.diener@gomogi.com

W: www.gomogi.com

#2Thom Brown
thombrown@gmail.com
In reply to: Michael Diener (#1)
Re: [NOVICE] sum multiple tables gives wrong answer?

On 2 June 2010 15:23, Michael Diener <m.diener@gomogi.com> wrote:

Hi,

I’m new to the list and have the following situation happening "PostgreSQL
8.4.1, compiled by Visual C++ build 1400, 32-bit" running windows XP sp3

I have an SQL problem that I thought was easy to do but gives me always the
wrong answer.

2 Tables with a column called “flaeche” “double precision”, in English
 “area” and I want to sum up the values for flaeche in each table to give me
the total area for flaeche in each table.

Correct answer comes with this sql

select  sum(flaeche)/10000 as "greens HA"  from green;

result:

greenHA

1.25358085

Wrong Answer with this query

select

 sum(green.flaeche)/10000 as "greens HA",

 sum (fairway.flaeche)/10000 as "fairway HA"

  from green, fairway;

result:

green HA                   fairway HA

48.8896531                 508.94143659

Fairway correct answer is  14.96886578 HA

Green correct answer is 1.25358085  HA

What is going on ??

Cheers

michael

Michael Diener

_________________________________________________________________

Could it be because you're effectively using a cartesian join?

Can't you do them separately? Like:

select sum(green.flaeche)/10000 as "greens HA" from green;
select sum (fairway.flaeche)/10000 as "fairway HA" from fairway;

Or if you must have both in the same result:

select (select sum(green.flaeche)/10000 from green) as "greens HA",
(select sum (fairway.flaeche)/10000 from fairway) as "fairway HA";

Regards

Thom

#3Richard Broersma
richard.broersma@gmail.com
In reply to: Thom Brown (#2)
Re: [NOVICE] sum multiple tables gives wrong answer?

On Wed, Jun 2, 2010 at 7:23 AM, Michael Diener <m.diener@gomogi.com> wrote:

select  sum(flaeche)/10000 as "greens HA"  from green;

result:

Wrong Answer with this query

select
 sum(green.flaeche)/10000 as "greens HA",
 sum (fairway.flaeche)/10000 as "fairway HA"
  from green, fairway;

It isn't easy to see but you are effectively joining green to fairway
using a cross project which product a Cartesian product.

you probably wanted this query:

SELECT (select sum(flaeche)/10000 from green) AS "greens HA",
(select sum(flaeche)/10000 from fairway) AS "fairway HA";

However, from what you've shown. I would wager that your database is
in need of some normalization. For example you could put both greens
and fair way into a single table like:

CREATE TABLE Lawns AS
SELECT flaech, "green"::VARCHAR AS lawntype
FROM green
UNION ALL
SELECT flaech, "fairway"::VARCHAR AS lawntype
FROM fairway;

Then you'd execute the following query:

SELECT lawntype, sum(flaech)/10000 AS "HA"
FROM Lawns
GROUP BY lawntype;

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

#4Oliveiros
oliveiros.cristina@marktest.pt
In reply to: Michael Diener (#1)
Re: [NOVICE] sum multiple tables gives wrong answer?

Howdy, Michael.

Your query is failing because you are doing the cartesian product of the tables with that query
Can't you do it on two different queries?

Say
select sum(flaeche)/10000 as "greens HA" from green;
and then
select sum(flaeche)/10000 as "fairway HA" from fairway;
?

Do you really need one single query?

If so, try this

select

(sum(green.flaeche)/10000) / count(fairway.*) as "greens HA",

(sum (fairway.flaeche)/10000) / count(green.*) as "fairway HA"

from green, fairway;

NB: This is untested code, it might contain syntactic/semantic bugs.

Best,
Oliveiros Cristina

----- Original Message -----
From: Michael Diener
To: pgsql-novice@postgresql.org
Cc: pgsql-general@postgresql.org
Sent: Wednesday, June 02, 2010 3:23 PM
Subject: [NOVICE] sum multiple tables gives wrong answer?

Hi,

I'm new to the list and have the following situation happening "PostgreSQL 8.4.1, compiled by Visual C++ build 1400, 32-bit" running windows XP sp3

I have an SQL problem that I thought was easy to do but gives me always the wrong answer.

2 Tables with a column called "flaeche" "double precision", in English "area" and I want to sum up the values for flaeche in each table to give me the total area for flaeche in each table.

Correct answer comes with this sql

select sum(flaeche)/10000 as "greens HA" from green;

result:

greenHA

1.25358085

Wrong Answer with this query

select

sum(green.flaeche)/10000 as "greens HA",

sum (fairway.flaeche)/10000 as "fairway HA"

from green, fairway;

result:

green HA fairway HA

48.8896531 508.94143659

Fairway correct answer is 14.96886578 HA

Green correct answer is 1.25358085 HA

What is going on ??

Cheers

michael

Michael Diener

_________________________________________________________________

GOMOGI Mobile Geographics

LAKESIDE PARK B01

9020 KLAGENFURT

T: ++043 (0) 676 520 3600

E: m.diener@gomogi.com

W: www.gomogi.com

#5A. Kretschmer
andreas.kretschmer@schollglas.com
In reply to: Michael Diener (#1)
Re: sum multiple tables gives wrong answer?

In response to Michael Diener :

2 Tables with a column called ?flaeche? ?double precision?, in English ?area?
and I want to sum up the values for flaeche in each table to give me the total
area for flaeche in each table.

Wrong Answer with this query

select
sum(green.flaeche)/10000 as "greens HA",
sum (fairway.flaeche)/10000 as "fairway HA"
from green, fairway;

What is going on ??

It's a so called cross-join, every row form the first table crossed with
evvery row from the other table -> wrong result.

Simple example:

test=*# select * from t1;
id | flaeche
----+---------
1 | 10
2 | 20
(2 Zeilen)

Zeit: 0,229 ms
test=*# select * from t2;
id | flaeche
----+---------
1 | 100
2 | 200
(2 Zeilen)

Zeit: 0,182 ms
test=*# select sum(t1.flaeche), sum(t2.flaeche) from t1, t2;
sum | sum
-----+-----
60 | 600
(1 Zeile)

It's just this:

test=*# select * from t1, t2;
id | flaeche | id | flaeche
----+---------+----+---------
1 | 10 | 1 | 100
1 | 10 | 2 | 200
2 | 20 | 1 | 100
2 | 20 | 2 | 200
(4 Zeilen)

But you are looking for:

test=*# select (select sum(flaeche) from t1) as t1_flaeche, (select
sum(flaeche) from t2);
t1_flaeche | ?column?
------------+----------
30 | 300
(1 Zeile)

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99

#6Stephen Frost
sfrost@snowman.net
In reply to: A. Kretschmer (#5)
Re: sum multiple tables gives wrong answer?

* Michael Diener (m.diener@gomogi.com) wrote:

I have an SQL problem that I thought was easy to do but gives me always the
wrong answer.

I think it's the right answer- the problem is that you're asking SQL a
different question than what you want the answer to.

2 Tables with a column called "flaeche" "double precision", in English
"area" and I want to sum up the values for flaeche in each table to give me
the total area for flaeche in each table.

Correct answer comes with this sql

select sum(flaeche)/10000 as "greens HA" from green;

Wrong Answer with this query

select

sum(green.flaeche)/10000 as "greens HA",

sum (fairway.flaeche)/10000 as "fairway HA"

from green, fairway;

What is going on ??

These are very different queries. Take out the sum()'s and see what you
get. What you'll find is that, because the join is unconstrained,
you're getting a cartesian product. Basically, each row in green is
being repeated for each row in fairway.

eg:

green has:
column1, column2
a, 1
b, 2
c, 3

fairway has:
column1, column2
x, 1
y, 2
z, 3

select * from green, fairway;

results:

a, 1, x, 1
a, 1, y, 2
a, 1, z, 3
b, 2, x, 1
b, 2, y, 2
b, 2, z, 3
c, 3, x, 1
c, 3, y, 2
c, 3, z, 3

What you really want to do is JOIN these two tables together, such as
in this:

select
green.column1 as green,
fairway.column1 as fairway,
column2
from
green
join fairway
using (column2)
;

results:
green,fairway
a, x, 1
b, y, 2
c, z, 3

Thanks,

Stephen

#7Michael Diener
m.diener@gomogi.com
In reply to: Richard Broersma (#3)
Re: [NOVICE] sum multiple tables gives wrong answer?

I just want to say thanks to all the great responses it is now working like
a charm!! I knew I was missing some tid bit of DB SQL knowledge.

Thanks again!

Cheers
michael

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Richard Broersma
Sent: Mittwoch, 02. Juni 2010 17:39
To: m.diener@gomogi.com
Cc: pgsql-novice@postgresql.org; pgsql-general@postgresql.org
Subject: Re: [GENERAL] [NOVICE] sum multiple tables gives wrong answer?

On Wed, Jun 2, 2010 at 7:23 AM, Michael Diener <m.diener@gomogi.com> wrote:

select  sum(flaeche)/10000 as "greens HA"  from green;

result:

Wrong Answer with this query

select
 sum(green.flaeche)/10000 as "greens HA",
 sum (fairway.flaeche)/10000 as "fairway HA"
  from green, fairway;

It isn't easy to see but you are effectively joining green to fairway
using a cross project which product a Cartesian product.

you probably wanted this query:

SELECT (select sum(flaeche)/10000 from green) AS "greens HA",
(select sum(flaeche)/10000 from fairway) AS "fairway HA";

However, from what you've shown. I would wager that your database is
in need of some normalization. For example you could put both greens
and fair way into a single table like:

CREATE TABLE Lawns AS
SELECT flaech, "green"::VARCHAR AS lawntype
FROM green
UNION ALL
SELECT flaech, "fairway"::VARCHAR AS lawntype
FROM fairway;

Then you'd execute the following query:

SELECT lawntype, sum(flaech)/10000 AS "HA"
FROM Lawns
GROUP BY lawntype;

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

In reply to: Oliveiros (#4)

I replied just to him and I didnt notice, so Im just forwarding to group, in
case anyone need.

---- Or you may try Union, one for green other for fairway ----

On Wed, Jun 2, 2010 at 12:44 PM, Oliveiros
<oliveiros.cristina@marktest.pt>wrote:

Show quoted text

Howdy, Michael.

Your query is failing because you are doing the cartesian product of the
tables with that query
Can't you do it on two different queries?

Say
select sum(flaeche)/10000 as "greens HA" from green;
and then
select sum(flaeche)/10000 as "fairway HA" from fairway;
?

Do you really need one single query?

If so, try this

select

(sum(green.flaeche)/10000) / count(fairway.*) as "greens HA",

(sum (fairway.flaeche)/10000) / count(green.*) as "fairway HA"

from green, fairway;

NB: This is untested code, it might contain syntactic/semantic bugs.

Best,
Oliveiros Cristina

----- Original Message -----
*From:* Michael Diener <m.diener@gomogi.com>
*To:* pgsql-novice@postgresql.org
*Cc:* pgsql-general@postgresql.org
*Sent:* Wednesday, June 02, 2010 3:23 PM
*Subject:* [NOVICE] sum multiple tables gives wrong answer?

Hi,

I’m new to the list and have the following situation happening "PostgreSQL
8.4.1, compiled by Visual C++ build 1400, 32-bit" running windows XP sp3

I have an SQL problem that I thought was easy to do but gives me always the
wrong answer.

2 Tables with a column called “flaeche” “double precision”, in English
“area” and I want to sum up the values for flaeche in each table to give me
the total area for flaeche in each table.

Correct answer comes with this sql

select sum(flaeche)/10000 as "greens HA" from green;

result:

greenHA

1.25358085

Wrong Answer with this query

select

sum(green.flaeche)/10000 as "greens HA",

sum (fairway.flaeche)/10000 as "fairway HA"

from green, fairway;

result:

green HA fairway HA

48.8896531 508.94143659

Fairway correct answer is 14.96886578 HA

Green correct answer is 1.25358085 HA

What is going on ??

Cheers

michael

*Michael Diener*

_________________________________________________________________

GOMOGI Mobile Geographics

LAKESIDE PARK B01

9020 KLAGENFURT

T: ++043 (0) 676 520 3600

E: m.diener@gomogi.com

W: www.gomogi.com