problem with the sum function

Started by Marco Kienzlealmost 24 years ago4 messagesbugs
Jump to latest
#1Marco Kienzle
m.kienzle@marlab.ac.uk

The following problem occur using the sum() function (see the attached
file for all the details and an example):
- if you use it on a portion of a table (example: table age) you get a
result that differ from the one you can get by hand (see the whole table
temp1 and do the sum by hand)
- if you use the sum() function on the complete table (i.e. if you
first create a temporary table and then run the sum() function) then you
get the right result (see example on temp1).

Conclusion: I am not sure its really a bug but this differences can be
really misleading.

Have a nice day,
marco
--
____________________________________________________________________________

Marco Kienzle
Fisheries Research Services
Marine Laboratory
PO Box 101 Victoria Road
Aberdeen AB119DB
United Kingdom

tel: +44 (0) 1224 876544
direct: +44 (0) 1224 295412
fax: +44 (0) 1224 295511
http://www.marlab.ac.uk

Attachments:

PgBugReport.txttext/plain; charset=ISO-8859-1; name=PgBugReport.txtDownload
#2Dmitry Tkach
dmitry@openratings.com
In reply to: Marco Kienzle (#1)
Re: problem with the sum function

'group by' must be your problem.
If you remove that clause from your second query, you should then get
the same result (77) sum'ing the temp table...

I hope, it helps...

Dima

Marco Kienzle wrote:

Show quoted text

The following problem occur using the sum() function (see the attached
file for all the details and an example):
- if you use it on a portion of a table (example: table age) you get a
result that differ from the one you can get by hand (see the whole table
temp1 and do the sum by hand)
- if you use the sum() function on the complete table (i.e. if you
first create a temporary table and then run the sum() function) then you
get the right result (see example on temp1).

Conclusion: I am not sure its really a bug but this differences can be
really misleading.

Have a nice day,
marco

------------------------------------------------------------------------

SYSTEM INFORMATION
Distribution: Red Hat Linux
Operating System: Linux
Distribution Version: Red Hat Linux release 7.2 (Enigma)

Operating System Version: #1 Thu Sep 6 17:27:27 EDT 2001
Operating System Release: 2.4.7-10
Processor Type: i686
------------------------------------------------------------------------------
THE VERSION OF POSTGRESQL

bash-2.05$ psql --version
psql (PostgreSQL) 7.1.3
contains readline, history, multibyte support
Portions Copyright (c) 1996-2001, PostgreSQL Global Development Group
Portions Copyright (c) 1996 Regents of the University of California
Read the file COPYRIGHT or use the command \copyright to see the
usage and distribution terms.

------------------------------------------------------------------------------
THE DESCRIPTION OF THE TABLE AGE

herring=# \d age
Table "age"
Attribute | Type | Modifier
---------------+-----------------------+----------
code | character varying(10) |
inst | character varying(10) |
year | smallint |
quart | smallint |
month | smallint |
reg | character varying(10) |
div | character varying(10) |
subdiv | character varying(10) |
gridcell | character varying(10) |
sp | character varying(50) |
stock | character varying(10) |
samptype | character varying(10) |
lengthcell | smallint |
sex | character varying(10) |
age | smallint |
agenum | bigint |
weightmeanage | double precision |

----------------------------------------------------------------------------
THE SUM QUERY ON THE TABLE AGE

herring=# select sum(agenum) from age where weightmeanage>0 and lengthcell=160;
sum
-----
77
(1 row)

-----------------------------------------------------------------------------
CREATION OF THE TABLE TEMP1

select INTO TABLE temp1 inst, year, month, lengthcell, sex, age, agenum, weightmeanage from age where weightmeanage>0 and lengthcell=160 group by inst, year, month, lengthcell, sex,age,agenum, weightmeanage;
------------------------------------------------------------------------------
THE TABLE DESCRIPTION OF THE TABLE TEMP1

herring=# \d temp1
Table "temp1"
Attribute | Type | Modifier
---------------+-----------------------+----------
inst | character varying(10) |
year | smallint |
month | smallint |
lengthcell | smallint |
sex | character varying(10) |
age | smallint |
agenum | bigint |
weightmeanage | double precision |

----------------------------------------------------------------------
THE WHOLE TABLE TEMP1

herring=# select * from temp1;
inst | year | month | lengthcell | sex | age | agenum | weightmeanage
------+------+-------+------------+-----+-----+--------+---------------
IMR | 1991 | 2 | 160 | F | 1 | 10 | 31
IMR | 1991 | 2 | 160 | M | 1 | 3 | 32
IMR | 1992 | 2 | 160 | F | 1 | 1 | 25
IMR | 1992 | 2 | 160 | F | 1 | 1 | 26
IMR | 1992 | 2 | 160 | M | 1 | 1 | 25
IMR | 1992 | 2 | 160 | M | 1 | 1 | 28
IMR | 1992 | 5 | 160 | M | 1 | 1 | 34
IMR | 1992 | 6 | 160 | F | 1 | 1 | 30
IMR | 1992 | 10 | 160 | F | 0 | 0 | 34
IMR | 1992 | 10 | 160 | M | 0 | 0 | 25
IMR | 1993 | 2 | 160 | F | 1 | 1 | 27
IMR | 1993 | 2 | 160 | F | 1 | 1 | 28
IMR | 1993 | 2 | 160 | F | 1 | 1 | 29
IMR | 1993 | 2 | 160 | F | 1 | 1 | 30
IMR | 1993 | 2 | 160 | M | 1 | 1 | 27
IMR | 1993 | 2 | 160 | M | 1 | 1 | 28
IMR | 1993 | 2 | 160 | M | 1 | 1 | 29
IMR | 1993 | 11 | 160 | F | 0 | 0 | 30
IMR | 1993 | 11 | 160 | M | 0 | 0 | 28
IMR | 1994 | 5 | 160 | F | 1 | 1 | 38
IMR | 1994 | 5 | 160 | F | 1 | 1 | 42
IMR | 1994 | 5 | 160 | M | 1 | 1 | 34
IMR | 1994 | 5 | 160 | M | 1 | 1 | 39
IMR | 1994 | 11 | 160 | F | 0 | 0 | 31
IMR | 1995 | 1 | 160 | F | 1 | 1 | 27
IMR | 1995 | 1 | 160 | F | 1 | 1 | 28
IMR | 1995 | 1 | 160 | F | 2 | 2 | 28
IMR | 1995 | 1 | 160 | F | 2 | 2 | 29
IMR | 1995 | 1 | 160 | M | 1 | 1 | 26
IMR | 1995 | 1 | 160 | M | 1 | 1 | 27
IMR | 1995 | 1 | 160 | M | 1 | 1 | 29
IMR | 1995 | 1 | 160 | M | 1 | 1 | 30
IMR | 1995 | 1 | 160 | M | 2 | 2 | 26
IMR | 1995 | 5 | 160 | F | 1 | 1 | 29
IMR | 1995 | 5 | 160 | F | 1 | 1 | 34
IMR | 1995 | 5 | 160 | F | 1 | 1 | 37
IMR | 1995 | 5 | 160 | F | 1 | 1 | 38
IMR | 1995 | 5 | 160 | M | 1 | 1 | 23
IMR | 1995 | 5 | 160 | M | 1 | 1 | 37
IMR | 1995 | 6 | 160 | F | 1 | 1 | 32
IMR | 1995 | 6 | 160 | M | 1 | 1 | 32
IMR | 1995 | 7 | 160 | F | 1 | 1 | 31
IMR | 1995 | 7 | 160 | F | 1 | 1 | 37
IMR | 1995 | 7 | 160 | M | 1 | 1 | 32
IMR | 1995 | 7 | 160 | M | 1 | 1 | 35
IMR | 1995 | 11 | 160 | F | 0 | 0 | 29
IMR | 1995 | 11 | 160 | F | 0 | 0 | 31
IMR | 1995 | 11 | 160 | M | 0 | 0 | 29
IMR | 1995 | 11 | 160 | M | 0 | 0 | 31
IMR | 1995 | 11 | 160 | M | 0 | 0 | 33
IMR | 1996 | 2 | 160 | F | 1 | 1 | 29
IMR | 1996 | 2 | 160 | F | 1 | 1 | 31
IMR | 1996 | 2 | 160 | M | 1 | 1 | 28
IMR | 1996 | 11 | 160 | M | 0 | 0 | 32
IMR | 1997 | 5 | 160 | F | 1 | 1 | 41
IMR | 1997 | 5 | 160 | M | 1 | 1 | 39
IMR | 1997 | 5 | 160 | M | 1 | 1 | 46
IMR | 1997 | 7 | 160 | F | 1 | 1 | 30
IMR | 1997 | 7 | 160 | M | 1 | 1 | 28
IMR | 1997 | 7 | 160 | M | 1 | 1 | 32
IMR | 1998 | 6 | 160 | F | 1 | 1 | 40
IMR | 1998 | 6 | 160 | M | 1 | 1 | 31
IMR | 1998 | 11 | 160 | F | 0 | 0 | 29
IMR | 1998 | 11 | 160 | M | 0 | 0 | 29
IMR | 1999 | 2 | 160 | F | 1 | 1 | 26
IMR | 1999 | 6 | 160 | F | 1 | 1 | 29
IMR | 1999 | 7 | 160 | F | 1 | 1 | 32
IMR | 1999 | 7 | 160 | F | 1 | 1 | 33
IMR | 1999 | 7 | 160 | F | 1 | 1 | 34
IMR | 1999 | 7 | 160 | M | 1 | 1 | 30
IMR | 1999 | 7 | 160 | M | 1 | 1 | 31
IMR | 1999 | 7 | 160 | M | 1 | 1 | 32
(72 rows)
------------------------------------------------------------------------------
THE SUM QUERY ON TEMP1

herring=# select sum(agenum) from temp1;
sum
-----
73
(1 row)

______________________________________________________________________________

------------------------------------------------------------------------

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marco Kienzle (#1)
Re: problem with the sum function

Marco Kienzle <m.kienzle@marlab.ac.uk> writes:

herring=3D# select sum(agenum) from age where weightmeanage>0 and lengthcel=
l=3D160;
sum=20
-----
77
(1 row)

select INTO TABLE temp1 inst, year, month, lengthcell, sex, age, agenum, w=
eightmeanage from age where weightmeanage>0 and lengthcell=3D160 group by i=
nst, year, month, lengthcell, sex,age,agenum, weightmeanage;

herring=3D# select sum(agenum) from temp1;
sum=20
-----
73
(1 row)

Why would you expect these to give the same result? The "group by"
effectively eliminates duplicate rows, thus removing some contributions
to the sum.

regards, tom lane

#4Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Marco Kienzle (#1)
Re: problem with the sum function

On 7 May 2002, Marco Kienzle wrote:

The following problem occur using the sum() function (see the attached
file for all the details and an example):
- if you use it on a portion of a table (example: table age) you get a
result that differ from the one you can get by hand (see the whole table
temp1 and do the sum by hand)
- if you use the sum() function on the complete table (i.e. if you
first create a temporary table and then run the sum() function) then you
get the right result (see example on temp1).

Well, since you didn't give data on age, I can't tell for certain, but
your temp table is not necessarily the same as the source. With that
group by I believe you're dropping duplicates if there are any. I'd
suggest seeing what (untested sql)

select inst,year,month,lengthcell,sex,age,agenum,weightmeanage,count(*)
from age where weightmeanage>0 and lengthcell=160 group by inst,
year,month,lengthcell,sex,age,agenum,weightmeanage having count(*)>1;

gives you.