Re: [HACKERS] 6.4 Aggregate Bug

Started by David Hartwigover 27 years ago7 messages
#1David Hartwig
daveh@insightdist.com

Bruce Momjian wrote:

Did we fix this yet?

While testing my 6.4 patch to allow functions/expressions to be
specified in the ORDER/GROUP BY clause (and not in the target list) I
came across a nasty little bug. A segmentation fault gets thrown
somewhere in replace_agg_clause() when using aggregates, in combination
with a function or expression. (I am still tracking down the
offending lines of code. Sorry, the Linux/GCC environment is still new
to me.)

I backed out my patch, and discovered the bug was still present. The
bug does not exist in version 6.3.2. Here is an example:

-- This crashes the backend
select upper(a) as x, count(k) from t group by x;

-- This works fine
select upper(a) as x, count(a) from t group by x;

Notice how in the first query, (the one that does not work) upper() has
a different argument than count(). And in the second query (the one
that works) upper() has the same argument as count(). When using
count(*) it will always fail.

This is the the pattern that I have observed. If the arguments in the
aggregate and non-aggregate functions are the same, it runs; if the
arguments in the aggregate and non-aggregate functions are different, it
crashes.

I have attached a test script for anyone able to help with (or verify)
this problem.

create table t (
j integer,
k integer,
a varchar
);
insert into t values (1, 1, 'a');
insert into t values (2, 2, 'b');
insert into t values (2, 3, 'c');
insert into t values (3, 4, 'A');
insert into t values (3, 5, 'B');
insert into t values (3, 6, 'C');
insert into t values (4, 7, 'a');
insert into t values (4, 8, 'b');
insert into t values (4, 9, 'c');
insert into t values (4, 0, 'a');

-- OK
select upper(a) as x, count(a) from t group by x;

-- OK
select k/2 as x, max(k) from t group by x;

-- OK
-- select k as x, max(j) from t group by x;

-- OK
select upper(a) as x, count(k), count(a) from t group by x;

-- CRASH
select k/2 as x, max(j) from t group by x;

-- CRASH
select upper(a) as x, count(k) from t group by x;

-- CRASH
select upper(a) as x, count(xmin) from t group by x;

-- CRASH
select upper(a) as x, count(oid) from t group by x;

-- CRASH
select upper(a) as x, count(*) from t group by x;

It seems that the last case has corrected itself.

select upper(a) as x, count(*) from t group by x
WORKS

The other conditions are still a problem. In general, as long as the
argument in the grouped function is used somewhere else in the target list
there is no problem.

select func(x), x, aggfunc(y) group by func
WORKS

select func(x), aggfunc(y) group by func
CRASHES

#2Andreas Zeugswetter
andreas.zeugswetter@telecom.at
In reply to: David Hartwig (#1)
AW: [HACKERS] 6.4 Aggregate Bug

select func(x), aggfunc(y) group by func
CRASHES

I think this should be disallowed syntax. Note that func is a system generated label
in this case, that's probably why the parser does not complain. The syntax has to be:

select func(x), aggfunc(y) group by func(x); -- or

select func(x) as func, aggfunc(y) group by func;

Andreas

#3David Hartwig
daveh@insightdist.com
In reply to: Andreas Zeugswetter (#2)
Re: AW: [HACKERS] 6.4 Aggregate Bug

Andreas Zeugswetter wrote:

select func(x), aggfunc(y) group by func
CRASHES

I think this should be disallowed syntax. Note that func is a system generated label
in this case, that's probably why the parser does not complain. The syntax has to be:

select func(x), aggfunc(y) group by func(x); -- or

select func(x) as func, aggfunc(y) group by func;

To my knowledge postgres has always allowed this syntax. IMO is seems harmless to
continue. In any case, all of these syntax permutations will crash the backend in using
6.4.

#4Bruce Momjian
maillist@candle.pha.pa.us
In reply to: David Hartwig (#1)

Bruce Momjian wrote:

Did we fix this yet?

While testing my 6.4 patch to allow functions/expressions to be
specified in the ORDER/GROUP BY clause (and not in the target list) I
came across a nasty little bug. A segmentation fault gets thrown
somewhere in replace_agg_clause() when using aggregates, in combination
with a function or expression. (I am still tracking down the
offending lines of code. Sorry, the Linux/GCC environment is still new
to me.)

I backed out my patch, and discovered the bug was still present. The
bug does not exist in version 6.3.2. Here is an example:

-- This crashes the backend
select upper(a) as x, count(k) from t group by x;

-- This works fine
select upper(a) as x, count(a) from t group by x;

Notice how in the first query, (the one that does not work) upper() has
a different argument than count(). And in the second query (the one
that works) upper() has the same argument as count(). When using
count(*) it will always fail.

This is the the pattern that I have observed. If the arguments in the
aggregate and non-aggregate functions are the same, it runs; if the
arguments in the aggregate and non-aggregate functions are different, it
crashes.

I have attached a test script for anyone able to help with (or verify)
this problem.

create table t (
j integer,
k integer,
a varchar
);
insert into t values (1, 1, 'a');
insert into t values (2, 2, 'b');
insert into t values (2, 3, 'c');
insert into t values (3, 4, 'A');
insert into t values (3, 5, 'B');
insert into t values (3, 6, 'C');
insert into t values (4, 7, 'a');
insert into t values (4, 8, 'b');
insert into t values (4, 9, 'c');
insert into t values (4, 0, 'a');

-- OK
select upper(a) as x, count(a) from t group by x;

-- OK
select k/2 as x, max(k) from t group by x;

-- OK
-- select k as x, max(j) from t group by x;

-- OK
select upper(a) as x, count(k), count(a) from t group by x;

-- CRASH
select k/2 as x, max(j) from t group by x;

-- CRASH
select upper(a) as x, count(k) from t group by x;

-- CRASH
select upper(a) as x, count(xmin) from t group by x;

-- CRASH
select upper(a) as x, count(oid) from t group by x;

-- CRASH
select upper(a) as x, count(*) from t group by x;

It seems that the last case has corrected itself.

select upper(a) as x, count(*) from t group by x
WORKS

The other conditions are still a problem. In general, as long as the
argument in the grouped function is used somewhere else in the target list
there is no problem.

select func(x), x, aggfunc(y) group by func
WORKS

select func(x), aggfunc(y) group by func
CRASHES

Added to TODO:

* select upper(usename), count(usesysid) from pg_shadow group by 1 fails

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#5Vadim Mikheev
vadim@krs.ru
In reply to: Bruce Momjian (#4)
Re: [HACKERS] 6.4 Aggregate Bug and HAVING problems...

Added to TODO:

* select upper(usename), count(usesysid) from pg_shadow group by 1 fails

Fixed. Was broken by new HAVING code...

I see another problems in HAVING:

1.
drop table x;
create table x (a int, b int, c int);
insert into x values (1,1,0);
insert into x values (1,1,1);
select a, min (b) from x group by a having min(b) > c;
-- a|min
-- -+---
-- 1| 1
-- (1 row)
delete from x;
vacuum x;
insert into x values (1,1,1);
insert into x values (1,1,0);
select a, min (b) from x group by a having min(b) > c;
-- a|min
-- -+---
-- (0 rows)

Using C-column is illegal!

2.
vac=> select a, min (b) from x group by a having min(b) > 0 or a = 0;
^^^^^^^^
ERROR: This could have been done in a where clause!!
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
No, couldn't! Using GroupBy expr in having must be allowed.

I'm going to change HAVING behaviour...

Also, could someone test is HAVING without aggregates
disallowed or not:

select a, min (b) from x group by a having a = 0;

- ???

Vadim

#6Andreas Zeugswetter
andreas.zeugswetter@telecom.at
In reply to: Vadim Mikheev (#5)
Re: [HACKERS] 6.4 Aggregate Bug and HAVING problems...

Also, could someone test is HAVING without aggregates
disallowed or not:

select a, min (b) from x group by a having a = 0;

allowed in Informix:
a (min)
No rows found.

Andreas

#7Vadim Mikheev
vadim@krs.ru
In reply to: Andreas Zeugswetter (#6)
Re: [HACKERS] 6.4 Aggregate Bug and HAVING problems...

Andreas Zeugswetter wrote:

Also, could someone test is HAVING without aggregates
disallowed or not:

select a, min (b) from x group by a having a = 0;

allowed in Informix:
a (min)
No rows found.

Thanks, Andreas!
I'll comment out some code... Actually, non-aggregate expressions
could be moved to WHERE, but at the moment I'll just allow them in
HAVING.

Ok, there are also some problems with subselects in HAVING
but I haven't time to fix them now:

select a as a2, b as b2, c as c2 into table x2 from x;
select a, sum(b) from x group by a having avg(c) =
(select max(c2) from x2 where a2 = a/2);
-- ok

select a/2, sum(b) from x group by a/2 having avg(c) =
(select max(c2) from x2 where a2 = a/2);
-- ERROR: You must group by the attribute used from outside!
-- this means that GroupBy func doesn't work here...

select a, sum(b) from x group by a having avg(c) =
(select max(c2) from x2 where a2 = max(b));
-- ERROR: parser: aggregates not allowed in WHERE clause
-- Is this allowed in another dbms-es ???

-- This is not problem of HAVING but subselects...
select a as f, sum(b) from x group by f having avg(c) =
(select max(c2) from x2 where a2 = f);
-- ERROR: attribute 'f' not found
-- Should be aliasing handled in subselects ???

Vadim