BUG #1528: Rows returned that should be excluded by WHERE clause

Started by Peter Wrightabout 21 years ago45 messageshackersbugs
Jump to latest
#1Peter Wright
pete@flooble.net
hackersbugs

The following bug has been logged online:

Bug reference: 1528
Logged by: Peter Wright
Email address: pete@flooble.net
PostgreSQL version: 7.4.7, 8.0.1
Operating system: Debian Linux (unstable)
Description: Rows returned that should be excluded by WHERE clause
Details:

Hopefully this example SQL will paste correctly -
I think this demonstrates the problem much better than I could explain in
words. The bug is shown in the two
SELECT queries with a WHERE clause. Very bizarre.

The same bug crops up on 7.4.6, 7.4.7 and 8.0.1.

pete@serf [07/Mar 6:28:50] pts/10 !19 ~ $ createdb test1

CREATE DATABASE

pete@serf [07/Mar 6:28:59] pts/10 !20 ~ $ psql test1

Welcome to psql 7.4.7, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms

\h for help with SQL commands

\? for help on internal slash commands

\g or terminate with semicolon to execute query

\q to quit

test1=# create table t1 ( a smallint primary key, b smallint ) ;

NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for
table "t1"
CREATE TABLE

test1=# create table t2 ( a smallint primary key, b smallint ) ;

NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for
table "t2"
CREATE TABLE

test1=# insert into t1 values (1, 1);

INSERT 118413888 1

test1=# insert into t1 values (2, 2);

INSERT 118413889 1

test1=# insert into t2 values (1, 4);

INSERT 118413890 1

test1=# insert into t2 values (2, 8);

INSERT 118413891 1

test1=# select id, min(b) from ( select 1 as id, max(b) as b from t1 union
select 2 as id, max(b) from t2 ) as q1 group by id ;
id | min

----+-----

1 | 2

2 | 8

(2 rows)

test1=# create view qry1 as select id, min(b) from ( select 1 as id, max(b)
as b from t1 union select 2 as id, max(b) from t2 ) as q1 group by id ;

CREATE VIEW

test1=# select * from qry1 where id = 1;

id | min

----+-----

1 | 2

2 |

(2 rows)

test1=# select * from qry1 where id = 2;

id | min

----+-----

1 |

2 | 8

(2 rows)

test1=# select * from qry1;

id | min

----+-----

1 | 2

2 | 8

(2 rows)

test1=#

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Wright (#1)
hackersbugs
Re: BUG #1528: Rows returned that should be excluded by WHERE clause

"Peter Wright" <pete@flooble.net> writes:

Description: Rows returned that should be excluded by WHERE clause

Interesting point. The view and union don't seem to be the issue;
I think the problem can be expressed as

regression=# select 2 as id, max(b) from t2 having 2 = 1;
id | max
----+-----
2 |
(1 row)

Now, if this were a WHERE clause, I think the answer would be right:

regression=# select 2 as id, max(b) from t2 where 2 = 1;
id | max
----+-----
2 |
(1 row)

but since it's HAVING I think this is probably wrong. Looking at the
EXPLAIN output

regression=# explain select 2 as id, max(b) from t2 having 2 = 1;
QUERY PLAN
----------------------------------------------------------------
Aggregate (cost=3.68..3.68 rows=1 width=2)
-> Result (cost=0.00..3.14 rows=214 width=2)
One-Time Filter: false
-> Seq Scan on t2 (cost=0.00..3.14 rows=214 width=2)
(4 rows)

the issue is clearly that the known-false HAVING clause is pushed down
inside the aggregation, as though it were WHERE. The existing code
pushes down HAVING to WHERE if the clause contains no aggregates, but
evidently this is too simplistic. What are the correct conditions for
pushing down HAVING clauses to WHERE?

regards, tom lane

#3Gill, Jerry T.
JTGill@west.com
In reply to: Tom Lane (#2)
bugs
Re: BUG #1528: Rows returned that should be excluded by WHERE clause

Just an interesting side note here, this behavior is identical to DB2. I am not sure if that makes it correct or not, but here is an example.

[gill@c2n2 gill]$ db2 "select 2 as id, max(apn3) from phoenix.client where 2 =1"

ID 2
----------- ------
2 -

1 record(s) selected.

-jgill

-----Original Message-----
From: pgsql-bugs-owner@postgresql.org
[mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Tuesday, March 08, 2005 2:07 AM
To: Peter Wright
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #1528: Rows returned that should be excluded by
WHERE clause

"Peter Wright" <pete@flooble.net> writes:

Description: Rows returned that should be excluded by WHERE clause

Interesting point. The view and union don't seem to be the issue;
I think the problem can be expressed as

regression=# select 2 as id, max(b) from t2 having 2 = 1;
id | max
----+-----
2 |
(1 row)

Now, if this were a WHERE clause, I think the answer would be right:

regression=# select 2 as id, max(b) from t2 where 2 = 1;
id | max
----+-----
2 |
(1 row)

but since it's HAVING I think this is probably wrong. Looking at the
EXPLAIN output

regression=# explain select 2 as id, max(b) from t2 having 2 = 1;
QUERY PLAN
----------------------------------------------------------------
Aggregate (cost=3.68..3.68 rows=1 width=2)
-> Result (cost=0.00..3.14 rows=214 width=2)
One-Time Filter: false
-> Seq Scan on t2 (cost=0.00..3.14 rows=214 width=2)
(4 rows)

the issue is clearly that the known-false HAVING clause is pushed down
inside the aggregation, as though it were WHERE. The existing code
pushes down HAVING to WHERE if the clause contains no aggregates, but
evidently this is too simplistic. What are the correct conditions for
pushing down HAVING clauses to WHERE?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gill, Jerry T. (#3)
bugs
Re: BUG #1528: Rows returned that should be excluded by WHERE clause

"Gill, Jerry T." <JTGill@west.com> writes:

Just an interesting side note here, this behavior is identical to DB2. I am not sure if that makes it correct or not, but here is an example.
[gill@c2n2 gill]$ db2 "select 2 as id, max(apn3) from phoenix.client where 2 =1"

ID 2
----------- ------
2 -

1 record(s) selected.

In the WHERE case I think there's no question that the above is correct:
WHERE is defined to filter rows before application of aggregates, so
zero rows arrive at the MAX aggregate, and that means it produces a
NULL.

But HAVING is supposed to filter after aggregation, so I think probably
there should be no row out in that case.

What does DB2 do when you say HAVING 2 = 1?

regards, tom lane

#5Gill, Jerry T.
JTGill@west.com
In reply to: Tom Lane (#4)
bugs
Re: BUG #1528: Rows returned that should be excluded by WHERE clause

Sorry Tom, I missed a sentence in you previous email. My understanding of the having clause is that the row should be filtered. Here is the same example with the having clause in DB2.

[gill@c2n2 gill]$ db2 "select 2 as id, max(apn3) from phoenix.client having 2 =1"

ID 2
----------- ------

0 record(s) selected.

[gill@c2n2 gill]$ db2 "select 2 as id, max(apn3) from phoenix.client where 2 =1 having 2 = 1"

ID 2
----------- ------

0 record(s) selected.

-jgill

-----Original Message-----
From: pgsql-bugs-owner@postgresql.org
[mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Tuesday, March 08, 2005 11:15 AM
To: Gill, Jerry T.
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #1528: Rows returned that should be excluded by
WHERE clause

"Gill, Jerry T." <JTGill@west.com> writes:

Just an interesting side note here, this behavior is identical to DB2. I am not sure if that makes it correct or not, but here is an example.
[gill@c2n2 gill]$ db2 "select 2 as id, max(apn3) from phoenix.client where 2 =1"

ID 2
----------- ------
2 -

1 record(s) selected.

In the WHERE case I think there's no question that the above is correct:
WHERE is defined to filter rows before application of aggregates, so
zero rows arrive at the MAX aggregate, and that means it produces a
NULL.

But HAVING is supposed to filter after aggregation, so I think probably
there should be no row out in that case.

What does DB2 do when you say HAVING 2 = 1?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
hackersbugs
Re: BUG #1528: Rows returned that should be excluded by WHERE clause

I wrote:

I think the problem can be expressed as

regression=# select 2 as id, max(b) from t2 having 2 = 1;
id | max
----+-----
2 |
(1 row)

the issue is clearly that the known-false HAVING clause is pushed down
inside the aggregation, as though it were WHERE. The existing code
pushes down HAVING to WHERE if the clause contains no aggregates, but
evidently this is too simplistic. What are the correct conditions for
pushing down HAVING clauses to WHERE?

After reading the spec a little, I think that we have oversimplified our
handling of aggregate-free HAVING clauses. If you look in planner.c
you'll find that such a clause is converted into a WHERE clause, but
this is not what the spec says to do, and you can tell the difference
in cases like the above.

What the spec actually says, or at least implies, is that a HAVING
clause is to be evaluated only once per group --- where the "group"
is the whole table if there's no GROUP BY clause. The group is
to be discarded if the HAVING clause doesn't return true. SQL92 7.8:

1) Let T be the result of the preceding <from clause>, <where
clause>, or <group by clause>. If that clause is not a <group
by clause>, then T consists of a single group and does not have
a grouping column.

2) The <search condition> is applied to each group of T. The result
of the <having clause> is a grouped table of those groups of T
for which the result of the <search condition> is true.

So it's clear that what the above case should return is a grouped table
having no groups ... ie, no rows out. What we are actually returning is
one group containing no rows, which is visibly different because of the
presence of the aggregate function in the SELECT list.

There are really four cases to think about, depending on whether the
query has GROUP BY and on whether it has any aggregates outside the
HAVING clause:

1. No GROUP BY, no aggregates

Per spec, the HAVING clause should be evaluated once and either we
return the whole input or none of it. Since there are no grouped
columns and (by assumption) no aggregates in the HAVING clause, the
HAVING clause must in fact be variable-free, ie, it's a pseudoconstant
clause. (Only pseudoconstant, because it might contain outer-level
variables or volatile functions.) I think the correct implementation
in this case is to generate a gating Result node with the HAVING clause
as a one-time filter, so that we don't evaluate any of the query if the
HAVING is false. The current code gets this almost right: it will make
a variable-free WHERE clause into a Result gating condition *if it
contains no volatile functions*. So it's wrong for the volatile
function case but right otherwise.

2. GROUP BY, no aggregates

In this case the HAVING clause might contain references to the grouping
columns. It is legitimate to push down the HAVING to become WHERE,
but *only* if it doesn't contain any volatile functions --- otherwise it
might be possible to tell that the HAVING clause was executed more than
once. It would be useful to push down the HAVING if, for example, it
could become an indexscan qualifier. However if the HAVING condition
is expensive to compute (eg it contains a subselect) we'd probably be
better off not to push it into WHERE, but to arrange to evaluate it
only once per group. Right now the executor cannot support testing
such a condition, but I think it would be easy enough to improve nodeGroup.c
to allow testing a qual condition for each group.

3. No GROUP BY, has aggregates

As in case 1, the HAVING clause must be variable-free, so the best
implementation would be to put it into a gating Result node. It would
be correct to treat it the same way as we do for a HAVING clause
containing aggregates (ie, attach it as a qual condition to the Agg plan
node) --- but that would mean computing and throwing away the aggregate
result when the HAVING fails, when we could skip computing it altogether.

4. GROUP BY and has aggregates

This is really the same as case 2: we could push down the HAVING
condition if it contains no volatile functions, but unless it is
cheap to evaluate we are probably best off to attach it as a qual
condition to the Agg node, ie, evaluate it only once per group.
The only difference is that we don't need an executor fix to support
this, since Agg does quals already.

So, aside from the originally reported bug, there are two other problems
in this logic: it isn't ensuring that volatile functions will be
evaluated only once per group, and it isn't considering evaluation
cost in deciding whether a clause that could be converted to WHERE
should be or not.

I haven't yet tried to make a patch that fixes all of these things.
It'll likely come out complex enough that we don't want to back-patch
it into 8.0 or before. If so, I'll try to make a simpler variant that
fixes the semantic bugs but doesn't try to be smart about evaluation
cost.

Comments?

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
hackersbugs
We are not following the spec for HAVING without GROUP BY

I wrote in reference to bug#1528:

What the spec actually says, or at least implies, is that a HAVING
clause is to be evaluated only once per group --- where the "group"
is the whole table if there's no GROUP BY clause.

In fact, reading the spec more closely, it is clear that the presence
of HAVING turns the query into a grouped query even if there is no
GROUP BY. I quote SQL92 7.8 again:

7.8 <having clause>

Function

Specify a grouped table derived by the elimination of groups from
^^^^^^^^^^^^^^^^^^^^^^^
the result of the previously specified clause that do not meet the
<search condition>.

...

1) Let T be the result of the preceding <from clause>, <where
clause>, or <group by clause>. If that clause is not a <group
by clause>, then T consists of a single group and does not have
a grouping column.

2) The <search condition> is applied to each group of T. The result
of the <having clause> is a grouped table of those groups of T
^^^^^^^^^^^^^^^^^^
for which the result of the <search condition> is true.

This is quite clear that the output of a HAVING clause is a "grouped
table" no matter whether the query uses GROUP BY or aggregates or not.

What that means is that neither the HAVING clause nor the targetlist
can use any ungrouped columns except within aggregate calls; that is,

select col from tab having 2>1

is in fact illegal per SQL spec, because col isn't a grouping column
(there are no grouping columns in this query).

What we are currently doing with this construct is pretending that it
means

select col from tab where 2>1

but it does not mean that according to the spec.

As I look into this, I find that several warty special cases in the
parser and planner arise from our misunderstanding of this point,
and could be eliminated if we enforced the spec's interpretation.
In particular this whole business of "moving HAVING into WHERE" is
wrong and should go away.

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

regards, tom lane

#8Kevin Brown
kevin@sysexperts.com
In reply to: Tom Lane (#7)
hackersbugs
Re: We are not following the spec for HAVING without GROUP BY

Tom Lane wrote:

What that means is that neither the HAVING clause nor the targetlist
can use any ungrouped columns except within aggregate calls; that is,

select col from tab having 2>1

is in fact illegal per SQL spec, because col isn't a grouping column
(there are no grouping columns in this query).

[...]

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

Oracle does not allow such references. It issues "ORA-00979: not a
GROUP BY expression" when you try to hand it such a reference.

MS SQL Server does not allow such references either, yielding
"columnname is invalid in the HAVING clause because it is not
contained in either an aggregate function or the GROUP BY clause.".

Can't comment about DB2.

--
Kevin Brown kevin@sysexperts.com

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
hackersbugs
Re: We are not following the spec for HAVING without GROUP BY

Tom Lane <tgl@sss.pgh.pa.us> writes:

In particular this whole business of "moving HAVING into WHERE" is
wrong and should go away.

It sort of seems like "select aggregate(col) from tab" with no GROUP BY clause
is a bit of a special case. The consistent thing to do would be to return no
records. It's only due to the special case that SQL returns a single record
for this case.

It seems like this special case is the only way to expose this difference
between a WHERE clause and a HAVING clause with an aggregate-free expression.

It seems like all that's needed is a simple flag on the Aggregate node that
says whether to output a single record if there are no input records or to
output no records.

--
greg

#10Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Kevin Brown (#8)
hackersbugs
Re: We are not following the spec for HAVING without GROUP

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

Oracle does not allow such references. It issues "ORA-00979: not a
GROUP BY expression" when you try to hand it such a reference.

MS SQL Server does not allow such references either, yielding
"columnname is invalid in the HAVING clause because it is not
contained in either an aggregate function or the GROUP BY clause.".

Can't comment about DB2.

MySQL allows it:

mysql> create table tab (col integer);
Query OK, 0 rows affected (0.01 sec)

mysql> select col from tab having 2 > 1;
Empty set (0.00 sec)

mysql> insert into tab values (1);
Query OK, 1 row affected (0.00 sec)

mysql> select col from tab having 2 > 1;
+------+
| col |
+------+
| 1 |
+------+
1 row in set (0.00 sec)

Of course, that's not saying much!

Chris

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#10)
hackersbugs
Re: We are not following the spec for HAVING without GROUP

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

MySQL allows it:

A slightly tighter experiment shows that they treat HAVING like WHERE
in this case:

mysql> create table tab(col int);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into tab values(1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into tab values(2);
Query OK, 1 row affected (0.01 sec)

mysql> select col from tab having col>1;
+------+
| col |
+------+
| 2 |
+------+
1 row in set (0.00 sec)

I think it's fairly likely that they copied our misinterpretation ...

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
hackersbugs
Re: We are not following the spec for HAVING without GROUP BY

Greg Stark <gsstark@mit.edu> writes:

It sort of seems like "select aggregate(col) from tab" with no GROUP BY clause
is a bit of a special case. The consistent thing to do would be to return no
records.

I don't think so. SQL99 defines this stuff in a way that might make you
feel better: it says that the presence of either HAVING or any aggregate
functions in the target list implies "GROUP BY ()", which is the case
that they identify as <grand total> in the <group by clause> syntax.
Basically this legitimizes the concept of turning the whole input table
into one group, which is what's really going on here. We get this right
in the case where it's driven by the appearance of aggregate functions,
but not when it's just driven by HAVING.

It seems like all that's needed is a simple flag on the Aggregate node that
says whether to output a single record if there are no input records or to
output no records.

The implementation problem is that there *is* no aggregate node if there
are no aggregates. The definitional problem is that we are allowing
cases that are illegal per spec and are going to be difficult to
continue to support if we handle all the spec-required cases properly.

regards, tom lane

#13Mark Shewmaker
mark@primefactor.com
In reply to: Tom Lane (#7)
hackersbugs
Re: We are not following the spec for HAVING without GROUP

On Wed, 2005-03-09 at 21:21 -0500, Tom Lane wrote:

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

In Sybase:

1> select 2 as id, max(myfield) from mytable where 2=1
2> go
id
----------- ----------
2 NULL

(1 row affected)
1> select 2 as id, max(myfield) from mytable having 2=1
2> go
id
----------- ----------

(0 rows affected)

--
Mark Shewmaker
mark@primefactor.com

#14Dann Corbit
DCorbit@connx.com
In reply to: Mark Shewmaker (#13)
hackers
Re: We are not following the spec for HAVING without GROUP

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Wednesday, March 09, 2005 8:45 PM
To: Christopher Kings-Lynne
Cc: Kevin Brown; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] We are not following the spec for HAVING without
GROUP

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

DB2 does not like it.

This runs and returns data:

SELECT INFO5FILES.APAMT.DEBAMT FROM INFO5FILES.APAMT

SELECT statement run complete.

This fails to prepare:

SELECT INFO5FILES.APAMT.DEBAMT FROM INFO5FILES.APAMT having 1 > 2

Column DEBAMT or function specified in SELECT list not valid.

#15Michael Wimmer
newsgroup1@ecom.at
In reply to: Tom Lane (#12)
hackers
Re: We are not following the spec for HAVING without GROUP BY

Just a quick test on the DBMS I have available at work.

IBM Informix Dynamic Server Version 10.00.TC1TL
Error: The column (id) must be in the GROUP BY list.

Oracle 9.2.0.11
Returns the same records as if where would be used.

MSSQL Express 2005 Beta February TP
Error: Column 'tab.id' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.

MySQL 5.0.2
Returns the same records as if where would be used.

Sybase 12.5
Returns the same records as if where would be used.

Firebird 1.5.2
Error: GDS Exception. 335544569. Dynamic SQL Error
SQL error code = -104
No message for code 335544824 found.
null

Best regards,

Michael Wimmer

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
hackersbugs
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY

I wrote:

This is quite clear that the output of a HAVING clause is a "grouped
table" no matter whether the query uses GROUP BY or aggregates or not.

What that means is that neither the HAVING clause nor the targetlist
can use any ungrouped columns except within aggregate calls; that is,
select col from tab having 2>1
is in fact illegal per SQL spec, because col isn't a grouping column
(there are no grouping columns in this query).

Actually, it's even more than that: a query with HAVING and no GROUP BY
should always return 1 row (if the HAVING succeeds) or 0 rows (if not).
If there are no aggregates, the entire from/where clause can be thrown
away, because it can have no impact on the result!

Would those of you with access to other DBMSes try this:

create table tab (col integer);
select 1 from tab having 1=0;
select 1 from tab having 1=1;
insert into tab values(1);
insert into tab values(2);
select 1 from tab having 1=0;
select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows
from the 4 selects --- that is, the contents of tab make no difference
at all. (MySQL returns 0, 0, 0, and 2 rows, so they are definitely
copying our mistake...)

regards, tom lane

#17Gill, Jerry T.
JTGill@west.com
In reply to: Tom Lane (#16)
hackersbugs
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY

Here is your Sql run in a DB2 database.
connect to phoenix

Database Connection Information

Database server = DB2/LINUX 8.1.5
SQL authorization ID = GILL
Local database alias = PHOENIX

create table tab (col integer)
DB20000I The SQL command completed successfully.

select 1 from tab having 1=0

1
-----------

0 record(s) selected.

select 1 from tab having 1=1

1
-----------
1

1 record(s) selected.

insert into tab values(1)
DB20000I The SQL command completed successfully.

insert into tab values(2)
DB20000I The SQL command completed successfully.

select 1 from tab having 1=0

1
-----------

0 record(s) selected.

select 1 from tab having 1=1

1
-----------
1

1 record(s) selected.

Hope that helps.
-Jgill

-----Original Message-----
From: pgsql-bugs-owner@postgresql.org
[mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Thursday, March 10, 2005 11:45 AM
To: pgsql-hackers@postgresql.org; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] [HACKERS] We are not following the spec for HAVING
without GROUP BY

I wrote:

This is quite clear that the output of a HAVING clause is a "grouped
table" no matter whether the query uses GROUP BY or aggregates or not.

What that means is that neither the HAVING clause nor the targetlist
can use any ungrouped columns except within aggregate calls; that is,
select col from tab having 2>1
is in fact illegal per SQL spec, because col isn't a grouping column
(there are no grouping columns in this query).

Actually, it's even more than that: a query with HAVING and no GROUP BY
should always return 1 row (if the HAVING succeeds) or 0 rows (if not).
If there are no aggregates, the entire from/where clause can be thrown
away, because it can have no impact on the result!

Would those of you with access to other DBMSes try this:

create table tab (col integer);
select 1 from tab having 1=0;
select 1 from tab having 1=1;
insert into tab values(1);
insert into tab values(2);
select 1 from tab having 1=0;
select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows
from the 4 selects --- that is, the contents of tab make no difference
at all. (MySQL returns 0, 0, 0, and 2 rows, so they are definitely
copying our mistake...)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#18Kevin HaleBoyes
khaleboyes@chartwelltechnology.com
In reply to: Tom Lane (#16)
hackersbugs
Re: [HACKERS] We are not following the spec for HAVING without GROUP

Tom Lane wrote:

Would those of you with access to other DBMSes try this:

create table tab (col integer);
select 1 from tab having 1=0;
select 1 from tab having 1=1;
insert into tab values(1);
insert into tab values(2);
select 1 from tab having 1=0;
select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows
from the 4 selects --- that is, the contents of tab make no difference
at all. (MySQL returns 0, 0, 0, and 2 rows, so they are definitely
copying our mistake...)

regards, tom lane

From SQL server 2000 with a service pack, I get:

zero rows from the first query (having 1=0);
one row, col value 1, from second query (having 1=1);
...run inserts...
zero rows from the third query (having 1=0);
one row, col value 1, from forth query (having 1=1);

K.

#19Barry Lind
barry@xythos.com
In reply to: Gill, Jerry T. (#17)
hackersbugs
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY

On Oracle 9.2 you get 0, 0, 0, and 2 rows.

--Barry

SQL> create table tab (col integer);

Table created.

SQL> select 1 from tab having 1=0;

no rows selected

SQL> select 1 from tab having 1=1;

no rows selected

SQL> insert into tab values (1);

1 row created.

SQL> insert into tab values (2);

1 row created.

SQL> select 1 from tab having 1=0;

no rows selected

SQL> select 1 from tab having 1=1;

1
----------
1
1

SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.1.0 -
Production
JServer Release 9.2.0.1.0 - Production

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Thursday, March 10, 2005 9:45 AM
To: pgsql-hackers@postgresql.org; pgsql-bugs@postgresql.org
Subject: Re: [HACKERS] We are not following the spec for HAVING without
GROUP BY

I wrote:

This is quite clear that the output of a HAVING clause is a "grouped
table" no matter whether the query uses GROUP BY or aggregates or not.

What that means is that neither the HAVING clause nor the targetlist
can use any ungrouped columns except within aggregate calls; that is,
select col from tab having 2>1
is in fact illegal per SQL spec, because col isn't a grouping column
(there are no grouping columns in this query).

Actually, it's even more than that: a query with HAVING and no GROUP BY
should always return 1 row (if the HAVING succeeds) or 0 rows (if not).
If there are no aggregates, the entire from/where clause can be thrown
away, because it can have no impact on the result!

Would those of you with access to other DBMSes try this:

create table tab (col integer);
select 1 from tab having 1=0;
select 1 from tab having 1=1;
insert into tab values(1);
insert into tab values(2);
select 1 from tab having 1=0;
select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows
from the 4 selects --- that is, the contents of tab make no difference
at all. (MySQL returns 0, 0, 0, and 2 rows, so they are definitely
copying our mistake...)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

#20Gary Doades
gpd@gpdnet.co.uk
In reply to: Tom Lane (#16)
hackersbugs
Re: [HACKERS] We are not following the spec for HAVING without

Tom Lane wrote:

Would those of you with access to other DBMSes try this:

create table tab (col integer);
select 1 from tab having 1=0;
select 1 from tab having 1=1;
insert into tab values(1);
insert into tab values(2);
select 1 from tab having 1=0;
select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows

MS SQL Server 2000 returns 0, 1, 0 and 1 rows correctly.

Cheers,
Gary.

#21Michael Fuhr
mike@fuhr.org
In reply to: Gary Doades (#20)
hackersbugs
#22johnnnnnn
john@phaedrusdeinus.org
In reply to: Tom Lane (#16)
hackersbugs
#23Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Tom Lane (#16)
hackersbugs
#24Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Tom Lane (#16)
hackersbugs
#25Mark Shewmaker
mark@primefactor.com
In reply to: Tom Lane (#16)
hackersbugs
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Barry Lind (#19)
hackersbugs
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Wright (#1)
hackersbugs
#28John R Pierce
pierce@hogranch.com
In reply to: Jaime Casanova (#23)
hackersbugs
#29Peter Wright
pete@flooble.net
In reply to: Peter Wright (#1)
hackersbugs
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Wright (#29)
hackersbugs
#31Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#16)
hackersbugs
#32Terry Yapt
yapt@NOtechSAPMnovell.com
In reply to: Peter Wright (#1)
hackersbugs
#33Bruce Momjian
bruce@momjian.us
In reply to: Mark Shewmaker (#13)
hackersbugs
#34Bruno Wolff III
bruno@wolff.to
In reply to: Bruce Momjian (#33)
hackersbugs
#35Bruce Momjian
bruce@momjian.us
In reply to: Bruno Wolff III (#34)
hackersbugs
#36Bruno Wolff III
bruno@wolff.to
In reply to: Bruce Momjian (#35)
hackersbugs
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#33)
hackersbugs
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruno Wolff III (#34)
hackersbugs
#39Bruno Wolff III
bruno@wolff.to
In reply to: Tom Lane (#38)
hackersbugs
#40Bruce Momjian
bruce@momjian.us
In reply to: Bruno Wolff III (#39)
hackersbugs
#41Manfred Koizar
mkoi-pg@aon.at
In reply to: Mark Kirkwood (#24)
hackersbugs
#42Dennis Bjorklund
db@zigo.dhs.org
In reply to: Bruce Momjian (#40)
hackersbugs
#43Bruce Momjian
bruce@momjian.us
In reply to: Dennis Bjorklund (#42)
hackersbugs
#44Dennis Bjorklund
db@zigo.dhs.org
In reply to: Bruce Momjian (#43)
hackersbugs
#45Bruce Momjian
bruce@momjian.us
In reply to: Dennis Bjorklund (#44)
hackersbugs