Is this a bug?

Started by Prateek Sanyalabout 10 years ago3 messagesbugs
Jump to latest
#1Prateek Sanyal
sanyal.prateek@gmail.com

I discovered a difference between sqlite3 and psql and I am not sure if it
is intentional but it is definitely inconvenient.

If you use an alias for an aggregate function and then do a GROUP BY with
HAVING, you can't use that alias in the HAVING and you have to type in the
entire function again.

This is not the case with sqlite3 where you can just use the alias. Why
doesn't psql allow this? It seems more efficient.

Take a look at the following example:

-- Schema for sample table named "invoices":

-- id => int

-- customer_id => int

-- total_sale => decimal
Here is the query:
********************

SELECT SUM(total_sale) AS sum_sales FROM invoices GROUP BY customer_id
HAVING sum_sales > 20 ORDER BY sum_sales DESC;

********************

This code above works in sqlite3 but not psql. For psql I have to do the
following:

********************

SELECT SUM(total_sale) AS sum_sales FROM invoices GROUP BY customer_id
HAVING SUM(total_sale) > 20 ORDER BY sum_sales DESC;

********************

Regards,
Prateek.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Prateek Sanyal (#1)
Re: Is this a bug?

Prateek Sanyal <sanyal.prateek@gmail.com> writes:

I discovered a difference between sqlite3 and psql and I am not sure if it
is intentional but it is definitely inconvenient.

If you use an alias for an aggregate function and then do a GROUP BY with
HAVING, you can't use that alias in the HAVING and you have to type in the
entire function again.

This is not the case with sqlite3 where you can just use the alias. Why
doesn't psql allow this?

Because it is contrary to both the letter and the spirit of the SQL
standard. The SELECT's result list is logically computed after every
operation except ORDER BY, so it's really not sensible for clauses
like GROUP BY or HAVING to refer to outputs of the SELECT list.
What's more, it's ambiguous, because the same name might mean
different things depending on whether you consider it to be an
input column name or a result column name.

PG deviates from the spec to the extent of allowing GROUP BY items
to be output column names *as long as they are just that, and not
expressions*. (To my mind, that's probably a long-ago mistake,
but it's handy enough that there's never been any serious move to
remove it.) Even if we applied that policy to HAVING, which
we don't, your example wouldn't work because the reference is
within an expression.

SELECT SUM(total_sale) AS sum_sales FROM invoices GROUP BY customer_id
HAVING SUM(total_sale) > 20 ORDER BY sum_sales DESC;

This code is correct per spec and should work in any SQL DBMS. At least
as far as PG is concerned, there's no performance penalty, since the
common SUM() expression is evaluated only once anyway.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Prateek Sanyal
sanyal.prateek@gmail.com
In reply to: Tom Lane (#2)
Re: Is this a bug?

I was going to present my argument until I saw who signed off this email.
Thank you Mr. Lane. Now I will send a bug report to Microsoft and hope that
Bill Gates responds.

Regards,
Prateek.

On Fri, Jan 22, 2016 at 9:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Prateek Sanyal <sanyal.prateek@gmail.com> writes:

I discovered a difference between sqlite3 and psql and I am not sure if

it

is intentional but it is definitely inconvenient.

If you use an alias for an aggregate function and then do a GROUP BY with
HAVING, you can't use that alias in the HAVING and you have to type in

the

entire function again.

This is not the case with sqlite3 where you can just use the alias. Why
doesn't psql allow this?

Because it is contrary to both the letter and the spirit of the SQL
standard. The SELECT's result list is logically computed after every
operation except ORDER BY, so it's really not sensible for clauses
like GROUP BY or HAVING to refer to outputs of the SELECT list.
What's more, it's ambiguous, because the same name might mean
different things depending on whether you consider it to be an
input column name or a result column name.

PG deviates from the spec to the extent of allowing GROUP BY items
to be output column names *as long as they are just that, and not
expressions*. (To my mind, that's probably a long-ago mistake,
but it's handy enough that there's never been any serious move to
remove it.) Even if we applied that policy to HAVING, which
we don't, your example wouldn't work because the reference is
within an expression.

SELECT SUM(total_sale) AS sum_sales FROM invoices GROUP BY customer_id
HAVING SUM(total_sale) > 20 ORDER BY sum_sales DESC;

This code is correct per spec and should work in any SQL DBMS. At least
as far as PG is concerned, there's no performance penalty, since the
common SUM() expression is evaluated only once anyway.

regards, tom lane