Re: Postgres Planner Bug
Here is an email from Gavin showing a problem with subqueries and casts
causing errors when they shouldn't.
---------------------------------------------------------------------------
Gavin Sherry wrote:
Hi Bruce,
Thanks to a user query (handle: lltd, IRC) I came across a bug in the
planner. The query was:---
select o1.timestamp::date as date, count(*), (select sum(oi.price) from
"order" o2, "order_item" oi where oi.order_id = o2.id and
o2.timestamp::date = o1.timestamp::date and o2.timestamp is not
null) as total from "order" o1 where o1.timestamp is not null
group by o1.timestamp::date order by o1.timestamp::date desc;
---The error he was receiving:
---
ERROR: Sub-SELECT uses un-GROUPed attribute o1.timestamp from outer query
---After a bit of looking around, I determined that the cast in the order by
clause was causing the error, not the fact that the query had an ungrouped
attribute in the outer query.I have come up with a simpler demonstration which works under 7.1 and CVS.
create table a ( i int);
insert into a values(1);
insert into a values(1);
insert into a values(1);
insert into a values(1);
insert into a values(1);
insert into a values(2);
insert into a values(2);
insert into a values(2);
insert into a values(2);
insert into a values(3);
insert into a values(3);
insert into a values(3);
insert into a values(3);--- NO ERROR ---select o1.i::smallint,count(*),(select sum(o2.i) from a o2 where
o2.i=o1.i::smallint) as sum from a o1 group by o1.i;--- ERROR --- select o1.i::smallint,count(*),(select sum(o2.i) from a o2 where o2.i=o1.i::smallint) as sum from a o1 group by o1.i::smallint;----
Notice that the difference is only the cast in the order by clause. Here
are my results:template1=# select version();
version
------------------------------------------------------------------------
PostgreSQL 7.3devel on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
(1 row)template1=# \d a
Table "public.a"
Column | Type | Modifiers
--------+---------+-----------
i | integer |template1=# select * from a;
i
---
1
1
1
1
1
2
2
2
2
3
3
3
3
(13 rows)te1=# select o1.i::smallint,count(*),(select sum(o2.i) from a o2 where
o2.i=o1.i::smallint) as sum from a o1 group by o1.i;
i | count | sum
---+-------+-----
1 | 5 | 5
2 | 4 | 8
3 | 4 | 12
(3 rows)template1=# select o1.i::smallint,count(*),(select sum(o2.i) from a o2
where o2.i=o1.i::smallint) as sum from a o1 group by o1.i::smallint;
ERROR: Sub-SELECT uses un-GROUPed attribute o1.i from outer query[under patched version]
template1=# select o1.i::smallint,count(*),(select sum(o2.i) from a o2
where o2.i=o1.i::smallint) as sum from a o1 group by o1.i;
i | count | sum
---+-------+-----
1 | 5 | 5
2 | 4 | 8
3 | 4 | 12
(3 rows)template1=# select o1.i::smallint,count(*),(select sum(o2.i) from a o2
where o2.i=o1.i::smallint) as sum from a o1 group by o1.i::smallint;
i | count | sum
---+-------+-----
1 | 5 | 5
2 | 4 | 8
3 | 4 | 12
(3 rows)As it works out, the bug is caused by these lines in
optimizer/util/clauses.cif (equal(thisarg, lfirst(gl)))
{
contained_in_group_clause = true;
break;
}'thisarg' is an item from the args list used by Expr *. We only access
this code inside check_subplans_for_ungrouped_vars_walker() if the node is
a subplan. The problem is that equal() is not sufficiently intelligent to
consider the equality of 'thisarg' and lfirst(gl) (an arg from the group
by clause) considering that thisarg and lfirst(gl) are not necessarily of
the same node type. This means we fail out in equal():/*
* are they the same type of nodes?
*/
if (nodeTag(a) != nodeTag(b))
return false;The patch below 'fixes' this (and possibly breaks everything else). I
haven't tested it rigorously and it *just* special cases group by
clauses with functions in them. Here's the patch:Index: ./src/backend/optimizer/util/clauses.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/backend/optimizer/util/clauses.c,v retrieving revision 1.107 diff -2 -c -r1.107 clauses.c *** ./src/backend/optimizer/util/clauses.c 2002/08/31 22:10:43 1.107 --- ./src/backend/optimizer/util/clauses.c 2002/09/30 15:02:47 *************** *** 703,706 **** --- 703,718 ---- contained_in_group_clause = true; break; + } else { + if(IsA(lfirst(gl),Expr) && + length(((Expr *)lfirst(gl))->args) == 1 && + IsA(lfirst(((Expr *)lfirst(gl))->args),Var) ) { + + Var *tvar = (Var *) lfirst(((Expr *)lfirst(gl))->args); + if(var->varattno == tvar->varattno) { + contained_in_group_clause = true; + break; + } + + } } }----
There are two assumptions here: 1) the only time this bug occurs is when
the group by clause argument is an expression and a function at that (even
though I do no test for this correctly) 2) We can see whether thisarg ==
lfirst(gl) by looking at the varattno of each and comparing. It occurs to
me that this is just plain wrong and works only for the specific query.The reason why I've sent this email to you and not to the list is I do not
have time to follow through on this -- as much as I would like to. I
simply do no have the time. :-(Gavin
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Import Notes
Reply to msg id not found: Pine.LNX.4.21.0210010101080.20410-100000@linuxworld.com.au
Thanks to a user query (handle: lltd, IRC) I came across a bug in the
planner. The query was:---
select o1.timestamp::date as date, count(*), (select sum(oi.price) from
"order" o2, "order_item" oi where oi.order_id = o2.id and
o2.timestamp::date = o1.timestamp::date and o2.timestamp is not
null) as total from "order" o1 where o1.timestamp is not null
group by o1.timestamp::date order by o1.timestamp::date desc;
---The error he was receiving:
---
ERROR: Sub-SELECT uses un-GROUPed attribute o1.timestamp from outer query
---After a bit of looking around, I determined that the cast in the order by
clause was causing the error, not the fact that the query had an ungrouped
Mistake here. It relates to the group by clause, not order by.
Gavin
Bruce Momjian <pgman@candle.pha.pa.us> writes:
The patch below 'fixes' this (and possibly breaks everything else). I
haven't tested it rigorously and it *just* special cases group by
clauses with functions in them.
Surely this cure is worse than the disease.
The general problem is that we don't attempt to match
arbitrary-expression GROUP BY clauses against arbitrary subexpressions
of sub-SELECTs. While that could certainly be done, I'm concerned about
the cycles that we'd expend trying to match everything against
everything else. This would be an exponential cost imposed on every
group-by-with-subselect query whether it needed the feature or not.
Given that GROUP BY is restricted to a simple column reference in both
SQL92 and SQL99, is it worth a large performance hit on unrelated
queries to support this feature? What other DBs support it?
regards, tom lane
The following patch completes the open item:
Change log_min_error_statement to be off by default (Gavin)
Gavin was busy so I did the work. Basically, it allows fatal/panic as a
value, and defaults it to panic so it is effectively OFF by default.
There was agreement that we can allow these values as a way of turning
this option off. Because of this, we can continue using the same
validation routines for all the server message level GUC parameters.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Attachments:
/bjm/difftext/plainDownload
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/runtime.sgml,v
retrieving revision 1.141
diff -c -c -r1.141 runtime.sgml
*** doc/src/sgml/runtime.sgml 27 Sep 2002 02:04:39 -0000 1.141
--- doc/src/sgml/runtime.sgml 2 Oct 2002 16:05:57 -0000
***************
*** 1036,1050 ****
<term><varname>LOG_MIN_ERROR_STATEMENT</varname> (<type>string</type>)</term>
<listitem>
<para>
! This controls which log messages are accompanied by the original
! query which generated the message. All queries matching the setting
! or which are of a higher severity than the setting are logged. The
! default is <literal>ERROR</literal>. Valid values are
! <literal>DEBUG5</literal>, <literal>DEBUG4</literal>,
! <literal>DEBUG3</literal>, <literal>DEBUG2</literal>,
<literal>DEBUG1</literal>, <literal>INFO</literal>,
! <literal>NOTICE</literal>, <literal>WARNING</literal>
! and <literal>ERROR</literal>.
</para>
<para>
It is recommended you enable <literal>LOG_PID</literal> as well
--- 1036,1050 ----
<term><varname>LOG_MIN_ERROR_STATEMENT</varname> (<type>string</type>)</term>
<listitem>
<para>
! This controls which message types output the original query to
! the server logs. All queries matching the setting or higher are
! logged. The default is <literal>PANIC</literal>. Valid values
! are <literal>DEBUG5</literal>, <literal>DEBUG4</literal>,
! <literal>DEBUG3</literal>, <literal>DEBUG2</literal>,
<literal>DEBUG1</literal>, <literal>INFO</literal>,
! <literal>NOTICE</literal>, <literal>WARNING</literal>,
! <literal>ERROR</literal>, <literal>FATAL</literal>, and
! <literal>PANIC</literal>.
</para>
<para>
It is recommended you enable <literal>LOG_PID</literal> as well
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/misc/guc.c,v
retrieving revision 1.96
diff -c -c -r1.96 guc.c
*** src/backend/utils/misc/guc.c 22 Sep 2002 19:52:38 -0000 1.96
--- src/backend/utils/misc/guc.c 2 Oct 2002 16:06:09 -0000
***************
*** 104,110 ****
int log_min_error_statement = ERROR;
char *log_min_error_statement_str = NULL;
! const char log_min_error_statement_str_default[] = "error";
int server_min_messages = NOTICE;
char *server_min_messages_str = NULL;
--- 104,110 ----
int log_min_error_statement = ERROR;
char *log_min_error_statement_str = NULL;
! const char log_min_error_statement_str_default[] = "panic";
int server_min_messages = NOTICE;
char *server_min_messages_str = NULL;
***************
*** 2999,3004 ****
--- 2999,3015 ----
{
if (doit)
(*var) = ERROR;
+ }
+ /* We allow FATAL/PANIC for client-side messages too. */
+ else if (strcasecmp(newval, "fatal") == 0)
+ {
+ if (doit)
+ (*var) = FATAL;
+ }
+ else if (strcasecmp(newval, "panic") == 0)
+ {
+ if (doit)
+ (*var) = PANIC;
}
else
return NULL; /* fail */