Odd error in complex query (7.2): Sub-SELECT uses un-GROUPed...
This is using an almost up-to-date CVS version.
Sorry for the convoluted example:
Create table t1(n text, f1 int, f2 int);
create table g1(n text, t1n text);
create table s1(k1 text, f1a int, f1b int, f2 int, x int, d timestamp);
create view v1 as select k1, d,
(select g1.n from g1, t1 where g1.t1n=t1.n and t1.f1 = s1.f1a and t1.f2 =
s1.f2 limit 1) as a,
(select g1.n from g1, t1 where g1.t1n=t1.n and t1.f1 = s1.f1b and t1.f2 =
s1.f2 limit 1) as b,
x
from
s1
;
explain select coalesce(a, b, 'other') as name, k1, sum(x) as tot
from v1 where
d>'28-oct-2001 12:00' and d<current_timestamp
group by 1,2 order by tot desc limit 40;
ERROR: Sub-SELECT uses un-GROUPed attribute s1.f2 from outer query
Maybe I am asking too much of views?
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
Sorry for the convoluted example:
A simplified example is
create table t1(n text, f1 int);
create table s1(f1a int, x int);
create view v1 as select x,
(select t1.n from t1 where t1.f1 = s1.f1a) as a
from s1;
select a from v1 group by 1;
ERROR: Sub-SELECT uses un-GROUPed attribute s1.f1a from outer query
The expanded-out equivalent of the problem query is
select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1
group by 1;
which I believe is indeed illegal. But it seems like it ought to be
legal with the view in between ... ie, a view isn't purely a macro.
The implementation issue here is how to decide not to pull up the view
subquery (ie, not to flatten the query into the illegal form). We
already do that for certain conditions; we just have to figure out what
additional restriction should be used to preclude this case. The
restriction should be as tight as possible to avoid losing the ability
to optimize queries using views.
A simplistic idea is to not pull up views that contain subselects in
the targetlist, but I have a feeling that's not the right restriction.
Or maybe it is --- maybe the point is that the view targetlist is
logically evaluated *before* the outer query executes, and we can't do
a pullup if evaluating it later would change the results.
Comments? I suspect this is trickier than it looks :-(
regards, tom lane
At 14:36 29/10/01 -0500, Tom Lane wrote:
The expanded-out equivalent of the problem query is
select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1
group by 1;which I believe is indeed illegal. But it seems like it ought to be
legal with the view in between ... ie, a view isn't purely a macro.
FWIW, MS SQL/Server won't even allow the view to be defined
Dec/RDB does, and it allows the query as well, with the following plannner
output:
Reduce Sort
Cross block of 2 entries
Cross block entry 1
Get Retrieval sequentially of relation S1
Cross block entry 2
Aggregate Conjunct Get
Retrieval sequentially of relation T1
It also allows:
select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1
group by (select t1.n from t1 where t1.f1 = s1.f1a);
with the same plan. Which does not, on the face of it, seem illegal to me.
RDB usually rewrites column-select-expressions as cross-joins (with
appropriate checking for multiple/no rows). Which seems to work well with
my expectations for both queries, although I presume this it not what the
spec says?
The implementation issue here is how to decide not to pull up the view
subquery (ie, not to flatten the query into the illegal form).
It's not clear to me that it should be illegal - for every row in s1, it
should return the result of the column-select (which may be NULL) - or is
that what 'not flattening the query' does?
We
already do that for certain conditions; we just have to figure out what
additional restriction should be used to preclude this case. The
restriction should be as tight as possible to avoid losing the ability
to optimize queries using views.
How about whenenever it will throw this error? ;-).,
A simplistic idea is to not pull up views that contain subselects in
the targetlist, but I have a feeling that's not the right restriction.
That does seem excessive. I'm way over my head here, but can a column
select be implemented as a special JOIN that always returns 1 row (maybe
NULL), and throws an error if more than one row?
Or maybe it is --- maybe the point is that the view targetlist is
logically evaluated *before* the outer query executes,
This is very nasty, and would really hurt the utility of views.
and we can't do
a pullup if evaluating it later would change the results.
Huh?
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
At 14:36 29/10/01 -0500, Tom Lane wrote:
The expanded-out equivalent of the problem query is
select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1
group by 1;
which I believe is indeed illegal.
Dec/RDB ... allows the query
It also allows:
select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1
group by (select t1.n from t1 where t1.f1 = s1.f1a);
with the same plan. Which does not, on the face of it, seem illegal to me.
Hmm. Maybe the query is legal, and the problem is just one of an
incorrect check for ungrouped vars in subselects. Need to think more.
regards, tom lane
At 14:36 29/10/01 -0500, Tom Lane wrote:
Philip Warner <pjw@rhyme.com.au> writes:
Sorry for the convoluted example:
A simplified example is
And here's a simpler one that seems to avoid views altogether:
create table lkp(f1 int);
create table t1(f1 int, x int);
Select
case when Exists(Select * From lkp where lkp.f1 = t1.f1) then
'known'
else
'unknown'
end as status,
sum(x)
from t1
group by 1;
It's pretty similar to the sample you gave, but also presents the sort of
operation people may well want to perform.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
Philip Warner wrote:
At 14:36 29/10/01 -0500, Tom Lane wrote:
Philip Warner <pjw@rhyme.com.au> writes:
Sorry for the convoluted example:
A simplified example is
And here's a simpler one that seems to avoid views altogether:
create table lkp(f1 int);
create table t1(f1 int, x int);Select
case when Exists(Select * From lkp where lkp.f1 = t1.f1) then
'known'
else
'unknown'
end as status,
sum(x)
from t1
group by 1;
A bit off-tppic question, but is our optimiser smart enough to
recognize the query inside exists as LIMIT 1 query ?
------------
Hannu
At 10:43 30/10/01 +0200, Hannu Krosing wrote:
A bit off-tppic question, but is our optimiser smart enough to
recognize the query inside exists as LIMIT 1 query ?
Yep.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
Select
case when Exists(Select * From lkp where lkp.f1 = t1.f1) then
'known'
else
'unknown'
end as status,
sum(x)
from t1
group by 1;
Okay, I'm convinced: the problem is that the test for ungrouped vars
used inside subselects is too simplistic. I think it's failing to
consider that if the whole subselect can be considered a grouped
expression, we shouldn't object to ungrouped individual vars within it.
Will work on it.
regards, tom lane
On Tue, 30 Oct 2001 11:49:28 +1100
Philip Warner wrote:
It's not clear to me that it should be illegal - for every row in s1, it
should return the result of the column-select (which may be NULL) - or is
that what 'not flattening the query' does?We
already do that for certain conditions; we just have to figure out what
additional restriction should be used to preclude this case. The
restriction should be as tight as possible to avoid losing the ability
to optimize queries using views.How about whenenever it will throw this error? ;-).,
A simplistic idea is to not pull up views that contain subselects in
the targetlist, but I have a feeling that's not the right restriction.That does seem excessive. I'm way over my head here, but can a column
select be implemented as a special JOIN that always returns 1 row (maybe
NULL), and throws an error if more than one row?
Hi,
I wouldn't think most people need a query like this, but also
had been in puzzle as to how not to pull up. Finally the
problem could be solved by using a statement of an ORDER BY.
Therefore, if you add an ORDER BY to a view of your complex
query, it will work correctly.
And, as long as each of correlative subselects which are
in columns always returns one row, I feel it is legal
rather than illegal that its subselects can be GROUPed.
-- on 7.1.2
create table t1(n text, f1 int, f2 int);
create table g1(n text, t1n text);
create table s1(k1 text, f1a int, f1b int, f2 int, x int, d timestamp);
create view v1 as
select k1, d,
(select g1.n from g1, t1
where g1.t1n=t1.n and t1.f1 = s1.f1a and t1.f2 = s1.f2 limit 1) as a,
(select g1.n from g1, t1
where g1.t1n=t1.n and t1.f1 = s1.f1b and t1.f2 = s1.f2 limit 1) as b,
x
from s1
order by 1 -- *** an additional statement ***
;
explain
select coalesce(a, b, 'other') as name, k1, sum(x) as tot
from v1
where d > '28-oct-2001 12:00' and d < current_timestamp
group by 1,2
order by tot desc limit 40;
Regards,
Masaru Sugawara
Okay, I'm convinced: the problem is that the test for ungrouped vars
used inside subselects is too simplistic.
Not only was that true, but the handling of GROUP BY expressions was
pretty grotty in general: they'd be re-evaluated at multiple levels of
the resulting plan tree. Which is not too bad for "GROUP BY a+b",
but it's unpleasant when a complex subselect is involved.
I've committed fixes to CVS.
regards, tom lane