aliases break my query

Started by Joseph Shraibmanover 25 years ago15 messages
#1Joseph Shraibman
jks@selectacast.net

These two queries are exactly alike. The first one uses aliases except
for the order by. The second uses aliases also for the order by. The
third uses whole names. The third has the behavior I want.

Someone please tell me what I am doing wrong. I don't want to have to
use whole names for my query.

The data for the tables are at the end.

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by tablea.a;
a|b|c|?column?
-+-+-+--------
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 1
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 1
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 1
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 1
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 1
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 1
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 1
3|4|5| 0
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 1
4|5|4| 0
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 1
1|2| | 0
2|3|4| 0
3|4|5| 0
4|5|4| 0
(80 rows)

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by ta.a;
a|b|c|?column?
-+-+-+--------
1|2| | 0
1|2| | 0
1|2| | 0
1|2| | 0
1|2| | 0
2|3|4| 1
2|3|4| 1
2|3|4| 0
2|3|4| 0
2|3|4| 0
3|4|5| 0
3|4|5| 0
3|4|5| 1
3|4|5| 0
3|4|5| 0
4|5|4| 0
4|5|4| 0
4|5|4| 0
4|5|4| 1
4|5|4| 0
(20 rows)

playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
where tableb.yy = tablea.a) order by tablea.a;
a|b|c|?column?
-+-+-+--------
1|2| | 0
2|3|4| 2
3|4|5| 1
4|5|4| 1
(4 rows)

playpen=>
playpen=> select * from tablea;
a|b|c
-+-+-
1|2|
2|3|4
3|4|5
4|5|4
(4 rows)

playpen=> select * from tableb;
yy|zz
--+--
2| 4
2| 5
3| 9
4|14
5|15
(5 rows)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#1)
Re: aliases break my query

Joseph Shraibman <jks@selectacast.net> writes:

These two queries are exactly alike. The first one uses aliases except
for the order by. The second uses aliases also for the order by. The
third uses whole names. The third has the behavior I want.

I think you are confusing yourself by leaving out FROM clauses.
In particular, with no FROM for the inner SELECT it's not real clear
what should happen there. I can tell you what *is* happening, but
who's to say if it's right or wrong?

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by tablea.a;

[ produces 80 rows ]

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by ta.a;

[ produces 20 rows ]

The difference between these two is that by explicitly specifying
"tablea" in the order-by clause, you've created a three-way join,
as if you had written "from tablea ta, tableb tb, tablea tablea".
Once you write an alias in a from-clause entry, you must refer to
that from-clause entry by its alias, not by its true table name.

Meanwhile, what of the inner select? It has no FROM clause *and*
no valid table names. The only way to interpret the names in it
is as references to the outer select. So, on any given iteration
of the outer select, the inner select collapses to constants.
It looks like "SELECT count(constant1) WHERE constant2 = constant3"
and so you get either 0 or 1 depending on whether tb.yy and ta.a
from the outer scan are different or equal.

playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
where tableb.yy = tablea.a) order by tablea.a;

[ produces 4 rows ]

Here the outer select is not a join at all --- it mentions only tablea,
so you are going to get one output for each tablea row. The inner
select looks like "select count (zz) FROM tableb WHERE yy = <constant>",
so you get an actual scan of tableb for each iteration of the outer
scan.

It's not very clear from these examples what you actually wanted to have
happen, but I suggest that you will have better luck if you specify
explicit FROM lists in both the inner and outer selects, and be careful
that each variable you use clearly refers to exactly one of the
FROM-list entries.

regards, tom lane

#3Peter Eisentraut
e99re41@DoCS.UU.SE
In reply to: Tom Lane (#2)
Re: aliases break my query

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by tablea.a;

[ produces 80 rows ]

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by ta.a;

[ produces 20 rows ]

playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
where tableb.yy = tablea.a) order by tablea.a;

[ produces 4 rows ]

Once again, I think that we *really* need to discuss whether implicit
range table entries in SELECT are a good idea. We invariably get a
question like this every week and invariably the answer is "if you give a
table an alias you *must* refer to it by that alias". (I'm sure Tom has
this reply automated by now.) I claim the only thing that buys is
confusion for very little convenience at the other end.

Stop the madness! :)

--
Peter Eisentraut Sernanders v�g 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#3)
Re: aliases break my query

Peter Eisentraut <e99re41@DoCS.UU.SE> writes:

Once again, I think that we *really* need to discuss whether implicit
range table entries in SELECT are a good idea. We invariably get a
question like this every week and invariably the answer is "if you give a
table an alias you *must* refer to it by that alias". (I'm sure Tom has
this reply automated by now.)

No, this one was actually a pretty original way of shooting oneself in
the foot ;-). I thought the interesting point was the confusion between
whether variables in the inner select were supposed to be local to the
inner select or references to the outer select. I'm not sure getting
rid of implicit rangetable entries would've helped prevent that.

I claim the only thing that buys is
confusion for very little convenience at the other end.

Stop the madness! :)

I doubt that it's worth breaking a lot of existing applications for.

At one time Bruce had made some patches to emit informative notice
messages about implicit FROM entries, but that got turned off again
for reasons that I forget...

regards, tom lane

#5Joseph Shraibman
jks@selectacast.net
In reply to: Peter Eisentraut (#3)
Re: aliases break my query

Peter Eisentraut wrote:

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by tablea.a;

[ produces 80 rows ]

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by ta.a;

[ produces 20 rows ]

playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
where tableb.yy = tablea.a) order by tablea.a;

[ produces 4 rows ]

Once again, I think that we *really* need to discuss whether implicit
range table entries in SELECT are a good idea.

What is an "implicit range table entry"?

We invariably get a

question like this every week and invariably the answer is "if you give a
table an alias you *must* refer to it by that alias".

Hey, I *did* do that in the second query, and that still produced extra
results. I tried putting the aliases in the inner select too but that
didn't help. In fact the inner select always is 4 in that case. Unless I
only alias tableb in the inner query, and let it get the definition of
tablea from the outer query.

(I'm sure Tom has

Show quoted text

this reply automated by now.) I claim the only thing that buys is
confusion for very little convenience at the other end.

Stop the madness! :)

--
Peter Eisentraut Sernanders v�g 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

#6Joseph Shraibman
jks@selectacast.net
In reply to: Joseph Shraibman (#1)
Re: aliases break my query

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

These two queries are exactly alike. The first one uses aliases except
for the order by. The second uses aliases also for the order by. The
third uses whole names. The third has the behavior I want.

I think you are confusing yourself by leaving out FROM clauses.
In particular, with no FROM for the inner SELECT it's not real clear
what should happen there. I can tell you what *is* happening, but
who's to say if it's right or wrong?

Well I assumed that the aliases would be inerited from the outer query.

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by tablea.a;

[ produces 80 rows ]

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by ta.a;

[ produces 20 rows ]

The difference between these two is that by explicitly specifying
"tablea" in the order-by clause, you've created a three-way join,
as if you had written "from tablea ta, tableb tb, tablea tablea".
Once you write an alias in a from-clause entry, you must refer to
that from-clause entry by its alias, not by its true table name.

I guess I made the mistake of assuming that SQL is logical. I don't know
what I was thinking. ;)

Meanwhile, what of the inner select? It has no FROM clause *and*
no valid table names. The only way to interpret the names in it
is as references to the outer select. So, on any given iteration
of the outer select, the inner select collapses to constants.
It looks like "SELECT count(constant1) WHERE constant2 = constant3"
and so you get either 0 or 1 depending on whether tb.yy and ta.a
from the outer scan are different or equal.

OK that sorta makes sense to be. What I want is the behavior I got with
the third query (below). I want the values in table a, and then a count
of how many entries in tableb have the yy field of tableb that matches
that entry in tablea's a field.

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) from tableb tb
where tb.yy = ta.a) from tablea ta, tableb tb group by ta.a, ta.b, ta.c
order by ta.a;
a|b|c|?column?
-+-+-+--------
1|2| | 0
2|3|4| 2
3|4|5| 1
4|5|4| 1
(4 rows)

... which is what I want. Thanks.

Show quoted text

playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
where tableb.yy = tablea.a) order by tablea.a;

[ produces 4 rows ]

Here the outer select is not a join at all --- it mentions only tablea,
so you are going to get one output for each tablea row. The inner
select looks like "select count (zz) FROM tableb WHERE yy = <constant>",
so you get an actual scan of tableb for each iteration of the outer
scan.

It's not very clear from these examples what you actually wanted to have
happen, but I suggest that you will have better luck if you specify
explicit FROM lists in both the inner and outer selects, and be careful
that each variable you use clearly refers to exactly one of the
FROM-list entries.

regards, tom lane

#7Zeugswetter Andreas
andreas.zeugswetter@telecom.at
In reply to: Peter Eisentraut (#3)
Re: Re: [SQL] aliases break my query

I claim the only thing that buys is
confusion for very little convenience at the other end.

Stop the madness! :)

I doubt that it's worth breaking a lot of existing applications for.

At one time Bruce had made some patches to emit informative notice
messages about implicit FROM entries, but that got turned off again
for reasons that I forget...

I think we could get agreement to not allow implicit from entries
if there is a from clause in the statement, but allow them if a from clause
is missing altogether. The patch did not distinguish the two cases.

Andreas

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas (#7)
Re: Re: [SQL] aliases break my query

"Zeugswetter Andreas" <andreas.zeugswetter@telecom.at> writes:

I think we could get agreement to not allow implicit from entries
if there is a from clause in the statement, but allow them if a from clause
is missing altogether. The patch did not distinguish the two cases.

Hmm, that's a thought. Taking it a little further, how about this:

"Emit a notice [or error if you insist] when an implicit FROM item is
added that refers to the same underlying table as any existing FROM
item."

95% of the complaints I can remember seeing were from people who got
confused by the behavior of "FROM table alias" combined with a reference
like "table.column". Seems to me the above rule would catch this case
without being obtrusive in the useful cases. Comments?

regards, tom lane

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#8)
Re: Re: [SQL] aliases break my query

Tom Lane writes:

"Zeugswetter Andreas" <andreas.zeugswetter@telecom.at> writes:

I think we could get agreement to not allow implicit from entries
if there is a from clause in the statement, but allow them if a from clause
is missing altogether.

That's what I had in mind.

"Emit a notice [or error if you insist] when an implicit FROM item is
added that refers to the same underlying table as any existing FROM
item."

That's a step in the right direction, but I'd still like to catch

SELECT a.a1, b.b1 FROM a;

SELECT a.a1 FROM a WHERE a.a2 = b.b1;

both of which are more or less obviously incorrect and easily fixed.

95% of the complaints I can remember seeing were from people who got
confused by the behavior of "FROM table alias" combined with a reference
like "table.column". Seems to me the above rule would catch this case
without being obtrusive in the useful cases. Comments?

--
Peter Eisentraut Sernanders v�g 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#9)
Re: Re: [SQL] aliases break my query

Peter Eisentraut <peter_e@gmx.net> writes:

"Emit a notice [or error if you insist] when an implicit FROM item is
added that refers to the same underlying table as any existing FROM
item."

That's a step in the right direction, but I'd still like to catch
SELECT a.a1, b.b1 FROM a;
SELECT a.a1 FROM a WHERE a.a2 = b.b1;
both of which are more or less obviously incorrect and easily fixed.

More or less obviously nonstandard, you mean. It's unlikely that
either of those examples are incorrect in the sense of not doing what
the user expected them to.

If we were working in a green field then I'd agree that we ought to be
100% SQL-spec-compliant on this point. But as is, we are talking about
rejecting an extension that Postgres has always had and a lot of people
find useful. I'm not eager to do that; I think it'd be putting pedantry
ahead of usefulness and backwards-compatibility. What I want to see is
the minimum restriction that will catch likely errors, not an "I'll
annoy you until you change your queries to meet the letter of the spec"
kind of message.

regards, tom lane

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#8)
Re: Re: [SQL] aliases break my query

"Zeugswetter Andreas" <andreas.zeugswetter@telecom.at> writes:

I think we could get agreement to not allow implicit from entries
if there is a from clause in the statement, but allow them if a from clause
is missing altogether. The patch did not distinguish the two cases.

Hmm, that's a thought. Taking it a little further, how about this:

"Emit a notice [or error if you insist] when an implicit FROM item is
added that refers to the same underlying table as any existing FROM
item."

95% of the complaints I can remember seeing were from people who got
confused by the behavior of "FROM table alias" combined with a reference
like "table.column". Seems to me the above rule would catch this case
without being obtrusive in the useful cases. Comments?

Yes, I even added a define called FROM_WARN. It was disabled, and never
enabled. When can we enable it?

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Mike Mascari
mascarm@mascari.com
In reply to: Bruce Momjian (#11)
Re: Re: [SQL] aliases break my query

Bruce Momjian wrote:

"Zeugswetter Andreas" <andreas.zeugswetter@telecom.at> writes:

I think we could get agreement to not allow implicit from entries
if there is a from clause in the statement, but allow them if a from clause
is missing altogether. The patch did not distinguish the two cases.

Hmm, that's a thought. Taking it a little further, how about this:

"Emit a notice [or error if you insist] when an implicit FROM item is
added that refers to the same underlying table as any existing FROM
item."

95% of the complaints I can remember seeing were from people who got
confused by the behavior of "FROM table alias" combined with a reference
like "table.column". Seems to me the above rule would catch this case
without being obtrusive in the useful cases. Comments?

Yes, I even added a define called FROM_WARN. It was disabled, and never
enabled. When can we enable it?

How about a SET variable which allows PostgreSQL to reject any
queries which are not entirely within the specificaton; kind of
like -ansi -pedantic with gcc? Perhaps that's quite a bit of
work, but it seems quite valuable for developing portable
applications...Of course dependency on PostgreSQL extensions
isn't a bad thing either ;-)

Mike Mascari

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mike Mascari (#12)
Re: Re: [SQL] aliases break my query

Mike Mascari <mascarm@mascari.com> writes:

How about a SET variable which allows PostgreSQL to reject any
queries which are not entirely within the specificaton; kind of
like -ansi -pedantic with gcc? Perhaps that's quite a bit of
work, but it seems quite valuable for developing portable
applications...Of course dependency on PostgreSQL extensions
isn't a bad thing either ;-)

Hmm. Some aspects of that seem fairly straightforward, like rejecting
the table-not-in-FROM extension being discussed here. On the other
hand, it'd be painful to check for uses of datatypes or functions not
present in the standard.

In any case, I think the general reaction will be "good idea but a huge
amount of work compared to the reward". Unless someone steps forward
who's willing to do the work, I'd bet this won't happen...

regards, tom lane

#14Andreas Zeugswetter
andreas.zeugswetter@telecom.at
In reply to: Tom Lane (#8)
Re: Re: [SQL] aliases break my query

On Fri, 26 May 2000, Tom Lane wrote:

"Zeugswetter Andreas" <andreas.zeugswetter@telecom.at> writes:

I think we could get agreement to not allow implicit from entries
if there is a from clause in the statement, but allow them if a from clause
is missing altogether. The patch did not distinguish the two cases.

Hmm, that's a thought. Taking it a little further, how about this:

"Emit a notice [or error if you insist] when an implicit FROM item is
added that refers to the same underlying table as any existing FROM
item."

95% of the complaints I can remember seeing were from people who got
confused by the behavior of "FROM table alias" combined with a reference
like "table.column". Seems to me the above rule would catch this case
without being obtrusive in the useful cases. Comments?

I guess I would be more strict on the reason, that people playing with implicit
from entries usually know what they are doing, and thus know how to avoid a from
clause if they want that behavior. I don't see a reason to have one table in the
from clause but not another. This is too misleading for me.

Andreas

#15Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Peter Eisentraut (#3)
Re: [HACKERS] Re: aliases break my query

At one time Bruce had made some patches to emit informative notice
messages about implicit FROM entries, but that got turned off again
for reasons that I forget...

It was triggered with common cases from the "outer join" syntax. It took
a while to track down since it was introduced while I was working on the
syntax feature :(

If it *really* needs to be put back in, then we should do so with a flag
so we can disable the warning at compile time, run time, and/or in the
outer join parser area. But imho sprinkling the parser with warnings for
allowed syntax is heading the wrong direction. If it is legal, allow it.
If it is illegal, disallow it. If it is confusing for some, but works
fine for others, it shouldn't become "sort of legal" with a warning.

- Thomas