Strange behavior with missing column in SQL function

Started by Marcelo Lacerdaalmost 8 years ago3 messagesgeneral

marceloslacerda@gmail.com

almost 8 years ago

Here's the code that reproduces the behavior:
http://paste.debian.net/1035412/

I have already discussed this in the IRC channel but there doesn't seem to
be a consensus on whether this is a bug here's a brief transcript of
RhodiumToad's opinion:

this isn't new, goes back to 9.1 at least
basically, the error path in sql_fn_post_column_ref is a bit confused.
seeing r.c it tries to resolve it as parameter.field, fails, and rather

than reporting the error directly as being a missing field, it just returns
with the reference unresolved

then the outer parser code, having failed to resolve it as table.column

and having had the hook function not override it, reports it on the
assumption that it's a missing table

so it's probably been this way for as long as named parameters have

worked in language sql

msl09: as far as I can tell it's just giving the wrong error in an error

path, everything that's supposed to work does work

msl09: but the error is definitely misleading

My question is "Is this a bug? Should it be reported?"

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Marcelo Lacerda (#1)

Re: Strange behavior with missing column in SQL function

Marcelo Lacerda <marceloslacerda@gmail.com> writes:

Here's the code that reproduces the behavior:
http://paste.debian.net/1035412/

For the archives' sake, the issue of concern here is what error
message to throw for

CREATE OR REPLACE FUNCTION myfunction(myrow mytable)
RETURNS INTEGER AS $$
SELECT myrow.c + myrow.b;
$$ LANGUAGE sql;

if mytable is a composite type that has no "c" column.

I have already discussed this in the IRC channel but there doesn't seem to
be a consensus on whether this is a bug here's a brief transcript of
RhodiumToad's opinion:

this isn't new, goes back to 9.1 at least
basically, the error path in sql_fn_post_column_ref is a bit confused.
seeing r.c it tries to resolve it as parameter.field, fails, and rather
than reporting the error directly as being a missing field, it just returns
with the reference unresolved
then the outer parser code, having failed to resolve it as table.column
and having had the hook function not override it, reports it on the
assumption that it's a missing table
so it's probably been this way for as long as named parameters have
worked in language sql

I can't get real excited about changing this. Yeah, it's obvious to a
human that the other error message wording would be more apropos here,
but there are nearby cases where that's not so. So I'm afraid we'd
just be improving some cases by de-improving others.

The core of the problem is what about ambiguous references such as

CREATE OR REPLACE FUNCTION myfunction(myrow mytable)
RETURNS INTEGER AS $$
SELECT myrow.c + myrow.b FROM myrow;
$$ LANGUAGE sql;

where "myrow" is a table with a different set of column names from
"mytable". The existing behavior for that is to seek the column name
in "myrow" (the table), failing that to seek it in the parameter,
and only to throw an error if both fail. You could make a reasonable
argument that it shouldn't work like that --- having the query-local
table name mask the parameter entirely would likely be better. But
I don't know if we can get away with changing that from a backwards
compatibility standpoint. Even if we thought we could, the existing
hook definition doesn't support doing that; we'd need some other API.

If we did change this, then it'd be reasonable to tighten the error
message behavior; but as things stand, sql_fn_post_column_ref doesn't
know what's going on so it's not in a position to throw a reliable
error message.

regards, tom lane

Marcelo Lacerda

marceloslacerda@gmail.com

almost 8 years ago

In reply to: Tom Lane (#2)

Re: Strange behavior with missing column in SQL function

CREATE OR REPLACE FUNCTION myfunction(myrow mytable)
RETURNS INTEGER AS $$
SELECT myrow.c + myrow.b FROM myrow;
$$ LANGUAGE sql;

where "myrow" is a table with a different set of column names from
"mytable". The existing behavior for that is to seek the column name
in "myrow" (the table), failing that to seek it in the parameter,
and only to throw an error if both fail.

Wow I never thought this would be possible. why didn't the designers of the
language use myrow mytable%ROWTYPE for rows of a table as a parameter,
given that it's a valid type in PL/PGSQL? I figure that way it would have
been way easier to disambiguate the definitions.