Using scalar function as set-returning: bug or feature?

Started by Konstantin Knizhnikalmost 8 years ago12 messages
#1Konstantin Knizhnik
k.knizhnik@postgrespro.ru

Hi hackers,

I wonder if the following behavior is considered to be a bug in plpgsql
or it is expected result:

create table my_data(id serial primary key, time timestamp, type text);

create or replace function my_insert(type text) RETURNS BOOLEAN
AS
$BODY$
BEGIN
    insert into my_data (time, type) values (now(), type);
    return true;
END
$BODY$
LANGUAGE plpgsql;

create or replace function my_call_insert() RETURNS BOOLEAN
AS
$BODY$
DECLARE
  b boolean;
BEGIN
    select into b from my_insert('from func atx');
    return b;
END
$BODY$
LANGUAGE plpgsql;

select my_call_insert();
 my_call_insert
----------------

(1 row)

================================================

So, if function returning boolean is used in from list, then no error or
warning is produced, but null value is assigned to the target variable.
If this code is rewritten in more natural way:

    b := my_insert('from func atx');
or
   select my_insert('from func atx') into b;

then function returns expected result (true).
I spent some time investigating this code under debugger and looks like
the reason of the problem is that tuple descriptor for the row returned
by my_insert() contains no columns (number of attributes is zero). May
be there are some reasons for such behavior, but I find it quite
confusing and unnatural. I prefer to report error in this case or return
tuple with single column, containing return value of the function.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In reply to: Konstantin Knizhnik (#1)
Re: Using scalar function as set-returning: bug or feature?

Hello

select into b from my_insert('from func atx');

You missed select something into b. For example,
select ret into b from my_insert('from func atx') as ret;

Using scalar function in from is not bug.
Silent assigning NULL for variables in "into" not matches same in "select"... I think better would be raise warning.

regards, Sergei

#3Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Sergei Kornilov (#2)
Re: Using scalar function as set-returning: bug or feature?

On 09.02.2018 10:47, Sergei Kornilov wrote:

Hello

select into b from my_insert('from func atx');

You missed select something into b. For example,
select ret into b from my_insert('from func atx') as ret;

Using scalar function in from is not bug.
Silent assigning NULL for variables in "into" not matches same in "select"... I think better would be raise warning.

regards, Sergei

Thank you.
Really the problem is caused by empty source list for INTO.
If I rewrite query as

    select my_insert into b from my_insert('from func atx');

then it works as expected.
But I wonder if the original statement should be considered as error or
at least we should produce warning for such empty projections?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#4Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Konstantin Knizhnik (#3)
1 attachment(s)
Re: Using scalar function as set-returning: bug or feature?

On 09.02.2018 11:02, Konstantin Knizhnik wrote:

On 09.02.2018 10:47, Sergei Kornilov wrote:

Hello

select into b from my_insert('from func atx');

You missed select something into b. For example,
select ret into b from my_insert('from func atx') as ret;

Using scalar function in from is not bug.
Silent assigning NULL for variables in "into" not matches same in
"select"... I think better would be raise warning.

regards, Sergei

Thank you.
Really the problem is caused by empty source list for INTO.
If I rewrite query as

    select my_insert into b from my_insert('from func atx');

then it works as expected.
But I wonder if the original statement should be considered as error
or at least we should produce warning for such empty projections?

Attached please find patch reporting error in case of empty attribute
list for SELECT INTO.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

plpgsql.patchtext/x-patch; name=plpgsql.patchDownload
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 7f52849..c43d572 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -2853,6 +2853,7 @@ make_execsql_stmt(int firsttoken, int location)
 	int					prev_tok;
 	bool				have_into = false;
 	bool				have_strict = false;
+	bool                have_attributes = false;
 	int					into_start_loc = -1;
 	int					into_end_loc = -1;
 
@@ -2907,6 +2908,8 @@ make_execsql_stmt(int firsttoken, int location)
 				continue;		/* INSERT INTO is not an INTO-target */
 			if (firsttoken == K_IMPORT)
 				continue;		/* IMPORT ... INTO is not an INTO-target */
+			if (!have_attributes)
+				yyerror("Empty list for INTO-target");
 			if (have_into)
 				yyerror("INTO specified more than once");
 			have_into = true;
@@ -2915,6 +2918,8 @@ make_execsql_stmt(int firsttoken, int location)
 			read_into_target(&rec, &row, &have_strict);
 			plpgsql_IdentifierLookup = IDENTIFIER_LOOKUP_EXPR;
 		}
+		else
+			have_attributes = true;
 	}
 
 	plpgsql_IdentifierLookup = save_IdentifierLookup;
#5Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Konstantin Knizhnik (#4)
Re: Using scalar function as set-returning: bug or feature?

On 09.02.2018 11:58, Konstantin Knizhnik wrote:

On 09.02.2018 11:02, Konstantin Knizhnik wrote:

On 09.02.2018 10:47, Sergei Kornilov wrote:

Hello

select into b from my_insert('from func atx');

You missed select something into b. For example,
select ret into b from my_insert('from func atx') as ret;

Using scalar function in from is not bug.
Silent assigning NULL for variables in "into" not matches same in
"select"... I think better would be raise warning.

regards, Sergei

Thank you.
Really the problem is caused by empty source list for INTO.
If I rewrite query as

    select my_insert into b from my_insert('from func atx');

then it works as expected.
But I wonder if the original statement should be considered as error
or at least we should produce warning for such empty projections?

Attached please find patch reporting error in case of empty attribute
list for SELECT INTO.

Sorry, this patch doesn't correctly handle case:

SELECT INTO [STRICT] target select_expressions FROM ...;

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#6Marko Tiikkaja
marko@joh.to
In reply to: Konstantin Knizhnik (#4)
Re: Using scalar function as set-returning: bug or feature?

On Fri, Feb 9, 2018 at 9:58 AM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

Attached please find patch reporting error in case of empty attribute list
for SELECT INTO

This is quite short-sighted. The better way to do this is to complain if
the number of expressions is different from the number of target variables
(and the target variable is not a record-ish type). There's been at least
two patches for this earlier (one my me, and one by, I think Pavel
Stehule). I urge you to dig around in the archives to avoid wasting your
time.

.m

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Marko Tiikkaja (#6)
Re: Using scalar function as set-returning: bug or feature?

2018-02-09 12:02 GMT+01:00 Marko Tiikkaja <marko@joh.to>:

On Fri, Feb 9, 2018 at 9:58 AM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

Attached please find patch reporting error in case of empty attribute
list for SELECT INTO

This is quite short-sighted. The better way to do this is to complain if
the number of expressions is different from the number of target variables
(and the target variable is not a record-ish type). There's been at least
two patches for this earlier (one my me, and one by, I think Pavel
Stehule). I urge you to dig around in the archives to avoid wasting your
time.

This issue can be detected by plpgsql_check and commitfest pipe is patch
that raise warning or error in this case.

Regards

Pavel

Show quoted text

.m

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#7)
Re: Using scalar function as set-returning: bug or feature?

Pavel Stehule <pavel.stehule@gmail.com> writes:

2018-02-09 12:02 GMT+01:00 Marko Tiikkaja <marko@joh.to>:

This is quite short-sighted. The better way to do this is to complain if
the number of expressions is different from the number of target variables
(and the target variable is not a record-ish type). There's been at least
two patches for this earlier (one my me, and one by, I think Pavel
Stehule). I urge you to dig around in the archives to avoid wasting your
time.

This issue can be detected by plpgsql_check and commitfest pipe is patch
that raise warning or error in this case.

I think the issue basically arises from this concern in exec_move_row:

* Row is a bit more complicated in that we assign the individual
* attributes of the tuple to the variables the row points to.
*
* NOTE: this code used to demand row->nfields ==
* HeapTupleHeaderGetNatts(tup->t_data), but that's wrong. The tuple
* might have more fields than we expected if it's from an
* inheritance-child table of the current table, or it might have fewer if
* the table has had columns added by ALTER TABLE. Ignore extra columns
* and assume NULL for missing columns, the same as heap_getattr would do.
* We also have to skip over dropped columns in either the source or
* destination.

As things stand today, we would have a hard time tightening that up
without producing unwanted complaints about the cases mentioned in
this comment, because the DTYPE_ROW logic is used for both "INTO a,b,c"
and composite-type variables. However, my pending patch at
https://commitfest.postgresql.org/17/1439/
gets rid of the use of DTYPE_ROW for composite types, and once that
is in it might well be reasonable to just throw a flat-out error for
wrong number of source values for a DTYPE_ROW target. I can't
immediately think of any good reason why you'd want to allow for
the number of INTO items not matching what the query produces.

regards, tom lane

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#8)
Re: Using scalar function as set-returning: bug or feature?

I wrote:

I think the issue basically arises from this concern in exec_move_row:

* NOTE: this code used to demand row->nfields ==
* HeapTupleHeaderGetNatts(tup->t_data), but that's wrong. The tuple
* might have more fields than we expected if it's from an
* inheritance-child table of the current table, or it might have fewer if
* the table has had columns added by ALTER TABLE.

BTW, this comment is very old, dating to commit 8bb3c8fe5, which was
in response to
/messages/by-id/200104300537.f3U5brK62902@hub.org

Looking at it again now, I wonder whether I didn't make the wrong choice
about how to fix the problem. The above concerns seem valid so far as the
physical number of fields in the tuple go, but they do not apply to the
tupdesc that comes with it. So arguably the real bug was using the
physical tuple rather than the tupdesc to drive the conversion, which is
something we got away from later anyway when we added dropped-column
support. Perhaps it'd be reasonable to throw an error if the number of
non-dropped columns in the source and destination tupdescs don't match,
even for the composite-type case.

The specific issue mentioned in the above bug report is gone anyway: if I
try that test case now, I don't see any tuples with extra columns arriving
at exec_move_row, because we insert a ConvertRowtypeExpr node to convert
the rows of the child table to the parent rowtype. But I'm not quite
prepared to assume that such cases can't arise through other examples.

Now, I do not think we can remove the facility for doing data type
conversion on the fields; undoubtedly there are people relying on being
able to SELECT INTO a numeric variable from an integer source column, for
example. So maybe being lax about the number of columns is a sensible
thing to go along with that. But it sure seems like a foot-gun.

regards, tom lane

#10Tels
nospam-pg-abuse@bloodgate.com
In reply to: Tom Lane (#8)
2 attachment(s)
Re: Using scalar function as set-returning: bug or feature?

Moin Tom,

On Mon, February 12, 2018 5:03 pm, Tom Lane wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2018-02-09 12:02 GMT+01:00 Marko Tiikkaja <marko@joh.to>:

This is quite short-sighted. The better way to do this is to complain
if
the number of expressions is different from the number of target
variables
(and the target variable is not a record-ish type). There's been at
least
two patches for this earlier (one my me, and one by, I think Pavel
Stehule). I urge you to dig around in the archives to avoid wasting
your
time.

[snip a bit]

As things stand today, we would have a hard time tightening that up
without producing unwanted complaints about the cases mentioned in
this comment, because the DTYPE_ROW logic is used for both "INTO a,b,c"
and composite-type variables. However, my pending patch at
https://commitfest.postgresql.org/17/1439/
gets rid of the use of DTYPE_ROW for composite types, and once that
is in it might well be reasonable to just throw a flat-out error for
wrong number of source values for a DTYPE_ROW target. I can't
immediately think of any good reason why you'd want to allow for
the number of INTO items not matching what the query produces.

Perl lets you set a fixed number of multiple variables from an array and
discard the rest like so:

my ($a, $b) = (1,2,3);

and the right hand side can also be a function:

my ($c, $d) = somefunc( 123 );

Where somefunc() returns more than 2 params.

This is quite handy if you sometimes need more ore less return values, but
don't want to create almost-identical functions for each case.

I'm not sure if you mean exactly the scenario as in the attached test
case, but this works in plpgsql, too, and would be a shame to lose.

OTOH, one could also write:

SELECT INTO ba, bb a,b FROM foo(1);

and it would still work, or wouldn't it?

Best regards,

Tels

Attachments:

test.psqlapplication/octet-stream; name=test.psqlDownload
test.plapplication/x-perl; name=test.plDownload
#11Marko Tiikkaja
marko@joh.to
In reply to: Tels (#10)
Re: Using scalar function as set-returning: bug or feature?

On Tue, Feb 13, 2018 at 12:30 PM, Tels <nospam-pg-abuse@bloodgate.com>
wrote:

I'm not sure if you mean exactly the scenario as in the attached test
case, but this works in plpgsql, too, and would be a shame to lose.

OTOH, one could also write:

SELECT INTO ba, bb a,b FROM foo(1);

and it would still work, or wouldn't it?

Yes. The code in test.psql shouldn't pass code review.

.m

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tels (#10)
Re: Using scalar function as set-returning: bug or feature?

"Tels" <nospam-pg-abuse@bloodgate.com> writes:

On Mon, February 12, 2018 5:03 pm, Tom Lane wrote:

... However, my pending patch at
https://commitfest.postgresql.org/17/1439/
gets rid of the use of DTYPE_ROW for composite types, and once that
is in it might well be reasonable to just throw a flat-out error for
wrong number of source values for a DTYPE_ROW target. I can't
immediately think of any good reason why you'd want to allow for
the number of INTO items not matching what the query produces.

Perl lets you set a fixed number of multiple variables from an array and
discard the rest like so:
my ($a, $b) = (1,2,3);
I'm not sure if you mean exactly the scenario as in the attached test
case, but this works in plpgsql, too, and would be a shame to lose.

Well, that's exactly the issue. Whether that's a handy feature or
a foot-gun that hides bugs depends entirely on your point of view.
Personally, this is not the kind of behavior I really want from a
programming language ;-). And I'm sure that if plpgsql were still
enforcing the old rules, and someone came along with a proposal to
relax that to be more like Perl, it'd be laughed down.

Still, backwards compatibility is worth something. I don't really
have a strong opinion about whether to change it or not.

regards, tom lane