References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

Started by Erki Eessaarover 4 years ago6 messagesbugs
Jump to latest
#1Erki Eessaar
erki.eessaar@taltech.ee

Hello

PostgreSQL 14 added the feature: "Allow SQL-language functions<https://www.postgresql.org/docs/14/sql-createfunction.html&gt; and procedures<https://www.postgresql.org/docs/14/sql-createprocedure.html&gt; to use SQL-standard function bodies."

If the routine contains INSERT INTO ... SELECT <parameter value> ... statement, then \sf command in psql and pg_get_functiondef function return a CREATE statement where in the SELECT statement the references to the parameters by name have been replaced with positional references.

An example.

CREATE TABLE Person (person_id INTEGER,
e_mail VARCHAR(254) NOT NULL,
last_action TIMESTAMP,
CONSTRAINT pk_person PRIMARY KEY (person_id),
CONSTRAINT ak_person UNIQUE (e_mail));

CREATE TABLE Product (product_code INTEGER,
registrator_id INTEGER NOT NULL,
price NUMERIC(19,4) NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_person FOREIGN KEY (registrator_id) REFERENCES Person(person_id));

CREATE OR REPLACE FUNCTION f_reg_product (p_product_code Product.product_code%TYPE, p_price Product.price%TYPE, p_e_mail Person.e_mail%TYPE)
RETURNS VOID
LANGUAGE SQL SECURITY DEFINER
SET search_path=public, pg_temp
BEGIN ATOMIC
INSERT INTO Product (product_code, price, registrator_id)
SELECT p_product_code, p_price, person_id
FROM Person
WHERE e_mail=p_e_mail;
UPDATE Person SET last_action=LOCALTIMESTAMP(0) WHERE e_mail=p_e_mail;
END;

SELECT pg_get_functiondef(oid) AS func_def
FROM pg_proc
WHERE proname='f_reg_product';

The result.

CREATE OR REPLACE FUNCTION public.f_reg_product(p_product_code integer, p_price numeric, p_e_mail character varying)
RETURNS void
LANGUAGE sql
SECURITY DEFINER
SET search_path TO 'public', 'pg_temp'
BEGIN ATOMIC
INSERT INTO product (product_code, price, registrator_id) SELECT $1 AS p_product_code,
$2 AS p_price,
person_id
FROM person
WHERE ((e_mail)::text = ($3)::text);
UPDATE person SET last_action = LOCALTIMESTAMP(0)
WHERE ((person.e_mail)::text = (f_reg_product.p_e_mail)::text);
END

As you can see, the issue does not affect the UPDATE statement.

Best regards
Erki Eessaar

#2Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Erki Eessaar (#1)
Re: References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

On Fri, Nov 12, 2021 at 4:46 AM Erki Eessaar <erki.eessaar@taltech.ee> wrote:

Hello

PostgreSQL 14 added the feature: "Allow SQL-language functions and procedures to use SQL-standard function bodies."

If the routine contains INSERT INTO ... SELECT <parameter value> ... statement, then \sf command in psql and pg_get_functiondef function return a CREATE statement where in the SELECT statement the references to the parameters by name have been replaced with positional references.

An example.

CREATE TABLE Person (person_id INTEGER,
e_mail VARCHAR(254) NOT NULL,
last_action TIMESTAMP,
CONSTRAINT pk_person PRIMARY KEY (person_id),
CONSTRAINT ak_person UNIQUE (e_mail));

CREATE TABLE Product (product_code INTEGER,
registrator_id INTEGER NOT NULL,
price NUMERIC(19,4) NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_person FOREIGN KEY (registrator_id) REFERENCES Person(person_id));

CREATE OR REPLACE FUNCTION f_reg_product (p_product_code Product.product_code%TYPE, p_price Product.price%TYPE, p_e_mail Person.e_mail%TYPE)
RETURNS VOID
LANGUAGE SQL SECURITY DEFINER
SET search_path=public, pg_temp
BEGIN ATOMIC
INSERT INTO Product (product_code, price, registrator_id)
SELECT p_product_code, p_price, person_id
FROM Person
WHERE e_mail=p_e_mail;
UPDATE Person SET last_action=LOCALTIMESTAMP(0) WHERE e_mail=p_e_mail;
END;

SELECT pg_get_functiondef(oid) AS func_def
FROM pg_proc
WHERE proname='f_reg_product';

The result.

CREATE OR REPLACE FUNCTION public.f_reg_product(p_product_code integer, p_price numeric, p_e_mail character varying)
RETURNS void
LANGUAGE sql
SECURITY DEFINER
SET search_path TO 'public', 'pg_temp'
BEGIN ATOMIC
INSERT INTO product (product_code, price, registrator_id) SELECT $1 AS p_product_code,
$2 AS p_price,
person_id
FROM person
WHERE ((e_mail)::text = ($3)::text);
UPDATE person SET last_action = LOCALTIMESTAMP(0)
WHERE ((person.e_mail)::text = (f_reg_product.p_e_mail)::text);
END

As you can see, the issue does not affect the UPDATE statement.

As you mentioned, p_e_mail in the UPDATE statement is not replaced
with a positional reference. But 'f_reg_product.p_e_mail' in the
UPDATE statement seems to correctly refer to the function argument
'p_e_mail'. Does the execution of the function produced by
pg_get_functiondef() produce a different result from the original's
one?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#3Erki Eessaar
erki.eessaar@taltech.ee
In reply to: Masahiko Sawada (#2)
Re: References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

Hello

Indeed, re-execution of this code without any modifications in it produces the same result.

Still I see here two problems.

* Inconsistency - see INSERT vs. UPDATE. In general, the function and \sf produce the result that references parameters by name if the original had the same type of references. INSERT INTO ... SELECT <parameter value> is an exception in this sense. Please, see also the example at the end of the letter.
* The problem of positional references in general is that changing the order of parameters requires changing the body as well, i.e., the result lost some of its qualities. Thus, one could argue that in this case the input and output are not equivalent.

Best regards
Erki Eessaar

********************

DROP TABLE IF EXISTS Product CASCADE;
DROP TABLE IF EXISTS Person CASCADE;

CREATE TABLE Product (product_code INTEGER,
price NUMERIC(19,4) NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code));

DROP FUNCTION IF EXISTS f_reg_product;

CREATE OR REPLACE FUNCTION f_reg_product (p_product_code Product.product_code%TYPE, p_price Product.price%TYPE)
RETURNS VOID
LANGUAGE SQL SECURITY DEFINER
SET search_path=public, pg_temp
BEGIN ATOMIC
INSERT INTO Product (product_code, price)
VALUES (p_product_code, p_price);
END;

SELECT pg_get_functiondef(oid) AS func_def
FROM pg_proc
WHERE proname='f_reg_product';

CREATE OR REPLACE FUNCTION public.f_reg_product(p_product_code integer, p_price numeric)
RETURNS void
LANGUAGE sql
SECURITY DEFINER
SET search_path TO 'public', 'pg_temp'
BEGIN ATOMIC
INSERT INTO product (product_code, price)
VALUES (f_reg_product.p_product_code, f_reg_product.p_price);
END

CREATE OR REPLACE FUNCTION f_reg_product (p_product_code Product.product_code%TYPE, p_price Product.price%TYPE)
RETURNS VOID
LANGUAGE SQL SECURITY DEFINER
SET search_path=public, pg_temp
BEGIN ATOMIC
INSERT INTO Product (product_code, price)
SELECT p_product_code, p_price;
END;

With a SELECT in the original function there is now positional reference instead of referring by name.

CREATE OR REPLACE FUNCTION public.f_reg_product(p_product_code integer, p_price numeric)
RETURNS void
LANGUAGE sql
SECURITY DEFINER
SET search_path TO 'public', 'pg_temp'
BEGIN ATOMIC
INSERT INTO product (product_code, price) SELECT $1 AS p_product_code,
$2 AS p_price;
END
________________________________
From: Masahiko Sawada <sawada.mshk@gmail.com>
Sent: Tuesday, November 16, 2021 8:52 AM
To: Erki Eessaar <erki.eessaar@taltech.ee>
Cc: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: Re: References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

On Fri, Nov 12, 2021 at 4:46 AM Erki Eessaar <erki.eessaar@taltech.ee> wrote:

Hello

PostgreSQL 14 added the feature: "Allow SQL-language functions and procedures to use SQL-standard function bodies."

If the routine contains INSERT INTO ... SELECT <parameter value> ... statement, then \sf command in psql and pg_get_functiondef function return a CREATE statement where in the SELECT statement the references to the parameters by name have been replaced with positional references.

An example.

CREATE TABLE Person (person_id INTEGER,
e_mail VARCHAR(254) NOT NULL,
last_action TIMESTAMP,
CONSTRAINT pk_person PRIMARY KEY (person_id),
CONSTRAINT ak_person UNIQUE (e_mail));

CREATE TABLE Product (product_code INTEGER,
registrator_id INTEGER NOT NULL,
price NUMERIC(19,4) NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_person FOREIGN KEY (registrator_id) REFERENCES Person(person_id));

CREATE OR REPLACE FUNCTION f_reg_product (p_product_code Product.product_code%TYPE, p_price Product.price%TYPE, p_e_mail Person.e_mail%TYPE)
RETURNS VOID
LANGUAGE SQL SECURITY DEFINER
SET search_path=public, pg_temp
BEGIN ATOMIC
INSERT INTO Product (product_code, price, registrator_id)
SELECT p_product_code, p_price, person_id
FROM Person
WHERE e_mail=p_e_mail;
UPDATE Person SET last_action=LOCALTIMESTAMP(0) WHERE e_mail=p_e_mail;
END;

SELECT pg_get_functiondef(oid) AS func_def
FROM pg_proc
WHERE proname='f_reg_product';

The result.

CREATE OR REPLACE FUNCTION public.f_reg_product(p_product_code integer, p_price numeric, p_e_mail character varying)
RETURNS void
LANGUAGE sql
SECURITY DEFINER
SET search_path TO 'public', 'pg_temp'
BEGIN ATOMIC
INSERT INTO product (product_code, price, registrator_id) SELECT $1 AS p_product_code,
$2 AS p_price,
person_id
FROM person
WHERE ((e_mail)::text = ($3)::text);
UPDATE person SET last_action = LOCALTIMESTAMP(0)
WHERE ((person.e_mail)::text = (f_reg_product.p_e_mail)::text);
END

As you can see, the issue does not affect the UPDATE statement.

As you mentioned, p_e_mail in the UPDATE statement is not replaced
with a positional reference. But 'f_reg_product.p_e_mail' in the
UPDATE statement seems to correctly refer to the function argument
'p_e_mail'. Does the execution of the function produced by
pg_get_functiondef() produce a different result from the original's
one?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erki Eessaar (#3)
Re: References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

Erki Eessaar <erki.eessaar@taltech.ee> writes:

Indeed, re-execution of this code without any modifications in it produces the same result.

Right, that printout is functionally equivalent to the original.

Still I see here two problems.
* Inconsistency - see INSERT vs. UPDATE.

Yeah, it's weird that the same parameter is printed two different ways.
I dug into it and found out that we're losing the "context->namespace"
list when recursing into the sub-SELECT from get_insert_query_def.
The fix is trivial (attached). The other places where get_query_def is
invoked quasi-recursively all pass down the parent namespace list already.
The fact that this one is out of step is a very ancient oversight (it's
at least old enough to vote, according to some quick git archaeology).
But as far as I can see, it didn't have any visible consequences until
commit e717a9a18 taught get_parameter() to pay attention to the last
entry of the list. So I'm inclined not to change it before v14.

I was also pretty unhappy that get_parameter() was so naively trusting
that an EXTERN Param must have something to do with the last entry's
argnames array, or even that there must be a last entry. So the attached
makes that a little more paranoid, too.

* The problem of positional references in general is that changing the order of parameters requires changing the body as well, i.e., the result lost some of its qualities. Thus, one could argue that in this case the input and output are not equivalent.

I don't think this argument holds a lot of water, because you could
reverse it too: if the author had originally written $N, maybe there
was a reason why she thought that would be more maintainable than a named
reference. But we don't store any distinction between the two ways of
writing a Param reference, so we can't promise to regenerate it exactly
the way it was written.

Still, it's clear that the intent here was to print a reference-by-name
if possible, and the failure to do so within only the context of
INSERT/SELECT is therefore a bug.

regards, tom lane

Attachments:

fix-sql-function-param-decompile.patchtext/x-diff; charset=us-ascii; name=fix-sql-function-param-decompile.patchDownload+6-4
#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#4)
Re: References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

On Wed, Nov 17, 2021 at 6:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Erki Eessaar <erki.eessaar@taltech.ee> writes:

Indeed, re-execution of this code without any modifications in it produces the same result.

Right, that printout is functionally equivalent to the original.

Still I see here two problems.
* Inconsistency - see INSERT vs. UPDATE.

Yeah, it's weird that the same parameter is printed two different ways.
I dug into it and found out that we're losing the "context->namespace"
list when recursing into the sub-SELECT from get_insert_query_def.
The fix is trivial (attached). The other places where get_query_def is
invoked quasi-recursively all pass down the parent namespace list already.
The fact that this one is out of step is a very ancient oversight (it's
at least old enough to vote, according to some quick git archaeology).
But as far as I can see, it didn't have any visible consequences until
commit e717a9a18 taught get_parameter() to pay attention to the last
entry of the list. So I'm inclined not to change it before v14.

Agreed. I've confirmed by the attached test that the patch fixes this issue.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachments:

regression_tests.patchapplication/octet-stream; name=regression_tests.patchDownload+36-0
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Masahiko Sawada (#5)
Re: References to parameters by name are lost in INSERT INTO ... SELECT <parameter value> .... statements in case of routines with the SQL-standard function body

Masahiko Sawada <sawada.mshk@gmail.com> writes:

On Wed, Nov 17, 2021 at 6:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, it's weird that the same parameter is printed two different ways.
I dug into it and found out that we're losing the "context->namespace"
list when recursing into the sub-SELECT from get_insert_query_def.
The fix is trivial (attached). The other places where get_query_def is
invoked quasi-recursively all pass down the parent namespace list already.
The fact that this one is out of step is a very ancient oversight (it's
at least old enough to vote, according to some quick git archaeology).
But as far as I can see, it didn't have any visible consequences until
commit e717a9a18 taught get_parameter() to pay attention to the last
entry of the list. So I'm inclined not to change it before v14.

Agreed. I've confirmed by the attached test that the patch fixes this issue.

Pushed, thanks for writing the test case.

regards, tom lane