plpgsql variable assignment with union is broken

Started by Nonameabout 5 years ago10 messages
#1Noname
easteregg@verfriemelt.org

hi,

i updated our ci pipeline to the latest 14-devel build from the postgresql apt repository:

PostgreSQL 14devel (Debian 14~~devel~20210105.1140-1~285.gitbc43b7c.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-3) 10.2.1 20201224, 64-bit

i found, that the behaviour of variable assignment in combination with union is not working anymore:

DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;
END $$;

before it worked with pg13 and 14-devel with this build:

PostgreSQL 14devel (Debian 14~~devel~20201126.0540-1~210.gitf3a8f73.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.0-16) 10.2.0, 64-bit

is this an intended change or is it a bug?

with kind regards,
richard

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: plpgsql variable assignment with union is broken

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination with union is not working anymore:
DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;
END $$;

is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with more
than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

regards, tom lane

#3Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#2)
Re: plpgsql variable assignment with union is broken

On Tue, Jan 5, 2021 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination with union is not working anymore:
DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;
END $$;

is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with more
than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

What is the definition of broken here? What is the behavior of the
query with the change and why?

OP's query provably returns a single row and ought to always assign
true as written. A real world example might evaluate multiple
condition branches so that the assignment resolves true if any branch
is true. It could be rewritten with 'OR' of course.

Is this also "broken"?
t := a FROM ( SELECT 'something' WHERE _Flag) t(a) UNION SELECT
'something else' AS a WHERE NOT _Flag;

What about this?
SELECT INTO t true WHERE false
UNION select true;

merlin

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#3)
Re: plpgsql variable assignment with union is broken

Merlin Moncure <mmoncure@gmail.com> writes:

On Tue, Jan 5, 2021 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination with union is not working anymore:
DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;
END $$;
is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with more
than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

What is the definition of broken here? What is the behavior of the
query with the change and why?

The OP is complaining that that gets a syntax error since c9d529848.

OP's query provably returns a single row and ought to always assign
true as written.

My opinion is that (a) it's useless and (b) there has never been any
documentation that claimed that you could do this.

As for (a), given the execution restriction that you can't return more
than one row, I can't think of anything you could do with this that
couldn't be done, both more clearly and far more cheaply, with a CASE
or similar construct.

As for (b), the docs have always given the syntax of plpgsql assignment as
"variable := expression"; whatever you might think an "expression" is,
I dispute that a syntactically-invalid fragment of a UNION query
qualifies. The fact that it was accepted is a completely accidental
artifact of the lax way that plpgsql assignment has been parsed up to now.

(Having said that, I would've preserved it if I could, but it's exactly
the fact that it *isn't* a syntactically valid UNION construct that
makes it hard to do so. If it's possible at all, it would take some
really ugly refactoring of the grammar.)

One last point is that if you're really intent on writing things this
way, you can still do it with SELECT INTO instead of assignment syntax.

In case you're wondering, INTERSECT and EXCEPT are in the same boat.

regards, tom lane

#5Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#4)
Re: plpgsql variable assignment with union is broken

On Wed, Jan 6, 2021 at 9:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

On Tue, Jan 5, 2021 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination with union is not working anymore:
DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;
END $$;
is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with more
than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

What is the definition of broken here? What is the behavior of the
query with the change and why?

The OP is complaining that that gets a syntax error since c9d529848.

OP's query provably returns a single row and ought to always assign
true as written.

My opinion is that (a) it's useless and (b) there has never been any
documentation that claimed that you could do this.

Here is what the documentation says:

variable { := | = } expression;
As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.

This is valid SQL:
SELECT a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;

So I'd argue that OP's query *is* syntactically valid per the rules as
I understand them.

and is my opinion entirely consistent with the documentation in that it
a) resolves exactly one row, and:
b) is made syntactically valid by prefixing the expression with SELECT.

Aesthetical considerations are irrelevant IMO.

merlin

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#3)
Re: plpgsql variable assignment with union is broken

čt 7. 1. 2021 v 4:20 odesílatel Merlin Moncure <mmoncure@gmail.com> napsal:

On Tue, Jan 5, 2021 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination with

union is not working anymore:

DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true

AS a;

END $$;

is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with more
than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

What is the definition of broken here? What is the behavior of the
query with the change and why?

OP's query provably returns a single row and ought to always assign
true as written. A real world example might evaluate multiple
condition branches so that the assignment resolves true if any branch
is true. It could be rewritten with 'OR' of course.

Is this also "broken"?
t := a FROM ( SELECT 'something' WHERE _Flag) t(a) UNION SELECT
'something else' AS a WHERE NOT _Flag;

What about this?
SELECT INTO t true WHERE false
UNION select true;

ANSI SQL allows only SELECT INTO or var := SQL expression and SQL
expression can be (subquery) too

do $$
declare t bool;
begin
t := (SELECT true WHERE false UNION SELECT true );
end;
$$;

Regards

Pavel

Show quoted text

merlin

#7Noname
easteregg@verfriemelt.org
In reply to: Pavel Stehule (#6)
Re: plpgsql variable assignment with union is broken

to be clear, that was existing code within our codebase, it was used as a very bad alternative to an if/else clause. think of it as an tenery operator. thats just straight up bad code as tom said. but even if it is, its a change and i wanted to check, if its intended. so thank you for you time and consideration!

#8Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#6)
Re: plpgsql variable assignment with union is broken

On Wed, Jan 6, 2021 at 11:07 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

čt 7. 1. 2021 v 4:20 odesílatel Merlin Moncure <mmoncure@gmail.com> napsal:

On Tue, Jan 5, 2021 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination with union is not working anymore:
DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT true AS a;
END $$;

is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with more
than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

What is the definition of broken here? What is the behavior of the
query with the change and why?

OP's query provably returns a single row and ought to always assign
true as written. A real world example might evaluate multiple
condition branches so that the assignment resolves true if any branch
is true. It could be rewritten with 'OR' of course.

Is this also "broken"?
t := a FROM ( SELECT 'something' WHERE _Flag) t(a) UNION SELECT
'something else' AS a WHERE NOT _Flag;

What about this?
SELECT INTO t true WHERE false
UNION select true;

ANSI SQL allows only SELECT INTO or var := SQL expression and SQL expression can be (subquery) too

This is PLPGSQL not ansi SQL so that's irrelevant. If queries along
the lines of:
var := FROM (SELECT ..) UNION ..

are narrowly broken, ok, but is that all that's going on here? I
guess I ought to test.

I have a 300k line pl/pgsql project, this thread is terrifying me. I
am going to be blunt here and say I am not comfortable with tightening
pl/pgsql syntax without an impact assessment, The point that this is
undocumanted behavior is weak, and it's already turning up problem
reports. IMO, expectation has been clearly set that
var := expression;

is more or less interchangeable with
SELECT INTO var expression;

Again, if this is narrowly confined to assignment into set query
operations, maybe this is not so bad. But is it?

merlin

#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#8)
Re: plpgsql variable assignment with union is broken

čt 7. 1. 2021 v 17:29 odesílatel Merlin Moncure <mmoncure@gmail.com> napsal:

On Wed, Jan 6, 2021 at 11:07 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

čt 7. 1. 2021 v 4:20 odesílatel Merlin Moncure <mmoncure@gmail.com>

napsal:

On Tue, Jan 5, 2021 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

easteregg@verfriemelt.org writes:

i found, that the behaviour of variable assignment in combination

with union is not working anymore:

DO $$
DECLARE t bool;
begin
t := a FROM ( SELECT true WHERE false ) t(a) UNION SELECT

true AS a;

END $$;

is this an intended change or is it a bug?

It's an intended change, or at least I considered the case and thought
that it was useless because assignment will reject any result with

more

than one row. Do you have any non-toy example that wouldn't be as
clear or clearer without using UNION? The above sure seems like an
example of awful SQL code.

What is the definition of broken here? What is the behavior of the
query with the change and why?

OP's query provably returns a single row and ought to always assign
true as written. A real world example might evaluate multiple
condition branches so that the assignment resolves true if any branch
is true. It could be rewritten with 'OR' of course.

Is this also "broken"?
t := a FROM ( SELECT 'something' WHERE _Flag) t(a) UNION SELECT
'something else' AS a WHERE NOT _Flag;

What about this?
SELECT INTO t true WHERE false
UNION select true;

ANSI SQL allows only SELECT INTO or var := SQL expression and SQL

expression can be (subquery) too

This is PLPGSQL not ansi SQL so that's irrelevant. If queries along
the lines of:
var := FROM (SELECT ..) UNION ..

are narrowly broken, ok, but is that all that's going on here? I
guess I ought to test.

I have a 300k line pl/pgsql project, this thread is terrifying me. I
am going to be blunt here and say I am not comfortable with tightening
pl/pgsql syntax without an impact assessment, The point that this is
undocumanted behavior is weak, and it's already turning up problem
reports. IMO, expectation has been clearly set that
var := expression;

is more or less interchangeable with
SELECT INTO var expression;

Again, if this is narrowly confined to assignment into set query
operations, maybe this is not so bad. But is it?

PLpgSQL_Expr: opt_target_list
<--><--><-->from_clause where_clause
<--><--><-->group_clause having_clause window_clause
<--><--><-->opt_sort_clause opt_select_limit opt_for_locking_clause
<--><--><--><-->{

So SELECT INTO and assignment are not fully compatible now.

Regards

Pavel

Show quoted text

merlin

#10Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#9)
Re: plpgsql variable assignment with union is broken

On Thu, Jan 7, 2021 at 10:55 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:

Again, if this is narrowly confined to assignment into set query
operations, maybe this is not so bad. But is it?

PLpgSQL_Expr: opt_target_list
<--><--><-->from_clause where_clause
<--><--><-->group_clause having_clause window_clause
<--><--><-->opt_sort_clause opt_select_limit opt_for_locking_clause
<--><--><--><-->{

So SELECT INTO and assignment are not fully compatible now.

OK. Well, I went ahead and checked the code base for any suspicious
assignments that might fall out of the tightened syntax. It was
cursory, but didn't turn up anything obvious. So I'm going to change
my position on this.

My feedback would be to take backwards compatibility very seriously
relating to language changes in the future, and to rely less on
documentation arguments as it relates to change justification. The
current behavior has been in place for decades and is therefore a de
facto standard. Change freedom ought to be asserted in scenarios
where behavior is reserved as undefined or is non standard rather than
assumed. Rereading the development thread, it seems a fairly short
timeframe between idea origination and commit, and hypothetical
impacts to existing code bases was not rigorously assessed. I guess
it's possible this may ultimately cause some heartburn but it's hard
to say without strong data to justify that position.

Having said all of that, I'm very excited about the array assignment
improvements and investment in this space is very much appreciated. .

merlin