BUG #17360: array_to_string should be immutable instead of stable

Started by PG Bug reporting formover 4 years ago7 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17360
Logged by: Gergely Czuczy
Email address: gergely.czuczy@harmless.hu
PostgreSQL version: 13.3
Operating system: FreeBSD13
Description:

Hello,

The array_to_string function should have a volatility of immutable, it
perfectly fits the requirements in the documentation.

Where this comes to play is generated columns, when one wishes to have a
field something like GENERATED ALWAYS AS (array_to_string(ARRAY[selector,
'_domainkey', subdomain], '.')) STORED.

Reading the documentation I haven't found anything that would prevent the
function from being immutable, and enable this kind of usecases.

Best regards,
Gergely

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #17360: array_to_string should be immutable instead of stable

PG Bug reporting form <noreply@postgresql.org> writes:

The array_to_string function should have a volatility of immutable,

Nope. It invokes an arbitrary datatype I/O function,
which might only be stable. As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');
array_to_string
-------------------------------
2022-01-10 09:47:06.605304-05
(1 row)

regression=*# set timezone to UTC;
SET
regression=*# select array_to_string(array[now()], ',');
array_to_string
-------------------------------
2022-01-10 14:47:06.605304+00
(1 row)

Actually, in principle somebody could make a datatype
whose output function is volatile. It's not clear
what the use-case is, though, so we've established a
project policy that code that invokes some arbitrary
I/O function may assume that function is at least stable.
Thus, array_to_string() is marked stable, and likewise
for some other polymorphic functions such as format().

Ideally, perhaps, we'd determine the volatility level of
polymorphic functions like array_to_string() based on the
actual input data types --- but it's hard to see how to do
that.

regards, tom lane

#3Gergely Czuczy
gergely.czuczy@harmless.hu
In reply to: Tom Lane (#2)
Re: BUG #17360: array_to_string should be immutable instead of stable

Hello,

Thanks for explaining Tom, I wasn't aware these things on the internals.

Wouldn't it possible to add polymorphs for immutable types, like
func(text[]), func(int[]), etc, mark those immutable, and keep the
generic for all the other not-explicitly-dealt-with types as stable?
Again, I don't really know the implementation details here, so it's just
my tuppence.

Thanks again for explaining it.

Regards,
Gergely

Show quoted text

On 2022. 01. 10. 15:54, Tom Lane wrote:

PG Bug reporting form <noreply@postgresql.org> writes:

The array_to_string function should have a volatility of immutable,

Nope. It invokes an arbitrary datatype I/O function,
which might only be stable. As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');
array_to_string
-------------------------------
2022-01-10 09:47:06.605304-05
(1 row)

regression=*# set timezone to UTC;
SET
regression=*# select array_to_string(array[now()], ',');
array_to_string
-------------------------------
2022-01-10 14:47:06.605304+00
(1 row)

Actually, in principle somebody could make a datatype
whose output function is volatile. It's not clear
what the use-case is, though, so we've established a
project policy that code that invokes some arbitrary
I/O function may assume that function is at least stable.
Thus, array_to_string() is marked stable, and likewise
for some other polymorphic functions such as format().

Ideally, perhaps, we'd determine the volatility level of
polymorphic functions like array_to_string() based on the
actual input data types --- but it's hard to see how to do
that.

regards, tom lane

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #17360: array_to_string should be immutable instead of stable

On Mon, Jan 10, 2022 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

PG Bug reporting form <noreply@postgresql.org> writes:

The array_to_string function should have a volatility of immutable,

Nope. It invokes an arbitrary datatype I/O function,
which might only be stable. As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');

That feels wrong. It's not like we are passing the "now()" function to the
function and invoking it later. So far as array_to_string is concerned it
is being given a literal value.

By your argument, isfinite(timestamp) should be stable but it is immutable
(it's the first example I stumbled across, I suspect there are quite a few
more I could contrive).

select isfinite(now());

array_to_string
-------------------------------
2022-01-10 09:47:06.605304-05
(1 row)

regression=*# set timezone to UTC;
SET
regression=*# select array_to_string(array[now()], ',');
array_to_string
-------------------------------
2022-01-10 14:47:06.605304+00
(1 row)

[...]

Ideally, perhaps, we'd determine the volatility level of
polymorphic functions like array_to_string() based on the
actual input data types --- but it's hard to see how to do
that.

In short, that doesn't make sense. The volatility level of a function is
only determined by the implementation code of said function. The function
invoking expression volatility level depends upon the most volatile
behavior of all functions used in the expression. That we should be doing
if we aren't already.

David J.

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: David G. Johnston (#4)
Re: BUG #17360: array_to_string should be immutable instead of stable

On Mon, Jan 10, 2022, 08:54 David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Mon, Jan 10, 2022 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

PG Bug reporting form <noreply@postgresql.org> writes:

The array_to_string function should have a volatility of immutable,

Nope. It invokes an arbitrary datatype I/O function,
which might only be stable. As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');

That feels wrong. It's not like we are passing the "now()" function to
the function and invoking it later. So far as array_to_string is concerned
it is being given a literal value.

Nevermind. It's the internal timestampt to text cast I'm forgetting.

David k

Show quoted text
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#4)
Re: BUG #17360: array_to_string should be immutable instead of stable

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Mon, Jan 10, 2022 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Nope. It invokes an arbitrary datatype I/O function,
which might only be stable. As an example:

That feels wrong. It's not like we are passing the "now()" function to the
function and invoking it later. So far as array_to_string is concerned it
is being given a literal value.

now() is not the problem; it's the datatype output function that's
the problem. I was perhaps being too thrifty with keystrokes in
my example, so here's another one:

regression=# show timezone;
TimeZone
------------------
America/New_York
(1 row)

regression=# select '{2022-01-10 00:00-05}'::timestamptz[];
timestamptz
----------------------------
{"2022-01-10 00:00:00-05"}
(1 row)

regression=# select array_to_string('{2022-01-10 00:00-05}'::timestamptz[], ',');
array_to_string
------------------------
2022-01-10 00:00:00-05
(1 row)

regression=# set timezone = UTC;
SET
regression=# select '{2022-01-10 00:00-05}'::timestamptz[];
timestamptz
----------------------------
{"2022-01-10 05:00:00+00"}
(1 row)

regression=# select array_to_string('{2022-01-10 00:00-05}'::timestamptz[], ',');
array_to_string
------------------------
2022-01-10 05:00:00+00
(1 row)

Now do you see the issue? The input datum is identical in all four
queries, but the resulting strings are not, so these functions
cannot be considered immutable.

In short, that doesn't make sense. The volatility level of a function is
only determined by the implementation code of said function.

array_to_string is invoking timestamptz_out along the way to
creating its result. Although array_to_string's own behavior
is immutable, timestamptz_out's is not.

The function
invoking expression volatility level depends upon the most volatile
behavior of all functions used in the expression. That we should be doing
if we aren't already.

The core of the difficulty is that although timestamptz_out
is getting called, that's nowhere visible in the parse tree.
I suppose we could decide that it's illegal to allow
array_to_string() or format() to exist, but I don't think
anybody will like that answer.

I did just have a thought about this though --- now that we've
invented planner support functions [1]https://www.postgresql.org/docs/devel/xfunc-optimization.html, maybe we could define
a support function request that is "tell me the true volatility
of this function call". Then array_to_string() could have a
support function that looks at the output function for its
input array's element type, and format()'s could determine the
most volatile of the output functions of any of its inputs, etc.

regards, tom lane

[1]: https://www.postgresql.org/docs/devel/xfunc-optimization.html

#7David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#6)
Re: BUG #17360: array_to_string should be immutable instead of stable

On Mon, Jan 10, 2022 at 9:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The core of the difficulty is that although timestamptz_out
is getting called, that's nowhere visible in the parse tree.
I suppose we could decide that it's illegal to allow
array_to_string() or format() to exist, but I don't think
anybody will like that answer.

Yeah. The user expectation (and arguably architectural ideal) that the
output representation of data types is immutable being violated is the
bigger issue here but one that we are basically stuck with. As you note
below, however, since real complaints mostly manifest in polymorphic
situations, coming up with a solution at that scope, one that makes
function volatility a planning time property instead of a create function
time one, makes the most sense. Since the function is being defined for
planning time type resolution, extending that to volatility is internally
consistent.

David J.