Generated column is not updated (Postgres 13)
Hello,
I would like to report the following:
- a generated column is never updated if you pass the whole record to a
stored procedure (an immutable function);
- however, it works fine if you pass individual columns to the function;
- both options work fine on Postgres 12.
I consider it a backward compatibility issue.
Please correct me if I am wrong, but I was not able to find anything
related to this in the release notes for 13.x.
The background of this issue:
During the upgrade from 12.5 to 13.3 with pg_upgrade I had to drop a
generated column and some other objects. After the upgrade had completed
successfully, I needed to restore everything. When I was re-adding the
generated column, all of a sudden I got a "Segmentation fault". I retried a
few times with slightly different variants of the code, but each time I got
the same result - the server crashed. This code has been in production for
many months and never caused any issues.
2021-05-18 19:30:46 GMT [235415-9] LOG: server process (PID 235429) was
terminated by signal 11: Segmentation fault
2021-05-18 19:30:46 GMT [235415-10] DETAIL: Failed process was running:
ALTER TABLE ordering.requests_3pp ADD unique_hash bytea GENERATED ALWAYS AS
(fn_ordering.calc_3pp_req_hash(requests_3pp.*)) STORED NOT NULL;
2021-05-18 19:30:46 GMT [235415-11] LOG: terminating any other active
server processes
2021-05-18 19:30:46 GMT [235422-1] WARNING: terminating connection because
of crash of another server process
2021-05-18 19:30:46 GMT [235422-2] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2021-05-18 19:30:46 GMT [235422-3] HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2021-05-18 19:30:46 GMT [235415-12] LOG: archiver process (PID 235423)
exited with exit code 2
2021-05-18 19:30:46 GMT [235473-1] postgres@scas_acceptance FATAL: the
database system is in recovery mode
2021-05-18 19:30:46 GMT [235415-13] LOG: all server processes terminated;
reinitializing
2021-05-18 19:30:46 GMT [235474-1] LOG: database system was interrupted;
last known up at 2021-05-18 19:29:47 GMT
2021-05-18 19:30:46 GMT [235474-2] LOG: database system was not properly
shut down; automatic recovery in progress
2021-05-18 19:30:46 GMT [235474-3] LOG: redo starts at C5B/E70000A0
2021-05-18 19:30:46 GMT [235474-4] LOG: redo done at C5B/E706FF60
2021-05-18 19:30:46 GMT [235415-14] LOG: database system is ready to
accept connections
I started digging, simplifying something each time, and eventually I
managed to make it work. I created a simple procedure which returns
"id::text::bytea", then I added the new column, and then I recreated
the procedure with the original business logic. And that's the reason I am
reporting it: it turned out that with Postgres 13 *a generated
column is never updated if you pass the whole record to the stored
procedure*. Like I said, it works fine in production on Postgres 12.5. As a
workaround I will use a trigger calling the same function until it's fixed.
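Roughly, the workaround could look like this - just a sketch, with unique_hash
as a plain (non-generated) bytea column and a made-up trigger name; existing
rows would still need a backfill before adding the NOT NULL constraint:
create or replace function fn_ordering.set_3pp_req_hash() returns trigger
language plpgsql as
$$ begin
  -- NEW is the whole row here, so the same hash function can be reused as-is
  new.unique_hash := fn_ordering.calc_3pp_req_hash(new);
  return new;
end; $$;
create trigger trg_set_3pp_req_hash
  before insert or update on ordering.requests_3pp
  for each row execute function fn_ordering.set_3pp_req_hash();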
I've run my tests against 13.2 and 13.3 (see below). But actually I think
the best and easiest way of reproducing it would be to use the official
Docker images from Docker Hub. That way you can be absolutely sure it runs in
the exact same environment as mine, and that I don't use any special
configuration options etc. However, it's up to you. This is how you
can create the containers and start "psql" inside them:
*Postgres 13.3:*
$ docker pull postgres:13.3
$ docker run -d --name pg13 -e POSTGRES_PASSWORD=postgres postgres:13.3
$ docker exec -it -u postgres pg13 psql
*Postgres 12.7:*
$ docker pull postgres:12.7
$ docker run -d --name pg12 -e POSTGRES_PASSWORD=postgres postgres:12.7
$ docker exec -it -u postgres pg12 psql
And then you can run the following SQL test case:
-----------------------------------------------------------
create table t(a text, b int);
insert into t values ('A', 1), ('B', 2), ('C', 3);
create or replace function calc_gen_plain(text, int) returns text
language sql immutable as
$$ select $1||'-'||$2; $$;
create or replace function calc_gen_rec_sql(t) returns text
language sql immutable as
$$ select $1.a||'-'||$1.b; $$;
create or replace function calc_gen_rec_plpgsql(t) returns text
language plpgsql immutable as
$$ begin
raise notice '[plpgsql] a=%, b=%', $1.a, $1.b;
return $1.a||'-'||$1.b;
end; $$;
alter table t add gen_val text not null generated always as
(calc_gen_plain(a, b)) stored;
alter table t add gen_rec1 text not null generated always as
(calc_gen_rec_sql(t)) stored;
alter table t add gen_rec2 text not null generated always as
(calc_gen_rec_plpgsql(t)) stored;
select t.* from t;
update t set a = chr(ascii(a) + 3), b = b + 3;
select t.* from t;
-----------------------------------------------------------
As a result, after the UPDATE command I expect all generated columns to
contain the same value (within each row), and that's what I actually get
with Postgres 12:
a | b | gen_val | gen_rec1 | gen_rec2
---+---+---------+----------+----------
D | 4 | D-4 | D-4 | D-4
E | 5 | E-5 | E-5 | E-5
F | 6 | F-6 | F-6 | F-6
But with Postgres 13 the last two columns are not updated.
Moreover, the "raise notice" statement in the calc_gen_rec_plpgsql()
function is not executed for the "UPDATE" command, so I think the function
is simply not called at all.
a | b | gen_val | gen_rec1 | gen_rec2
---+---+---------+----------+----------
D | 4 | D-4 | A-1 | A-1
E | 5 | E-5 | B-2 | B-2
F | 6 | F-6 | C-3 | C-3
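By the way, the generation expressions are still attached to the columns even
though they are not recomputed; one way to confirm that is a catalog query
along these lines:
select a.attname, pg_get_expr(d.adbin, d.adrelid) as generation_expression
from pg_attribute a
join pg_attrdef d on d.adrelid = a.attrelid and d.adnum = a.attnum
where a.attrelid = 't'::regclass and a.attgenerated = 's';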
Please see the attachments for detailed output.
Also, below you will find some info about the environments where I got
the initial "Segmentation fault" issue.
Thank you!
*Postgres 13.2* installed from standard binary packages.
$ apt list --installed | grep postgres
postgresql-13/now 13.2-1.pgdg18.04+1 amd64 [installed,upgradable to:
13.3-1.pgdg18.04+1]
postgresql-13-cron/now 1.3.0-2.pgdg18.04+1 amd64 [installed,upgradable to:
1.3.1-1.pgdg18.04+1]
postgresql-13-mysql-fdw/bionic-pgdg,now 2.5.5-2.pgdg18.04+1 amd64
[installed]
postgresql-13-repack/bionic-pgdg,now 1.4.6-1.pgdg18.04+1 amd64 [installed]
postgresql-client-13/now 13.2-1.pgdg18.04+1 amd64 [installed,upgradable to:
13.3-1.pgdg18.04+1]
postgresql-client-common/now 225.pgdg18.04+1 all [installed,upgradable to:
226.pgdg18.04+1]
postgresql-common/now 225.pgdg18.04+1 all [installed,upgradable to:
226.pgdg18.04+1]
$ uname -a
Linux elxajw3dxz1 5.4.0-73-generic #82~18.04.1-Ubuntu SMP Fri Apr 16
15:10:02 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
*Postgres 13.3* installed from binary packages "13.3-2PGDG.rhel8"
posted on 18 May (
https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-8-x86_64/):
postgresql13-13.3-2PGDG.rhel8.x86_64.rpm
postgresql13-contrib-13.3-2PGDG.rhel8.x86_64.rpm
postgresql13-libs-13.3-2PGDG.rhel8.x86_64.rpm
postgresql13-server-13.3-2PGDG.rhel8.x86_64.rpm
$ uname -a
Linux seliiudb01107 4.18.0-193.19.1.el8_2.x86_64 #1 SMP Wed Aug 26 15:29:02
EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: RedHatEnterprise
Description: Red Hat Enterprise Linux release 8.2 (Ootpa)
Release: 8.2
Codename: Ootpa
Regards,
Vitaly Ustinov
Vitaly Ustinov <vitaly@ustinov.ca> writes:
I would like to report the following:
- a generated column is never updated if you pass the whole record to a
stored procedure (an immutable function);
TBH, I think that this is insane and needs to be forbidden. What value
are you expecting that the function would see in the column of the
whole-row var that corresponds to the generated column? It surely
cannot be passed the value that it hasn't computed yet.
You could perhaps argue that it'd be okay to pass NULL. The problem
with that, though, is that it'd violate the NOT NULL constraint that
exists for the generated column. Quite aside from any confusion that
ensues at the user level, I'm afraid that that could result in C code
crashes --- there are, for example, places in tuple deforming that
assume that NOT NULL constraints are truthful.
Anyway, I tried your test case on a debug build and observed assertion
failures, which I traced to ATRewriteTable not being careful enough
to zero out the whole tuple each time through. Conceivably we could
fix that as per the attached quick hack. However, I think we ought
to disallow the case instead. I observe that we already disallow
generated columns depending on each other:
regression=# create table foo (f1 int);
CREATE TABLE
regression=# alter table foo add f2 int generated always as (f1+1) stored;
ALTER TABLE
regression=# alter table foo add f3 int generated always as (f2+1) stored;
ERROR: cannot use generated column "f2" in column generation expression
DETAIL: A generated column cannot reference another generated column.
But a whole-row var violates this concept completely: it makes the
generated column depend, not only on every other column, but on itself
too. Also, even if you don't mind null-for-not-yet-computed-value,
that would expose the computation order of the generated columns.
regards, tom lane
Attachments:
hack-to-stop-crash-with-self-referential-generated-column.patch (text/x-diff)
On Wed, May 19, 2021 at 3:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vitaly Ustinov <vitaly@ustinov.ca> writes:
I would like to report the following:
- a generated column is never updated if you pass the whole record to a
stored procedure (an immutable function);
You could perhaps argue that it'd be okay to pass NULL. The problem
with that, though, is that it'd violate the NOT NULL constraint that
exists for the generated column. Quite aside from any confusion that
ensues at the user level, I'm afraid that that could result in C code
crashes --- there are, for example, places in tuple deforming that
assume that NOT NULL constraints are truthful.
Knowing why version 12 worked would be helpful in judging whether to live
with past decisions, fix poorly made previous decisions, or just move
forward with the new behavior in v14 and leave the past alone. I agree
that the better design choice would have us forbid this, though that seems
problematic in its own right. The function itself is defined correctly, and
aside from the fact that it is used in a generated expression, it is also
called correctly. I suppose the error would be of the form "cannot form a valid
whole-row var due to unspecified generated column data"?
FWIW I can definitely see where the OP is coming from with this - the
expression at first blush, if one assumes PostgreSQL can handle the
nuances, seems like a perfectly reasonable semantic way to express the
programmer's desire. Combined with the fact that it used to work, that makes
me want to lean toward keeping it working even if it takes some code hackery.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
FWIW I can definitely see where the OP is coming from with this - the
expression at first blush, if one assumes PostgreSQL can handle the
nuances, seems like a perfectly reasonable semantic way to express the
programmer's desire. Combined with the fact that it used to work, that makes
me want to lean toward keeping it working even if it takes some code hackery.
I dunno, I think it "used to work" only for exceedingly small values
of "work". For me, the given test case hits the same assertion failure
in all versions back to v12, which is unsurprising because the code in
ATRewriteTable is about the same.
Also, considering only the table-rewrite code path is a mistake.
The fundamental problem here is that the behavior is ill-defined and
necessarily implementation-dependent; which makes it likely that other
code paths behave inconsistently with ALTER TABLE ADD COLUMN.
regards, tom lane
I wrote:
... I think we ought
to disallow the case instead. I observe that we already disallow
generated columns depending on each other: ...
But a whole-row var violates this concept completely: it makes the
generated column depend, not only on every other column, but on itself
too. Also, even if you don't mind null-for-not-yet-computed-value,
that would expose the computation order of the generated columns.
After actually looking at the code involved, I'm even more on the
warpath. Not only is it failing to reject whole-row vars, but it's
failing to reject system columns. That is (a) infeasible to support,
given that we don't know the values of the system columns at the time
we compute generated expressions, and (b) just plain ludicrous in
expressions that are required to be immutable.
I see that there is actually a regression test case that believes
that "tableoid" should be allowed, but I think that is nonsense.
In the first place, it's impossible to claim that tableoid is an
immutable expression. Consider, say, "tableoid > 30000". Do you
think such a column is likely to survive dump-and-reload unchanged?
Also, while that example is artificial, I'm having a hard time
coming up with realistic immutable use-cases for generation
expressions involving tableoid.
In the second place, there are a bunch of implementation dependencies
that we'd have to fix if we want to consider that supported. I think
it's mostly accidental that the case seems to work in the mainline
INSERT code path. It's not hard to find cases where it does not work,
for example
regression=# create table foo (f1 int);
CREATE TABLE
regression=# insert into foo values (1);
INSERT 0 1
regression=# alter table foo add column f2 oid GENERATED ALWAYS AS (tableoid) STORED;
ALTER TABLE
regression=# table foo;
f1 | f2
----+----
1 | 0
(1 row)
So I think we should just forbid tableoid along with other system
columns, as attached.
regards, tom lane
Attachments:
disallow-wholerow-and-system-cols-in-GENERATED.patch (text/x-diff)
Hi,
Thank you very much for the quick response and feedback. I completely
understand your point, Tom. And I can go back to using triggers
instead. After all, this whole "generated columns" feature is just
syntactic sugar. In my real case, the function accepts a row containing
dozens of columns and returns a SHA-1 hash that must be unique,
following pretty sophisticated business logic. Something like: if type
= X and subtype = Y then combine these fields, else if ... and so on.
That's why it's so convenient to pass the whole row.
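Just to give an idea of the shape of it (a made-up sketch, not my real code;
it uses the built-in sha256() instead of SHA-1 to avoid the pgcrypto
dependency, and all table/column names are invented):
create or replace function calc_req_hash(r requests) returns bytea
language plpgsql immutable as
$$ begin
  if r.type = 'X' and r.subtype = 'Y' then
    return sha256(convert_to(concat_ws('|', r.customer, r.item), 'UTF8'));
  else
    return sha256(convert_to(concat_ws('|', r.customer, r.item, r.quantity), 'UTF8'));
  end if;
end; $$;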
For the record, I think that passing NULL as a value for all generated
columns would not be such a bad idea, because that's exactly what NULL
represents - an unknown value. And I agree that it would be insane to
rely on the order of calculation, if someone decided to read a value
that is still being computed. It reminds me of the famous "mutating
table" issue while using triggers.
As to the "NOT NULL" and other sorts of constraints - it's also fine,
because integrity constraints are applied later. Just to illustrate my
idea:
create table foo(
id serial,
val text,
hash bytea not null unique
);
insert into foo(val) values('A');
If I had a "before insert on foo for each row" trigger, what would be
the initial "hash" value in it? It would be NULL.
Can I temporarily assign "NEW.hash := NULL" in this trigger, until I
have not yet reached the "return NEW" statement? Yes, I can.
Can I temporarily assign a non-unique value to "NEW.hash"? Yes, I can.
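To make that concrete, a minimal sketch continuing the foo example (the
trigger name and the sha256-based hash are made up for illustration):
create or replace function foo_set_hash() returns trigger
language plpgsql as
$$ begin
  raise notice 'incoming hash: %', new.hash;        -- NULL at this point
  new.hash := null;                                 -- allowed here...
  new.hash := sha256(convert_to(new.val, 'UTF8'));  -- ...as long as it is valid by the time we return
  return new;                                       -- NOT NULL and UNIQUE are checked only after this
end; $$;
create trigger trg_foo_set_hash
  before insert on foo
  for each row execute function foo_set_hash();
insert into foo(val) values ('A');                  -- now succeeds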
Anyway, I trust your discretion. Thanks!
Regards,
Vitaly Ustinov
Tom Lane wrote:
TBH, I think that this is insane and needs to be forbidden. What value
are you expecting that the function would see in the column of the
whole-row var that corresponds to the generated column? It surely
cannot be passed the value that it hasn't computed yet.
Perhaps this problem has already been addressed and resolved.
https://www.postgresql.org/docs/13/trigger-definition.html
Stored generated columns are computed after BEFORE triggers and
before AFTER triggers. Therefore, the generated value can be inspected
in AFTER triggers. In BEFORE triggers, the OLD row contains the old
generated value, as one would expect, but the NEW row does not yet
contain the new generated value and should not be accessed.
In the C language interface, the content of the column is undefined at
this point; a higher-level programming language should prevent access
to a stored generated column in the NEW row in a BEFORE trigger.
Bottom line: this is the responsibility of the app developers.
Disclaimer: we warned you.
Tom Lane wrote:
However, I think we ought to disallow the case instead. I observe that
we already disallow generated columns depending on each other.
Right, and a function getting a whole-row var can bypass this limitation.
But let's take a look at the same documentation page again and check
what it says about interdependent triggers:
If more than one trigger is defined for the same event on the same
relation, the triggers will be fired in alphabetical order by trigger name.
In the case of BEFORE and INSTEAD OF triggers, the possibly-modified
row returned by each trigger becomes the input to the next trigger.
Similar to the described behavior, a solution could be to calculate the
generated columns in alphabetical order. But I'd rather use the same
formulation, "should not be accessed", and shift the responsibility to the
app developers.
Regards,
Vitaly Ustinov
After looking at this further, I see that there already is a check
for system columns in GENERATED expressions; it's just in the
parser (scanNSItemForColumn), not where I was looking for it.
And it explicitly exempts tableoid. I continue to think that
that's a bad idea and we're gonna regret it, but at least the
issue in ATRewriteTable turns out to be trivial to fix.
So I did that and upgraded the test scenario to be a bit
more realistic, in 0001 below. 0002 then just disallows
whole-row variables. (I added an errdetail trying to
explain that restriction, too.)
It's worth pointing out here that what seems to me to be a
"more realistic" test scenario involves a regclass constant
for the table's own OID. If that doesn't scare you, it should,
because it implies all kinds of constraints on the order in which
these expressions are processed during CREATE or ALTER TABLE.
The test passes check-world, which includes dump/reload, but I am
sorely afraid that there are ways in which this sort of thing
would fail. Do we *really* want to buy into supporting it?
regards, tom lane
In a BEFORE trigger (PL/pgSQL) I can access not-yet-computed generated
columns via the "NEW" whole-row var. I can read them (they are NULL), and I
can write to them, but the assigned value will be ignored.
This behavior seems obvious and well thought through to me. It's not
something app developers should do, because it wouldn't make sense,
but at least it's not forbidden and the server process does not crash
with a seg fault.
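For example, a self-contained sketch of what I mean (all names made up):
create table demo(a text, b int,
  gen_val text generated always as (a || '-' || b::text) stored);
create or replace function demo_before() returns trigger
language plpgsql as
$$ begin
  raise notice 'gen_val in NEW: %', new.gen_val;  -- prints <NULL> here
  new.gen_val := 'whatever';                      -- accepted, but the value is ignored:
  return new;                                     -- generated columns are computed afterwards
end; $$;
create trigger trg_demo_before
  before insert or update on demo
  for each row execute function demo_before();
insert into demo values ('X', 9);                 -- gen_val ends up as 'X-9', not 'whatever'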
Is it okay with you that the patch 0002 introduces some inconsistency with that?
Regards,
Vitaly
Vitaly Ustinov <vitaly@ustinov.ca> writes:
In a BEFORE trigger (PL/pgSQL) I can access not yet computed generated
columns via the "NEW" whole-row var. I can read from it (NULL), and I
can write to it, but the assigned value will be ignored.
NEW is not really a whole-row var in the sense that we're discussing
here. In any case, yes, nodeModifyTable runs computation of generated
columns after firing BEFORE triggers. I suppose that's to prevent
the triggers from interfering with those values.
Is it okay with you that the patch 0002 introduces some inconsistency with that?
In what way would this introduce (new) inconsistency? I didn't move
the computation of those values from where they were. Moreover, so
far as I can see, ATRewriteTable doesn't fire BEFORE triggers at all.
BTW, while looking around nodeModifyTable, I noted some other gaps in our
support for "tableoid" in GENERATED expressions: the FDW code paths in
ExecInsert and ExecUpdate fail to fill tts_tableOid till after running
ExecComputeStoredGenerated. So they have the same bug that 0001 fixes
in ATRewriteTable. It can be fixed the same way, ie move the code
that fills tts_tableOid; but I remain very concerned about how many
other gaps there are.
regards, tom lane