PG 14.5 -- Impossible to restore dump due to interaction/order of views, functions, and generated columns

Started by Nunya Businessover 3 years ago6 messagesbugsgeneral
Jump to latest
#1Nunya Business
nb3425586@gmail.com
bugsgeneral

Good afternoon,

I've recently run into a weird issue that I'm trying to gather more data
on before sending an official bug report on the off chance that it's
already been addressed.

Within my schema there is a table that has a GENERATED ALWAYS column
that calls a plpgsql function. The called function has a "row type"
variable declared that references a view. While the schema itself
functions properly day to day, and pg_dumpall works as expected, the
generated SQL fails to successfully execute. The table in question is
restored with no rows, and an error is generated during the COPY stating
that the type does not exist.

The issue appears to be that the COPY statement for the data is trying
to execute the function specified for the GENERATED ALWAYS column, and
that function cannot run because the view that the function references
does not yet exist.

The dump was made with: pg_dumpall -c --quote-all-identifiers
--exclude-database=postgres --exclude-database=template0
--exclude-database=template1

Is this a known or unknown issue, or am I just missing something?

Any insight is appreciated. Please reply-all as I'm not currently
subscribed to the list. Thanks in advance!

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nunya Business (#1)
bugsgeneral
Re: PG 14.5 -- Impossible to restore dump due to interaction/order of views, functions, and generated columns

"Nunya Business" <nb3425586@gmail.com> writes:

Within my schema there is a table that has a GENERATED ALWAYS column
that calls a plpgsql function. The called function has a "row type"
variable declared that references a view. While the schema itself
functions properly day to day, and pg_dumpall works as expected, the
generated SQL fails to successfully execute. The table in question is
restored with no rows, and an error is generated during the COPY stating
that the type does not exist.

Hmm, do you have actually circular dependencies in that? pg_dump has
some heuristics for dealing with such cases, but maybe it needs more.
Please create a self-contained example and submit it to pgsql-bugs.

regards, tom lane

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Nunya Business (#1)
bugsgeneral
Re: PG 14.5 -- Impossible to restore dump due to interaction/order of views, functions, and generated columns

On 12/5/22 11:49, Nunya Business wrote:

Good afternoon,

I've recently run into a weird issue that I'm trying to gather more data
on before sending an official bug report on the off chance that it's
already been addressed.

Within my schema there is a table that has a GENERATED ALWAYS column
that calls a plpgsql function.  The called function has a "row type"
variable declared that  references a view.  While the schema itself
functions properly day to day, and pg_dumpall works as expected, the
generated SQL fails to successfully execute.  The table in question is
restored with no rows, and an error is generated during the COPY stating
that the type does not exist.

The issue appears to be that the COPY statement for the data is trying
to execute the function specified for the GENERATED ALWAYS column, and
that function cannot run because the view that the function references
does not yet exist.

The dump was made with: pg_dumpall -c --quote-all-identifiers
--exclude-database=postgres --exclude-database=template0
--exclude-database=template1

Is this a known or unknown issue, or am I just missing something?

https://www.postgresql.org/docs/current/sql-createtable.html

"GENERATED ALWAYS AS ( generation_expr ) STORED

This clause creates the column as a generated column. The column
cannot be written to, and when read the result of the specified
expression will be returned.

The keyword STORED is required to signify that the column will be
computed on write and will be stored on disk.

The generation expression can refer to other columns in the table,
but not other generated columns. Any functions and operators used must
be immutable. **References to other tables are not allowed.**
"

Emphasis(**) added.

I'm going to say hiding the table/view reference in a function is not
going to work any better then when folks try that in a CHECK constraint.

Any insight is appreciated.  Please reply-all as I'm not currently
subscribed to the list.  Thanks in advance!

--
Adrian Klaver
adrian.klaver@aklaver.com

#4Nunya Business
nb3425586@gmail.com
In reply to: Tom Lane (#2)
bugsgeneral
Re[2]: PG 14.5 -- Impossible to restore dump due to interaction/order of views, functions, and generated columns

"Nunya Business" <nb3425586@gmail.com> writes:

Within my schema there is a table that has a GENERATED ALWAYS column that calls a plpgsql function. The called function has a "row type" variable declared that references a view. While the schema itself functions properly day to day, and pg_dumpall works as expected, the generated SQL fails to successfully execute. The table in question is restored with no rows, and an error is generated during the COPY stating that the type does not exist.

Hmm, do you have actually circular dependencies in that? pg_dump has
some heuristics for dealing with such cases, but maybe it needs more.
Please create a self-contained example and submit it to pgsql-bugs.

regards, tom lane

Thanks Tom. There are indeed circular references in the schema and the
whole thing sort of doesn't pass the smell test, but this is my first
look at it. The generated column on the table calls a function which
selects from a view that references the table. The production schema
where I ran into this is pretty large and complex, so the contrived
example that follows may not be the minimum working example but it's
pretty small and has the same behavior regarding the SQL generated by
pg_dumpall.

It seems that the schema is probably invalid according to the GENERATED
rules and that pg_dumpall is operating as intended, but somehow the
check in the ALTER TABLE isn't deep enough to prevent the issue, but
maybe I'm mistaken. Once this is created, if you insert a few rows and
execute pg_dumpall, the resulting SQL cannot be loaded and will fail
during the COPY, complaining that the view referenced by the function
doesn't exist.

Here is the schema. CCing pgsql-bugs as requested.

--------------------------CUT
CREATE TABLE tblA (
id serial unique not null,
dt timestamp with time zone not null default now(),
data text
);

CREATE OR REPLACE VIEW viewA as (
SELECT sum(id) FROM tblA
);

CREATE OR REPLACE FUNCTION fnA(askid INTEGER) RETURNS integer
LANGUAGE 'plpgsql' IMMUTABLE
AS $$

declare
varA viewA;
ret integer;

begin
SELECT viewA.*
INTO varA
FROM viewA;

ret = varA.sum;

return ret;
end;
$$;

ALTER TABLE tblA ADD COLUMN idsum INTEGER GENERATED ALWAYS AS (fnA(id))
STORED;

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nunya Business (#4)
bugsgeneral
Re: Re[2]: PG 14.5 -- Impossible to restore dump due to interaction/order of views, functions, and generated columns

"Nunya Business" <nb3425586@gmail.com> writes:

Thanks Tom. There are indeed circular references in the schema and the
whole thing sort of doesn't pass the smell test, but this is my first
look at it. The generated column on the table calls a function which
selects from a view that references the table. The production schema
where I ran into this is pretty large and complex, so the contrived
example that follows may not be the minimum working example but it's
pretty small and has the same behavior regarding the SQL generated by
pg_dumpall.

Hm. The actual problem here is that fnA() is making unwarranted
assumptions about the search_path it's run under, so it fails when the
pg_dump script invokes it with a restrictive search_path. If you change
the function text so that the references to viewA are schema-qualified,
then it restores without errors.

"Doesn't pass the smell test" is putting it mildly, btw. Labeling
that function IMMUTABLE is a huge lie, and what it means is that
your GENERATED column doesn't have the amount of stability that
it's supposed to per spec. I'm not sure exactly what sorts of
misbehaviors might ensue from that, but I'm pretty certain that the
data in the GENERATED column after dump/restore won't match what
you had there beforehand.

regards, tom lane

#6Nunya Business
nb3425586@gmail.com
In reply to: Tom Lane (#5)
bugsgeneral
Re[4]: PG 14.5 -- Impossible to restore dump due to interaction/order of views, functions, and generated columns

Tom,

Thanks for the insight, I didn't even consider the search path being an
issue and I should have. I saw it explicitly specified in other parts
of the dump and just assumed it was being done in the function as well.
For example, the CREATE statements in the dump output all specify the
schema name even though it's not specified in the original statements.
I suppose expecting statements in the function body to be similarly
qualified in the dump is unreasonable considering the complexity of
doing so.

Anyway, thanks for the tip. If I run into something like this again in
the future I'll have a better idea of where to look and how to recover
what data I can in an emergency.

------ Original Message ------