Attribute of type record has wrong type error with MERGE ... WHEN NOT MATCHED BY SOURCE THEN DELETE
Postgresql version 17.4 (Ubuntu 17.4-1.pgdg24.10+2) on x86_64-pc-linux-gnu
To reproduce, execute the statements in the attached file cr.sql. I get:
duncan=> \i cr.sql
CREATE TABLE
CREATE TABLE
CREATE VIEW
CREATE TABLE
COPY 1
COPY 2
COPY 1
psql:cr.sql:42: ERROR: attribute 2 of type record has wrong type
DETAIL: Table has type _country_or_region, but query expects record.
I attribute it to the "WHEN NOT MATCHED BY SOURCE THEN DELETE" part of the MERGE
as it doesn't happen if that part is left off.
Best wishes, Duncan.
Attachments:
Duncan Sands <duncan.sands@deepbluecap.com> 于2025年3月10日周一 18:43写道:
Postgresql version 17.4 (Ubuntu 17.4-1.pgdg24.10+2) on x86_64-pc-linux-gnu
To reproduce, execute the statements in the attached file cr.sql. I get:
duncan=> \i cr.sql
CREATE TABLE
CREATE TABLE
CREATE VIEW
CREATE TABLE
COPY 1
COPY 2
COPY 1
psql:cr.sql:42: ERROR: attribute 2 of type record has wrong type
DETAIL: Table has type _country_or_region, but query expects record.I attribute it to the "WHEN NOT MATCHED BY SOURCE THEN DELETE" part of the
MERGE
as it doesn't happen if that part is left off.
Yeah, I can reproduce this on HEAD, but on 17.0, no error happened.
I searched commit history, I found that this error was related to d7d297f84.
commit d7d297f8449641bfd71750d04c302572a350052c
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Date: Thu Oct 3 12:50:38 2024 +0100
Fix incorrect non-strict join recheck in MERGE WHEN NOT MATCHED BY
SOURCE.
If a MERGE command contains WHEN NOT MATCHED BY SOURCE actions, the
merge join condition is used by the executor to distinguish MATCHED
from NOT MATCHED BY SOURCE cases. However, this qual is executed using
the output from the join subplan node, which nulls the output from the
source relation in the not matched case, and so the result may be
incorrect if the join condition is "non-strict" -- for example,
something like "src.col IS NOT DISTINCT FROM tgt.col".
Reverted above commit, cr.sql succeeded.
psql (17.0)
Type "help" for help.
postgres=# \i /workspace/cr.sql
CREATE TABLE
CREATE TABLE
CREATE VIEW
CREATE TABLE
COPY 1
COPY 2
COPY 1
MERGE 0
--
Thanks,
Tender Wang
Duncan Sands <duncan.sands@deepbluecap.com> 于2025年3月10日周一 18:43写道:
Postgresql version 17.4 (Ubuntu 17.4-1.pgdg24.10+2) on x86_64-pc-linux-gnu
To reproduce, execute the statements in the attached file cr.sql. I get:
duncan=> \i cr.sql
CREATE TABLE
CREATE TABLE
CREATE VIEW
CREATE TABLE
COPY 1
COPY 2
COPY 1
psql:cr.sql:42: ERROR: attribute 2 of type record has wrong type
DETAIL: Table has type _country_or_region, but query expects record.I attribute it to the "WHEN NOT MATCHED BY SOURCE THEN DELETE" part of the
MERGE
as it doesn't happen if that part is left off.
When the query has NOT MATCHED BY SOURCE, commit d7d297f84 add "src IS NOT
NULL" join condition.
In this case, the src is view(e.g. subquery), so in makeWholeRowVar(), it
will call below code:
result = makeVar(varno,
InvalidAttrNumber,
RECORDOID,
-1,
InvalidOid,
varlevelsup);
the vartype is RECORDOID, but te reltype of src is not RECORDOID, so
$SUBJECT error reports.
I add the below codes to makeWholeRowVar() default branch:
if (rte->relkind == RELKIND_VIEW)
toid = get_rel_type_id(rte->relid);
else
toid = RECORDOID;
It can work.
--
Thanks,
Tender Wang
Tender Wang <tndrwang@gmail.com> writes:
When the query has NOT MATCHED BY SOURCE, commit d7d297f84 add "src IS NOT
NULL" join condition.
In this case, the src is view(e.g. subquery), so in makeWholeRowVar(), it
will call below code:
result = makeVar(varno,
InvalidAttrNumber,
RECORDOID,
-1,
InvalidOid,
varlevelsup);
the vartype is RECORDOID, but te reltype of src is not RECORDOID, so
$SUBJECT error reports.
Hmm. I tried adjusting the example to make _country_or_region
be a materialized view or plain table instead of a view, and
those cases did not fail. I wonder why the difference...
I add the below codes to makeWholeRowVar() default branch:
if (rte->relkind == RELKIND_VIEW)
toid = get_rel_type_id(rte->relid);
else
toid = RECORDOID;
It can work.
I'm quite uncomfortable with the idea of changing makeWholeRowVar()
itself in this way --- the potential blast radius from that seems
rather large. A localized fix in transform_MERGE_to_join() might
be a wiser answer.
regards, tom lane
On Mon, 10 Mar 2025 at 13:46, Tender Wang <tndrwang@gmail.com> wrote:
When the query has NOT MATCHED BY SOURCE, commit d7d297f84 add "src IS NOT NULL" join condition.
In this case, the src is view(e.g. subquery), so in makeWholeRowVar(), it will call below code:
result = makeVar(varno,
InvalidAttrNumber,
RECORDOID,
-1,
InvalidOid,
varlevelsup);the vartype is RECORDOID, but te reltype of src is not RECORDOID, so $SUBJECT error reports.
I add the below codes to makeWholeRowVar() default branch:
if (rte->relkind == RELKIND_VIEW)
toid = get_rel_type_id(rte->relid);
else
toid = RECORDOID;It can work.
Yes, I reached the same conclusion.
When the parser processes the "AND qq_src IS DISTINCT FROM qq_tgt"
clause, it creates a whole-row Var for qq_src whose type is the view
type. Then transform_MERGE_to_join() adds another whole-row Var for
qq_src, but by this time the RTE has been expanded into a subquery
RTE, so its type becomes RECORDOID. The executor then grumbles because
it has 2 Vars with the same varno and varattno, but different
vartypes.
Fixing that by having makeWholeRowVar() set the type based on
rte->relid for subquery RTEs that used to be views seems like a good
fix. However, it looks like that fix will only work as far back as v16
(where 47bb9db7599 and 0f8cfaf8921 were added).
Unfortunately, it looks like this bug pre-dates MERGE WHEN NOT MATCHED
BY SOURCE, and even MERGE itself. All that's needed to trigger it is a
query that causes 2 whole-row Vars to be added, one before and one
after view expansion. That can be made to happen via the rowmarking
mechanism in all supported branches as follows:
create table foo (a int, b int);
create view foo_v as select * from foo offset 0;
insert into foo values (1,2);
update foo set b = foo_v.b from foo_v where foo_v.a = foo.a returning foo_v;
which fails in the same way, with
ERROR: attribute 3 of type record has wrong type
DETAIL: Table has type record, but query expects foo_v.
Reading the commit message for 47bb9db7599 suggests that maybe it
would be OK to further back-patch the changes to ApplyRetrieveRule()
to retain relkind and relid on subquery RTEs for this purpose. That
wouldn't affect stored rules, but I haven't looked to see what else it
might affect.
Thoughts?
Regards,
Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Unfortunately, it looks like this bug pre-dates MERGE WHEN NOT MATCHED
BY SOURCE, and even MERGE itself. All that's needed to trigger it is a
query that causes 2 whole-row Vars to be added, one before and one
after view expansion. That can be made to happen via the rowmarking
mechanism in all supported branches as follows:
Ugh, right. So I withdraw my objection to fixing this in
makeWholeRowVar: all of the post-rewrite calls have need for this
behavior. However, the proposed code change is wrong in detail.
The existing places that are checking for this situation are doing
tests like
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)))
I don't believe that checking relkind instead is an improvement.
Reading the commit message for 47bb9db7599 suggests that maybe it
would be OK to further back-patch the changes to ApplyRetrieveRule()
to retain relkind and relid on subquery RTEs for this purpose. That
wouldn't affect stored rules, but I haven't looked to see what else it
might affect.
Yeah, I think we can likely get away with that. We cannot back-patch
the changes that added relid to the outfuncs/readfuncs representation,
which means that the RTE's relid won't propagate to parallel workers,
but I don't see why they'd need it. We only need that info to get
as far as planning. I've not tried though.
Draft HEAD patch attached.
regards, tom lane
Attachments:
generate-correct-vartype-for-whole-row-view-refs.patchtext/x-diff; charset=us-ascii; name=generate-correct-vartype-for-whole-row-view-refs.patchDownload+55-3
I wrote:
Yeah, I think we can likely get away with that. We cannot back-patch
the changes that added relid to the outfuncs/readfuncs representation,
which means that the RTE's relid won't propagate to parallel workers,
but I don't see why they'd need it. We only need that info to get
as far as planning. I've not tried though.
OK, the attached patch for v15 passes check-world, with or without
force_parallel_mode. I'm inclined to commit the rewriteHandler.c
and parsenodes.h bits in a separate patch for commit log visibility.
It would be easy enough to leave the RTE's relkind and/or rellockmode
alone too, but I think the conservative approach is to not change
more than we have to in these old branches. There is some attraction
to making the behavior more like the newer branches, but still ...
regards, tom lane
Attachments:
generate-correct-vartype-for-whole-row-view-refs-15.patchtext/x-diff; charset=us-ascii; name=generate-correct-vartype-for-whole-row-view-refs-15.patchDownload+65-4
On Mon, 10 Mar 2025 at 19:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Yeah, I think we can likely get away with that. We cannot back-patch
the changes that added relid to the outfuncs/readfuncs representation,
which means that the RTE's relid won't propagate to parallel workers,
but I don't see why they'd need it. We only need that info to get
as far as planning. I've not tried though.OK, the attached patch for v15 passes check-world, with or without
force_parallel_mode. I'm inclined to commit the rewriteHandler.c
and parsenodes.h bits in a separate patch for commit log visibility.
That looks good to me, on a quick read-through.
However, that's not quite the end of it -- preprocess_function_rtes()
/ inline_set_returning_function() can turn a function RTE into a
subquery RTE, leading to a similar problem:
create table foo (a int, b int);
insert into foo values (1,2);
create or replace function f() returns setof foo as
$$ select * from foo offset 0 $$ language sql stable;
update foo set b = f.b from f() as f(a,b) where f.a = foo.a returning f;
ERROR: attribute 3 of type record has wrong type
DETAIL: Table has type record, but query expects foo.
I tried looking for other places that change an RTE's rtekind, and
haven't managed to find any other problems, but may have missed
something.
Regards,
Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
However, that's not quite the end of it -- preprocess_function_rtes()
/ inline_set_returning_function() can turn a function RTE into a
subquery RTE, leading to a similar problem:
create table foo (a int, b int);
insert into foo values (1,2);
create or replace function f() returns setof foo as
$$ select * from foo offset 0 $$ language sql stable;
update foo set b = f.b from f() as f(a,b) where f.a = foo.a returning f;
ERROR: attribute 3 of type record has wrong type
DETAIL: Table has type record, but query expects foo.
Double ugh. I guess we could get preprocess_function_rtes to
insert the appropriate relid ...
regards, tom lane
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
I tried looking for other places that change an RTE's rtekind, and
haven't managed to find any other problems, but may have missed
something.
The only other place I can find that converts some other kind of RTE
to a subquery is inline_cte(), which converts RTE_CTE to RTE_SUBQUERY.
That seems immune to the present problem because the appropriate
vartype would be RECORD in either case.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月11日周二 03:11写道:
I wrote:
Yeah, I think we can likely get away with that. We cannot back-patch
the changes that added relid to the outfuncs/readfuncs representation,
which means that the RTE's relid won't propagate to parallel workers,
but I don't see why they'd need it. We only need that info to get
as far as planning. I've not tried though.OK, the attached patch for v15 passes check-world, with or without
force_parallel_mode. I'm inclined to commit the rewriteHandler.c
and parsenodes.h bits in a separate patch for commit log visibility.
The attached patch looks better than my proposed code.
LGTM.
--
Thanks,
Tender Wang
I wrote:
Double ugh. I guess we could get preprocess_function_rtes to
insert the appropriate relid ...
OK, that was less painful than I feared. makeWholeRowVar has
several different special cases for RTE_FUNCTION, but most of them
don't bear on this problem, because we wouldn't have applied inlining
when they did. It seems sufficient to fetch pg_type.typrelid for
the function's nominal return type and store that if it's not 0.
Patches for HEAD and v15 attached. I haven't checked other branches
but I expect the deltas won't be big.
regards, tom lane
Attachments:
v2-generate-correct-vartype-for-whole-row-vars-HEAD.patchtext/x-diff; charset=us-ascii; name=v2-generate-correct-vartype-for-whole-row-vars-HEAD.patchDownload+106-4
v2-generate-correct-vartype-for-whole-row-vars-15.patchtext/x-diff; charset=us-ascii; name=v2-generate-correct-vartype-for-whole-row-vars-15.patchDownload+113-4
On Tue, 11 Mar 2025 at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Double ugh. I guess we could get preprocess_function_rtes to
insert the appropriate relid ...OK, that was less painful than I feared. makeWholeRowVar has
several different special cases for RTE_FUNCTION, but most of them
don't bear on this problem, because we wouldn't have applied inlining
when they did. It seems sufficient to fetch pg_type.typrelid for
the function's nominal return type and store that if it's not 0.
Hmm, this introduces a new problem. Testing a case with a composite type:
create table foo (a int, b int);
insert into foo values (1,2);
create type t as (a int, b int);
create or replace function f() returns setof t as
$$ select 1,2 from foo offset 0 $$ language sql stable;
then doing
update foo set b = f.b from f() where f.a = foo.a returning f;
or even just
select f from f();
triggers an Assert() in relation_open() from
get_relation_data_width(), from set_rel_width() because now that the
RTE has a relid, it attempts to open it, without having previously
locked it. Maybe we need to not clear rte->functions, and use that
instead.
Regards,
Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Hmm, this introduces a new problem. Testing a case with a composite type:
create table foo (a int, b int);
insert into foo values (1,2);
create type t as (a int, b int);
create or replace function f() returns setof t as
$$ select 1,2 from foo offset 0 $$ language sql stable;
then doing
update foo set b = f.b from f() where f.a = foo.a returning f;
or even just
select f from f();
triggers an Assert() in relation_open() from
get_relation_data_width(), from set_rel_width() because now that the
RTE has a relid, it attempts to open it, without having previously
locked it.
Geez, our regression tests seem quite lacking in this area.
I'm wondering why we're trying to do relation_open on a SUBQUERY
RTE, relid or no relid. It seems kind of accidental that
set_rel_width() doesn't fail on subqueries given this coding.
Even without actual failure, looking to the referenced relation for
width estimates for a subquery seems like a pretty broken idea.
However, this does indicate that putting a composite type's relid
into the RTE is more dangerous than I thought --- there may be
other code out there doing things similar to set_rel_width().
Maybe we need to not clear rte->functions, and use that
instead.
At first I didn't like that idea a bit, and it's still rather ugly,
but it does have two advantages:
* it seems less likely to break anyone's assumptions about the data
structure;
* we'd avoid a purely speculative catcache fetch in
preprocess_function_rtes, which most of the time is a waste of effort
because no subsequent makeWholeRowVar will happen.
We could still clear rte->functions during setrefs.c, so that this
abuse of the data structure is purely local to the planner.
I'll have a go at coding it that way...
regards, tom lane
I wrote:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Maybe we need to not clear rte->functions, and use that
instead.
I'll have a go at coding it that way...
Yeah, that is a better way. In exchange for a slightly dirtier
data structure, we now have essentially all the relevant code in
makeWholeRowVar: correctness only depends on its different case
branches agreeing, rather than on some not-terribly-similar code
way over in prepjointree.c.
I also took the opportunity to split off the old-branch adjustment
of rewriteHandler.c, so that the HEAD and v15 versions of the main
bug fix patch are nearly the same.
regards, tom lane
Attachments:
v3-generate-correct-vartype-for-whole-row-vars-HEAD.patchtext/x-diff; charset=us-ascii; name=v3-generate-correct-vartype-for-whole-row-vars-HEAD.patchDownload+136-5
v3-0001-preserve-relid-15.patchtext/x-diff; charset=us-ascii; name=v3-0001-preserve-relid-15.patchDownload+12-2
v3-0002-generate-correct-vartype-for-whole-row-vars-15.patchtext/x-diff; charset=us-ascii; name*0=v3-0002-generate-correct-vartype-for-whole-row-vars-15.patc; name*1=hDownload+134-4
On Tue, 11 Mar 2025 at 21:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, that is a better way. In exchange for a slightly dirtier
data structure, we now have essentially all the relevant code in
makeWholeRowVar: correctness only depends on its different case
branches agreeing, rather than on some not-terribly-similar code
way over in prepjointree.c.
Agreed. Having all the code in one place makes it easier to see that
it's doing the same thing before and after RTE expansion.
I also took the opportunity to split off the old-branch adjustment
of rewriteHandler.c, so that the HEAD and v15 versions of the main
bug fix patch are nearly the same.
LGTM. I did some more testing and thought about it a little more, and
I can't see any other ways to break it.
Regards,
Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
LGTM. I did some more testing and thought about it a little more, and
I can't see any other ways to break it.
Thanks for the careful review and testing! Pushed after fooling
with the comments a tiny bit more.
regards, tom lane
Hi,
On 2025-03-12 11:49:29 -0400, Tom Lane wrote:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
LGTM. I did some more testing and thought about it a little more, and
I can't see any other ways to break it.Thanks for the careful review and testing! Pushed after fooling
with the comments a tiny bit more.
This seems to have introduce some breakage for 13-15. E.g. on
sifaka:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2025-03-12%2016%3A58%3A51
which has
'CPPFLAGS' => '-DWRITE_READ_PARSE_PLAN_TREES -DSTRESS_SORT_INT_MIN -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS'
diff -U3 /Users/buildfarm/bf-data/REL_14_STABLE/pgsql.build/src/test/regress/expected/tablespace.out /Users/buildfarm/bf-data/REL_14_STABLE/pgsql.build/src/test/regress/results/tablespace.out
--- /Users/buildfarm/bf-data/REL_14_STABLE/pgsql.build/src/test/regress/expected/tablespace.out 2025-03-12 12:59:22
+++ /Users/buildfarm/bf-data/REL_14_STABLE/pgsql.build/src/test/regress/results/tablespace.out 2025-03-12 12:59:23
@@ -242,6 +242,7 @@
-- check \\d output
\\d testschema.foo
+WARNING: outfuncs/readfuncs failed to produce equal parse tree
Table "testschema.foo"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
@@ -320,6 +321,7 @@
(3 rows)
...
Greetings,
Andres
Andres Freund <andres@anarazel.de> writes:
This seems to have introduce some breakage for 13-15. E.g. on
sifaka:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2025-03-12%2016%3A58%3A51
which has
'CPPFLAGS' => '-DWRITE_READ_PARSE_PLAN_TREES -DSTRESS_SORT_INT_MIN -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS'
Ugh. I supposed that it was okay that 317aba70e etc. didn't touch
outfuncs/readfuncs, but I did not think of
-DWRITE_READ_PARSE_PLAN_TREES.
Perhaps a good hack to deal with that is to make setrefs.c clear
out relid for RTE_SUBQUERY RTEs in those branches. Then, in the
same way that the rte->function hack doesn't escape the planner,
this one wouldn't either.
regards, tom lane
I wrote:
Ugh. I supposed that it was okay that 317aba70e etc. didn't touch
outfuncs/readfuncs, but I did not think of
-DWRITE_READ_PARSE_PLAN_TREES.
Perhaps a good hack to deal with that is to make setrefs.c clear
out relid for RTE_SUBQUERY RTEs in those branches. Then, in the
same way that the rte->function hack doesn't escape the planner,
this one wouldn't either.
Double ugh: that doesn't fix it, because we also do a round of
WRITE_READ_PARSE_PLAN_TREES checks on the rewriter output.
Not sure how to fix this, unless we lobotomize that write/read
check somehow.
regards, tom lane