BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

Started by PG Bug reporting formabout 2 years ago11 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 18422
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.2
Operating system: Ubuntu 22.04
Description:

The following query:
SELECT * FROM generate_series(1, 1),
COALESCE(row(1)) AS (a int, b int);
triggers an assertion failure:
TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c",
Line: 3054, PID: 151361

with the following stack trace:
...
#5 0x000055b935579366 in ExceptionalCondition
(conditionName=conditionName@entry=0x55b93569420d "count <= tupdesc->natts",
fileName=fileName@entry=0x55b9356941e6 "parse_relation.c",
lineNumber=lineNumber@entry=3054) at assert.c:66
#6 0x000055b9351841b5 in expandTupleDesc (tupdesc=0x55b9378c7660,
eref=0x55b9378c7358, count=2, offset=offset@entry=0,
rtindex=rtindex@entry=2, sublevels_up=sublevels_up@entry=0, location=-1,
include_dropped=true, colnames=0x0,
colvars=0x7fffaccae3e0) at parse_relation.c:3054
#7 0x000055b93518713a in expandRTE (rte=rte@entry=0x55b9377fc360,
rtindex=rtindex@entry=2, sublevels_up=sublevels_up@entry=0,
location=location@entry=-1, include_dropped=include_dropped@entry=true,
colnames=colnames@entry=0x0,
colvars=0x7fffaccae3e0) at parse_relation.c:2758
#8 0x000055b9353649b6 in build_physical_tlist
(root=root@entry=0x55b9378c8108, rel=rel@entry=0x55b9378c8cf0) at
plancat.c:1792
#9 0x000055b935330653 in create_scan_plan (root=root@entry=0x55b9378c8108,
best_path=best_path@entry=0x55b9378c9340, flags=<optimized out>,
flags@entry=0) at createplan.c:637
#10 0x000055b93532dd65 in create_plan_recurse
(root=root@entry=0x55b9378c8108, best_path=0x55b9378c9340,
flags=flags@entry=0) at createplan.c:411
#11 0x000055b93532f585 in create_nestloop_plan (root=0x55b9378c8108,
best_path=0x55b9377fc470) at createplan.c:4342
#12 0x000055b93532f6be in create_join_plan (root=root@entry=0x55b9378c8108,
best_path=best_path@entry=0x55b9377fc470) at createplan.c:1076
#13 0x000055b93532dd75 in create_plan_recurse
(root=root@entry=0x55b9378c8108, best_path=0x55b9377fc470,
flags=flags@entry=1) at createplan.c:416
#14 0x000055b93532e575 in create_plan (root=root@entry=0x55b9378c8108,
best_path=<optimized out>) at createplan.c:347
#15 0x000055b93533ff5b in standard_planner (parse=0x55b9377fbf78,
query_string=<optimized out>, cursorOptions=2048, boundParams=<optimized
out>) at planner.c:420
#16 0x000055b9353404f8 in planner (parse=parse@entry=0x55b9377fbf78,
query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);",
cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at planner.c:281
#17 0x000055b93542c257 in pg_plan_query
(querytree=querytree@entry=0x55b9377fbf78,
query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);",
cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at postgres.c:904
#18 0x000055b93542c30d in pg_plan_queries (querytrees=0x55b9378c80b8,
query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);",
cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at postgres.c:996
#19 0x000055b93542c7f0 in exec_simple_query
(query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);") at
postgres.c:1193

(gdb) frame 6

(gdb) p tupdesc->natts
$1 = 1

Reproduced on REL_12_STABLE .. master.

`git bisect` blames d57534740 (8a15b4178 on REL_12_STABLE).
(Thanks to Amit Langote for correcting my initial report where this issue
was hastily attributed to JSON_TABLE().)

#2Tender Wang
tndrwang@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

PG Bug reporting form <noreply@postgresql.org> 于2024年4月5日周五 18:35写道:

The following bug has been logged on the website:

Bug reference: 18422
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.2
Operating system: Ubuntu 22.04
Description:

The following query:
SELECT * FROM generate_series(1, 1),
COALESCE(row(1)) AS (a int, b int);
triggers an assertion failure:
TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c",
Line: 3054, PID: 151361

with the following stack trace:
...
#5 0x000055b935579366 in ExceptionalCondition
(conditionName=conditionName@entry=0x55b93569420d "count <=
tupdesc->natts",
fileName=fileName@entry=0x55b9356941e6 "parse_relation.c",
lineNumber=lineNumber@entry=3054) at assert.c:66
#6 0x000055b9351841b5 in expandTupleDesc (tupdesc=0x55b9378c7660,
eref=0x55b9378c7358, count=2, offset=offset@entry=0,
rtindex=rtindex@entry=2, sublevels_up=sublevels_up@entry=0, location=-1,
include_dropped=true, colnames=0x0,
colvars=0x7fffaccae3e0) at parse_relation.c:3054
#7 0x000055b93518713a in expandRTE (rte=rte@entry=0x55b9377fc360,
rtindex=rtindex@entry=2, sublevels_up=sublevels_up@entry=0,
location=location@entry=-1, include_dropped=include_dropped@entry=true,
colnames=colnames@entry=0x0,
colvars=0x7fffaccae3e0) at parse_relation.c:2758
#8 0x000055b9353649b6 in build_physical_tlist
(root=root@entry=0x55b9378c8108, rel=rel@entry=0x55b9378c8cf0) at
plancat.c:1792
#9 0x000055b935330653 in create_scan_plan (root=root@entry
=0x55b9378c8108,
best_path=best_path@entry=0x55b9378c9340, flags=<optimized out>,
flags@entry=0) at createplan.c:637
#10 0x000055b93532dd65 in create_plan_recurse
(root=root@entry=0x55b9378c8108, best_path=0x55b9378c9340,
flags=flags@entry=0) at createplan.c:411
#11 0x000055b93532f585 in create_nestloop_plan (root=0x55b9378c8108,
best_path=0x55b9377fc470) at createplan.c:4342
#12 0x000055b93532f6be in create_join_plan (root=root@entry
=0x55b9378c8108,
best_path=best_path@entry=0x55b9377fc470) at createplan.c:1076
#13 0x000055b93532dd75 in create_plan_recurse
(root=root@entry=0x55b9378c8108, best_path=0x55b9377fc470,
flags=flags@entry=1) at createplan.c:416
#14 0x000055b93532e575 in create_plan (root=root@entry=0x55b9378c8108,
best_path=<optimized out>) at createplan.c:347
#15 0x000055b93533ff5b in standard_planner (parse=0x55b9377fbf78,
query_string=<optimized out>, cursorOptions=2048, boundParams=<optimized
out>) at planner.c:420
#16 0x000055b9353404f8 in planner (parse=parse@entry=0x55b9377fbf78,
query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);",
cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at planner.c:281
#17 0x000055b93542c257 in pg_plan_query
(querytree=querytree@entry=0x55b9377fbf78,
query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);",
cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at postgres.c:904
#18 0x000055b93542c30d in pg_plan_queries (querytrees=0x55b9378c80b8,
query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);",
cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at postgres.c:996
#19 0x000055b93542c7f0 in exec_simple_query
(query_string=query_string@entry=0x55b9377fa888 "SELECT * FROM
generate_series(1, 1),\n COALESCE(row(1)) AS (a int, b int);") at
postgres.c:1193

(gdb) frame 6

(gdb) p tupdesc->natts
$1 = 1

Reproduced on REL_12_STABLE .. master.

`git bisect` blames d57534740 (8a15b4178 on REL_12_STABLE).
(Thanks to Amit Langote for correcting my initial report where this issue
was hastily attributed to JSON_TABLE().)

Thanks for reporting this issue.
Before d57534740, it returned TYPEFUNC_RECORD in get_expr_result_type().
So it would not call expandTupleDesc(), but it would ereport in
tupledesc_match().

I copy the ereport code in tupledesc_match() into expandRTE() if
funcolcount is larger
than natts. The attached patch looks like a workaround fix.

--
Tender Wang
OpenPie: https://en.openpie.com/

Attachments:

0001-Fix-RowExpr-with-constant-value-as-COALESCE-argument.patchapplication/octet-stream; name=0001-Fix-RowExpr-with-constant-value-as-COALESCE-argument.patchDownload+16-1
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tender Wang (#2)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

Tender Wang <tndrwang@gmail.com> writes:

The following bug has been logged on the website:
Bug reference: 18422
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.2
Operating system: Ubuntu 22.04
Description:

The following query:
SELECT * FROM generate_series(1, 1),
COALESCE(row(1)) AS (a int, b int);
triggers an assertion failure:
TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c",
Line: 3054, PID: 151361

I copy the ereport code in tupledesc_match() into expandRTE() if
funcolcount is larger
than natts. The attached patch looks like a workaround fix.

I think that's mostly a kluge. It seems to me that the actual problem
is that 2ed8f9a01 missed some places that need fixing. The policy
that that commit intended to institute is "if a RangeTblFunction has a
coldeflist, then the function return type is certainly RECORD, and we
don't need to try to dig the column specifications out of the function
expression" (which dodges the problem that the function expression
might not produce what we're expecting). However, expandRTE didn't
get that memo, and would produce the wrong tupdesc in an example such
as this one. IMO, it ought to produce the query-specified columns in
all cases, even when the function expression doesn't match. Throwing
a premature error isn't better, especially if it only detects one
kind of mismatch.

I dug around looking at other callers of get_expr_result_type and
get_expr_result_tupdesc, and found a couple other places that aren't
actively buggy but can be made faster and more robust by adding
this same rule.

There is a remaining place where we're arguably misusing
get_expr_result_type, which is init_sexpr in execSRF.c. However,
that will just result in postponing the eventual error, because
we'll still cross-check the function result tupdesc against the
query-expected one later. So I think it's okay as-is; at least
it doesn't seem worth the trouble to refactor so that init_sexpr
would have access to the correct RangeTblFunction.

BTW, I wondered why the test cases we added for 2ed8f9a01 didn't
catch this, because they sure look superficially similar. The
reason seems to be that we fail because this query produces a join
plan in which we invoke build_physical_tlist for the FunctionScan,
and that's what's calling expandRTE and hitting the assertion.
The older test cases get simplified sufficiently that there's no
join but just a FunctionScan, and that plan node will be required
to produce the query-specified tlist not a physical tlist, so we
accidentally avoid reaching the assertion.

So I end with the attached.

regards, tom lane

Attachments:

v2-fix-bug-18422.patchtext/x-diff; charset=us-ascii; name=v2-fix-bug-18422.patchDownload+27-11
#4Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#3)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

On Wed, Apr 10, 2024 at 3:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think that's mostly a kluge. It seems to me that the actual problem
is that 2ed8f9a01 missed some places that need fixing. The policy
that that commit intended to institute is "if a RangeTblFunction has a
coldeflist, then the function return type is certainly RECORD, and we
don't need to try to dig the column specifications out of the function
expression" (which dodges the problem that the function expression
might not produce what we're expecting). However, expandRTE didn't
get that memo, and would produce the wrong tupdesc in an example such
as this one. IMO, it ought to produce the query-specified columns in
all cases, even when the function expression doesn't match.

+1. This seems a more principled fix.

I dug around looking at other callers of get_expr_result_type and
get_expr_result_tupdesc, and found a couple other places that aren't
actively buggy but can be made faster and more robust by adding
this same rule.

I wonder why the v2 patch does not apply this same rule in
process_function_rte_ref(), by checking if the 'rtfunc' has a coldeflist
before it calls get_expr_result_tupdesc.

BTW, I wondered why the test cases we added for 2ed8f9a01 didn't
catch this, because they sure look superficially similar. The
reason seems to be that we fail because this query produces a join
plan in which we invoke build_physical_tlist for the FunctionScan,
and that's what's calling expandRTE and hitting the assertion.
The older test cases get simplified sufficiently that there's no
join but just a FunctionScan, and that plan node will be required
to produce the query-specified tlist not a physical tlist, so we
accidentally avoid reaching the assertion.

Exactly.

Thanks
Richard

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#4)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

Richard Guo <guofenglinux@gmail.com> writes:

I wonder why the v2 patch does not apply this same rule in
process_function_rte_ref(), by checking if the 'rtfunc' has a coldeflist
before it calls get_expr_result_tupdesc.

I guess we could. It won't matter because the following code will
reject RECORD in any case; but we could save a few cycles by not
calling get_expr_result_tupdesc there.

regards, tom lane

#6Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#5)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

On Wed, Apr 10, 2024 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

I wonder why the v2 patch does not apply this same rule in
process_function_rte_ref(), by checking if the 'rtfunc' has a coldeflist
before it calls get_expr_result_tupdesc.

I guess we could. It won't matter because the following code will
reject RECORD in any case; but we could save a few cycles by not
calling get_expr_result_tupdesc there.

Indeed. I think it would be better to add this same rule to
process_function_rte_ref(), or at least write some comments there to
explain why it is not necessary to check rtfunc->funccolnames while
other places do.

Thanks
Richard

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#6)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

Richard Guo <guofenglinux@gmail.com> writes:

On Wed, Apr 10, 2024 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I guess we could. It won't matter because the following code will
reject RECORD in any case; but we could save a few cycles by not
calling get_expr_result_tupdesc there.

Indeed. I think it would be better to add this same rule to
process_function_rte_ref(), or at least write some comments there to
explain why it is not necessary to check rtfunc->funccolnames while
other places do.

Wilco. Another thing I was considering, but didn't pull the trigger
on in the draft patch, was to introduce a funcapi.c function on the
order of
get_expr_result_rtfunc(RangeTblFunction *rtfunc, ...)
which would encapsulate applying either BuildDescFromLists or
get_expr_result_type. I didn't do it because I found that the
only places that would want to use it are the two that are correct
already; the places we still need to fix have short-cuts they can
take rather than building an intermediate tupdesc. (The present
bug could be summarized as "the short-cuts still need to pay
attention to the coldeflist".) But the advantage of doing this
anyway is that its header comment would be a natural place to
document this issue and policy. Thoughts?

regards, tom lane

#8Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#7)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

On Wed, Apr 10, 2024 at 10:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Wilco. Another thing I was considering, but didn't pull the trigger
on in the draft patch, was to introduce a funcapi.c function on the
order of
get_expr_result_rtfunc(RangeTblFunction *rtfunc, ...)
which would encapsulate applying either BuildDescFromLists or
get_expr_result_type. I didn't do it because I found that the
only places that would want to use it are the two that are correct
already; the places we still need to fix have short-cuts they can
take rather than building an intermediate tupdesc. (The present
bug could be summarized as "the short-cuts still need to pay
attention to the coldeflist".) But the advantage of doing this
anyway is that its header comment would be a natural place to
document this issue and policy. Thoughts?

Do you think we can have a parameter in the new get_expr_result_rtfunc()
function to indicate whether we want to build an intermediate tupdesc
when we have a coldeflist? Then we can set it to true in the two places
that are correct already, and set it to false at the places we need to
fix. But I'm not sure if including such a new parameter would be an
improvement or just make it worse.

Thanks
Richard

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#8)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

Richard Guo <guofenglinux@gmail.com> writes:

On Wed, Apr 10, 2024 at 10:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Wilco. Another thing I was considering, but didn't pull the trigger
on in the draft patch, was to introduce a funcapi.c function on the
order of
get_expr_result_rtfunc(RangeTblFunction *rtfunc, ...)
which would encapsulate applying either BuildDescFromLists or
get_expr_result_type.

Do you think we can have a parameter in the new get_expr_result_rtfunc()
function to indicate whether we want to build an intermediate tupdesc
when we have a coldeflist? Then we can set it to true in the two places
that are correct already, and set it to false at the places we need to
fix. But I'm not sure if including such a new parameter would be an
improvement or just make it worse.

I did think about that, but it seems mighty weird. The semantics of
the flag would have to be something like "I want a tupdesc when the
result type is COMPOSITE, but not when it's RECORD", which seems
rather arbitrary.

Perhaps it'd be sufficient to add a note to the header comment of
get_expr_result_type warning about when not to use it.

regards, tom lane

#10Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#9)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

On Thu, Apr 11, 2024 at 10:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

Do you think we can have a parameter in the new get_expr_result_rtfunc()
function to indicate whether we want to build an intermediate tupdesc
when we have a coldeflist? Then we can set it to true in the two places
that are correct already, and set it to false at the places we need to
fix. But I'm not sure if including such a new parameter would be an
improvement or just make it worse.

I did think about that, but it seems mighty weird. The semantics of
the flag would have to be something like "I want a tupdesc when the
result type is COMPOSITE, but not when it's RECORD", which seems
rather arbitrary.

Indeed.

Perhaps it'd be sufficient to add a note to the header comment of
get_expr_result_type warning about when not to use it.

Works for me.

Thanks
Richard

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#10)
Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

Richard Guo <guofenglinux@gmail.com> writes:

On Thu, Apr 11, 2024 at 10:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I did think about that, but it seems mighty weird. The semantics of
the flag would have to be something like "I want a tupdesc when the
result type is COMPOSITE, but not when it's RECORD", which seems
rather arbitrary.

Indeed.

I did actually spend some time today trying to code that up, to see
what it would look like. The attempt failed though, because there
are existing cases in which get_expr_result_type() returns
TYPEFUNC_RECORD, so we can't use that result code as a positive
indicator that "this RTE has a coldeflist". In a green field
we could invent another TYPEFUNC_XXX code, but that's not a
back-patchable idea. So the new function would need some independent
indicator that it saw a coldeflist, and at that point there's
basically nothing we're hiding from the callers.

Perhaps it'd be sufficient to add a note to the header comment of
get_expr_result_type warning about when not to use it.

Works for me.

OK, barring other objections I'll push forward on that basis.

regards, tom lane