BUG #19037: Planner fails on estimating array length with "no relation entry" error
The following bug has been logged on the website:
Bug reference: 19037
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 18beta3
Operating system: Ubuntu 24.04
Description:
The following script:
create table t(ia int[]);
select exists (select 1 from (select 1) where case when b then 1 else 0 end
= 1)
from (select 1 = any(ia) as b from t);
triggers:
ERROR: XX000: no relation entry for relid 2
LOCATION: find_base_rel, relnode.c:426
2025-08-30 11:28:26.036 UTC [2334676] LOG: statement: select exists (select
1 from (select 1) where case when b then 1 else 0 end = 1)
from (select 1 = any(ia) as b from t);
2025-08-30 11:28:26.038 UTC [2334676] ERROR: no relation entry for relid 2
2025-08-30 11:28:26.038 UTC [2334676] BACKTRACE:
find_base_rel at relnode.c:426:2
examine_variable at selfuncs.c:5319:16
estimate_array_length at selfuncs.c:2173:7
cost_qual_eval_walker at costsize.c:4874:25
expression_tree_walker_impl at nodeFuncs.c:2304:9
cost_qual_eval_walker at costsize.c:5043:1
expression_tree_walker_impl at nodeFuncs.c:2534:8
expression_tree_walker_impl at nodeFuncs.c:2215:8
cost_qual_eval_walker at costsize.c:5043:1
...
(Discovered with SQLsmith.)
Reproduced starting from 9391f7152.
PG Bug reporting form <noreply@postgresql.org> writes:
The following script:
create table t(ia int[]);
select exists (select 1 from (select 1) where case when b then 1 else 0 end
= 1)
from (select 1 = any(ia) as b from t);
triggers:
ERROR: XX000: no relation entry for relid 2
LOCATION: find_base_rel, relnode.c:426
Thanks for the report. After a bit of experimentation, I can shorten
the reproducer to
select exists (select 1 where (1 = any(ia))::int = 1) from t;
ERROR: no relation entry for relid 1
but it doesn't happen any more if you simplify further to
select exists (select 1 where 1 = any(ia)) from t;
The reason for the difference seems to be that make_subplan
checks to see if the EXISTS can be converted to a hashable ANY
subplan (cf. convert_EXISTS_to_ANY), and the form where there's
a top-level "=" operator in the sub-select's WHERE clause can
be so converted. Then we hit the failure while trying to do
cost_qual_eval on the converted ANY expression. So you also
get this failure if you manually write out the form that
convert_EXISTS_to_ANY is generating:
select ((1 = any(ia))::int) = any (select 1) from t;
ERROR: no relation entry for relid 1
Anyway that's sort of a sideshow. The real issue here is that
we're applying cost_qual_eval before the planner has created
any RelOptInfos, which means that examine_variable won't work.
I find it surprising that this is the first report of such trouble,
because it certainly isn't obvious that cost_qual_eval shouldn't
be allowed to consult statistics.
The most expedient solution is probably to hack examine_variable
so that it doesn't fail if root->simple_rel_array isn't there yet.
That seems mighty ugly though.
Another low-risk response could be to revert 9391f7152. But I don't
care for that because it's not really addressing the underlying
problem. We might have the same issue elsewhere in cost estimation
already, and even if we don't, it would be quite likely we'd introduce
it again in future.
In some sense the "right" fix would be to do SubPlan generation
later, when we have statistics available for the Vars of the
parent query. But that seems like a rather large task, and
we'd surely not wish to back-patch the results.
So I'm not really seeing another workable answer besides hacking
examine_variable, more or less as attached.
regards, tom lane
Attachments:
v1-dont-fail-in-early-examine_variable.patchtext/x-diff; charset=us-ascii; name=v1-dont-fail-in-early-examine_variable.patchDownload+14-0
On Sun, Aug 31, 2025 at 6:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The most expedient solution is probably to hack examine_variable
so that it doesn't fail if root->simple_rel_array isn't there yet.
That seems mighty ugly though.
As an alternative, I wonder if we could pass root as NULL to
cost_subplan() when it's called from build_subplan(), and possibly
also from SS_process_ctes(). At those points, the root does not yet
contain enough information to safely consult statistics. Meanwhile,
cost_qual_eval() and the functions it calls are already well-equipped
to handle a NULL root.
- Richard
Richard Guo <guofenglinux@gmail.com> writes:
On Sun, Aug 31, 2025 at 6:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The most expedient solution is probably to hack examine_variable
so that it doesn't fail if root->simple_rel_array isn't there yet.
That seems mighty ugly though.
As an alternative, I wonder if we could pass root as NULL to
cost_subplan() when it's called from build_subplan(), and possibly
also from SS_process_ctes(). At those points, the root does not yet
contain enough information to safely consult statistics. Meanwhile,
cost_qual_eval() and the functions it calls are already well-equipped
to handle a NULL root.
Good suggestion, that does seem more in keeping with established
hacks^H^H^Hpractice. It'll need some commentary about why.
I'm slightly tempted to do the dirty work in cost_subplan() itself,
ie pass a NULL down to cost_qual_eval from there. Right now it
doesn't matter, but maybe in future there would be some other
safer use for the parent root in cost_subplan()? This approach
would force the same behavior for the third caller of cost_subplan,
SS_make_initplan_from_plan. But I don't think it matters there,
since an initplan will by definition not be consulting anything from
the parent plan.
regards, tom lane
On Mon, Sep 1, 2025 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm slightly tempted to do the dirty work in cost_subplan() itself,
ie pass a NULL down to cost_qual_eval from there. Right now it
doesn't matter, but maybe in future there would be some other
safer use for the parent root in cost_subplan()?
Fair point.
This approach
would force the same behavior for the third caller of cost_subplan,
SS_make_initplan_from_plan. But I don't think it matters there,
since an initplan will by definition not be consulting anything from
the parent plan.
Yeah, it should be safe to use a NULL root for initplans. Besides, in
the case of SS_make_initplan_from_plan, testexpr will be NULL, and
cost_qual_eval() becomes a no-op.
Here is the patch that passes root as NULL to cost_qual_eval() in
cost_subplan(), along with some commentary about why.
- Richard
Attachments:
v2-0001-Fix-planner-error-when-estimating-SubPlan-cost.patchapplication/octet-stream; name=v2-0001-Fix-planner-error-when-estimating-SubPlan-cost.patchDownload+44-3
Richard Guo <guofenglinux@gmail.com> writes:
Here is the patch that passes root as NULL to cost_qual_eval() in
cost_subplan(), along with some commentary about why.
v2 LGTM. I did think of a way to make the test case a little
less busy-looking:
select (1 = any(array_agg(f1))) = any (select false) from int4_tbl;
but that's just cosmetic.
regards, tom lane
On Wed, Sep 3, 2025 at 12:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
Here is the patch that passes root as NULL to cost_qual_eval() in
cost_subplan(), along with some commentary about why.
v2 LGTM. I did think of a way to make the test case a little
less busy-looking:select (1 = any(array_agg(f1))) = any (select false) from int4_tbl;
but that's just cosmetic.
Thanks, I'll take this test case.
One thing I'm not quite sure about is whether we should backpatch this
fix to pre-v17 branches. Prior to v17, estimate_array_length() wasn't
taught to use statistics, so this error isn't reproducible there.
OTOH, passing a root without a valid simple_rel_array to
cost_qual_eval() still seems potentially unsafe. What do you think?
- Richard
Richard Guo <guofenglinux@gmail.com> writes:
One thing I'm not quite sure about is whether we should backpatch this
fix to pre-v17 branches. Prior to v17, estimate_array_length() wasn't
taught to use statistics, so this error isn't reproducible there.
OTOH, passing a root without a valid simple_rel_array to
cost_qual_eval() still seems potentially unsafe. What do you think?
Yeah, "is there any other instance of this problem?" is the $64
question here. I was initially thinking v17 is sufficient, but
the possibility that some extension might be vulnerable makes
me lean to back-patching further. Your call ...
regards, tom lane
On Wed, Sep 3, 2025 at 10:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
One thing I'm not quite sure about is whether we should backpatch this
fix to pre-v17 branches. Prior to v17, estimate_array_length() wasn't
taught to use statistics, so this error isn't reproducible there.
OTOH, passing a root without a valid simple_rel_array to
cost_qual_eval() still seems potentially unsafe. What do you think?
Yeah, "is there any other instance of this problem?" is the $64
question here. I was initially thinking v17 is sufficient, but
the possibility that some extension might be vulnerable makes
me lean to back-patching further. Your call ...
I've decided to backpatch this fix to pre-v17 branches: using a root
that lacks a valid simple_rel_array in a cost estimation function
seems like a pitfall waiting to happen. I've included an explanation
of this backpatch decision in the commit message and have pushed the
fix (the test case is not included in pre-v17 branches though).
- Richard
On 30/8/2025 23:51, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
The following script:
In some sense the "right" fix would be to do SubPlan generation
later, when we have statistics available for the Vars of the
parent query. But that seems like a rather large task, and
we'd surely not wish to back-patch the results.
I'd vote to go this direction in the next version.
Implementing memoisation of subplans to reduce the number of subplan
evaluations, I encountered the same problem: the number of groups in the
parameter set can't be estimated during subplan planning.
I resolved this issue by employing upper_paths_hook to decide the
addition of Memoise nodes into subplans at the end of upper query planning.
But it is not an in-core solution and has a flaw with multi-level
references.
So, it would be better to prepare as much PlannerInfo's info, as
possible (base rels, maybe query tree?) before filling RelOptInfos with
potential paths.
--
regards, Andrei Lepikhov
Hello Tom and Richard,
03.09.2025 11:08, Richard Guo wrote:
Yeah, "is there any other instance of this problem?" is the $64
question here. I was initially thinking v17 is sufficient, but
the possibility that some extension might be vulnerable makes
me lean to back-patching further. Your call ...I've decided to backpatch this fix to pre-v17 branches: using a root
that lacks a valid simple_rel_array in a cost estimation function
seems like a pitfall waiting to happen. I've included an explanation
of this backpatch decision in the commit message and have pushed the
fix (the test case is not included in pre-v17 branches though).
Thank you for fixing that anomaly!
Best regards,
Alexander