BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Started by PG Bug reporting form6 months ago36 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 19099
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 18.0
Operating system: Ubuntu 24.04
Description:

The following script:
CREATE EXTENSION file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE TABLE pt (a int, b text) partition by list (a);
CREATE FOREIGN TABLE p1 partition of pt for values in (1) SERVER file_server
OPTIONS (format 'csv', filename '/tmp/1.csv');
SET enable_partition_pruning = 'off';
EXPLAIN DELETE FROM pt WHERE false;

raises:
ERROR: XX000: could not find junk ctid column
LOCATION: ExecInitModifyTable, nodeModifyTable.c:4867
(Discovered with SQLsmith.)

Reproduced starting from 86dc9005.

On 86dc9005~1 or with enable_partition_pruning = 'on', EXPLAIN outputs the
query plan and "DELETE FROM pt WHERE false;" completes with no error.

#2jian he
jian.universality@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, Oct 30, 2025 at 9:02 AM PG Bug reporting form
<noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 19099
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 18.0
Operating system: Ubuntu 24.04
Description:

The following script:
CREATE EXTENSION file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE TABLE pt (a int, b text) partition by list (a);
CREATE FOREIGN TABLE p1 partition of pt for values in (1) SERVER file_server
OPTIONS (format 'csv', filename '/tmp/1.csv');
SET enable_partition_pruning = 'off';
EXPLAIN DELETE FROM pt WHERE false;

raises:
ERROR: XX000: could not find junk ctid column
LOCATION: ExecInitModifyTable, nodeModifyTable.c:4867
(Discovered with SQLsmith.)

Reproduced starting from 86dc9005.

On 86dc9005~1 or with enable_partition_pruning = 'on', EXPLAIN outputs the
query plan and "DELETE FROM pt WHERE false;" completes with no error.

we can add a postgresAddForeignUpdateTargets(postgres_fdw) equivalent
function for file_fdw even though we do not support UPDATE/DELETE in file_fdw.

Attachments:

v1-0001-file_fdw-may-need-AddForeignUpdateTargets.patchtext/x-patch; charset=US-ASCII; name=v1-0001-file_fdw-may-need-AddForeignUpdateTargets.patchDownload+38-1
#3Tender Wang
tndrwang@gmail.com
In reply to: jian he (#2)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

jian he <jian.universality@gmail.com> 于2025年10月30日周四 12:07写道:

On Thu, Oct 30, 2025 at 9:02 AM PG Bug reporting form
<noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 19099
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 18.0
Operating system: Ubuntu 24.04
Description:

The following script:
CREATE EXTENSION file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE TABLE pt (a int, b text) partition by list (a);
CREATE FOREIGN TABLE p1 partition of pt for values in (1) SERVER

file_server

OPTIONS (format 'csv', filename '/tmp/1.csv');
SET enable_partition_pruning = 'off';
EXPLAIN DELETE FROM pt WHERE false;

raises:
ERROR: XX000: could not find junk ctid column
LOCATION: ExecInitModifyTable, nodeModifyTable.c:4867
(Discovered with SQLsmith.)

Reproduced starting from 86dc9005.

On 86dc9005~1 or with enable_partition_pruning = 'on', EXPLAIN outputs

the

query plan and "DELETE FROM pt WHERE false;" completes with no error.

we can add a postgresAddForeignUpdateTargets(postgres_fdw) equivalent
function for file_fdw even though we do not support UPDATE/DELETE in
file_fdw.

After applying your patch, I got a different output if I enable verbose in
EXPLAIN:
postgres=# EXPLAIN verbose DELETE FROM pt WHERE false;
QUERY PLAN
-------------------------------------------------------
Delete on public.pt (cost=0.00..0.00 rows=0 width=0)
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: ctid
Replaces: Scan on pt
One-Time Filter: false
(5 rows)

postgres=# set enable_partition_pruning = 'off';
SET
postgres=# EXPLAIN verbose DELETE FROM pt WHERE false;
QUERY PLAN
-------------------------------------------------------
Delete on public.pt (cost=0.00..0.00 rows=0 width=0)
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: NULL::oid, NULL::tid
Replaces: Scan on pt
One-Time Filter: false
(5 rows)

Output: ctid (enable_partition_pruning = on)
vs
Output: NULL::oid, NULL::tid(enable_partition_pruning = off)

I try add childrte->relkind != RELKIND_PARTITIONED_TABLE
&& childrte->relkind != RELKIND_FOREIGN_TABLE)
to avoid adding "tableoid" for foreign table
in expand_single_inheritance_child().
It works, but the file_fdw regression test failed.

I added Tom and Amit to the cc list.
Any thoughts?
--
Thanks,
Tender Wang

#4Kirill Reshke
reshkekirill@gmail.com
In reply to: Tender Wang (#3)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, 30 Oct 2025 at 09:41, Tender Wang <tndrwang@gmail.com> wrote:

jian he <jian.universality@gmail.com> 于2025年10月30日周四 12:07写道:

On Thu, Oct 30, 2025 at 9:02 AM PG Bug reporting form
<noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 19099
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 18.0
Operating system: Ubuntu 24.04
Description:

The following script:
CREATE EXTENSION file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE TABLE pt (a int, b text) partition by list (a);
CREATE FOREIGN TABLE p1 partition of pt for values in (1) SERVER file_server
OPTIONS (format 'csv', filename '/tmp/1.csv');
SET enable_partition_pruning = 'off';
EXPLAIN DELETE FROM pt WHERE false;

raises:
ERROR: XX000: could not find junk ctid column
LOCATION: ExecInitModifyTable, nodeModifyTable.c:4867
(Discovered with SQLsmith.)

Reproduced starting from 86dc9005.

On 86dc9005~1 or with enable_partition_pruning = 'on', EXPLAIN outputs the
query plan and "DELETE FROM pt WHERE false;" completes with no error.

we can add a postgresAddForeignUpdateTargets(postgres_fdw) equivalent
function for file_fdw even though we do not support UPDATE/DELETE in file_fdw.

After applying your patch, I got a different output if I enable verbose in EXPLAIN:
postgres=# EXPLAIN verbose DELETE FROM pt WHERE false;
QUERY PLAN
-------------------------------------------------------
Delete on public.pt (cost=0.00..0.00 rows=0 width=0)
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: ctid
Replaces: Scan on pt
One-Time Filter: false
(5 rows)

postgres=# set enable_partition_pruning = 'off';
SET
postgres=# EXPLAIN verbose DELETE FROM pt WHERE false;
QUERY PLAN
-------------------------------------------------------
Delete on public.pt (cost=0.00..0.00 rows=0 width=0)
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: NULL::oid, NULL::tid
Replaces: Scan on pt
One-Time Filter: false
(5 rows)

Output: ctid (enable_partition_pruning = on)
vs
Output: NULL::oid, NULL::tid(enable_partition_pruning = off)

I try add childrte->relkind != RELKIND_PARTITIONED_TABLE && childrte->relkind != RELKIND_FOREIGN_TABLE)
to avoid adding "tableoid" for foreign table in expand_single_inheritance_child().
It works, but the file_fdw regression test failed.

I added Tom and Amit to the cc list.
Any thoughts?
--
Thanks,
Tender Wang

Hi!
Jian's fix WFM, I confirm 'EXPLAIN DELETE FROM pt WHERE false' now
works. Should we add this test case to the regression suite of
file_fdw?

But I also wonder if Jian's fix fixed the right thing. Should we
instead fail in the planning phase with a more user-friendly error
message? This will be a regression though, because 'DELETE FROM
file_fdw_table WHERE false' will no longer work...

As for EXPLAIN VERBOSE output, are they both confusing? Both for
enable_partition_pruning=on and enable_partition_pruning=off? I mean,
file_fdw does not have semantics of neither ctid nor tid?

--
Best regards,
Kirill Reshke

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tender Wang (#3)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Tender Wang <tndrwang@gmail.com> writes:

I added Tom and Amit to the cc list.
Any thoughts?

I'm having a hard time getting super excited about this. file_fdw
does not support DELETE -- it provides no ExecForeignDelete method --
which is why you get this:

regression=# DELETE FROM pt;
ERROR: cannot delete from foreign table "p1"

It's surely pretty accidental (and arguably not desirable)
if "DELETE FROM pt WHERE false" doesn't fail the same way.

Now, I agree that it's not great if you instead get an
internal error like "could not find junk ctid column".
But that smells to me like error checks being applied in
the wrong order rather than something fundamentally wrong.

I didn't look at the proposed patch yet.

regards, tom lane

#6Kirill Reshke
reshkekirill@gmail.com
In reply to: Tom Lane (#5)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, 30 Oct 2025 at 10:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's surely pretty accidental (and arguably not desirable)
if "DELETE FROM pt WHERE false" doesn't fail the same way.

Now, I agree that it's not great if you instead get an
internal error like "could not find junk ctid column".
But that smells to me like error checks being applied in
the wrong order rather than something fundamentally wrong.

I didn't look at the proposed patch yet.

regards, tom lane

I wrote:

But I also wonder if Jian's fix fixed the right thing. Should we
instead fail in the planning phase with a more user-friendly error
message? This will be a regression though, because 'DELETE FROM
file_fdw_table WHERE false' will no longer work...

On the second thought, I doubt anyone will get unhappy with 'DELETE
FROM file_fdw_table WHERE false' stop working

Alexander wrote:

On 86dc9005~1 or with enable_partition_pruning = 'on', EXPLAIN outputs the
query plan and "DELETE FROM pt WHERE false;" completes with no error.

So, behaviour was wrong both before and after 86dc9005, just in different ways.

On head, we get an error about junk columns, because without partition
pruning, we derive the result relation as 'pt', not its partition
'p1', which is correct I believe. But with 'p1' as result relation,
we (correctly) error out in ExecInitModifyTable while with 'pt' we
don't.

So, error checks are applied, order is not wrong, but rather checks
are not full enough? I mean, we I believe we need to execute
CheckValidResultRel against all partitions in ExecInitModifyTable, at
least when no partition pruning has been performed

--
Best regards,
Kirill Reshke

#7Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#6)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, 30 Oct 2025 at 10:31, I wrote:

I mean, we I believe we need to execute
CheckValidResultRel against all partitions in ExecInitModifyTable, at
least when no partition pruning has been performed

So, the problem is that we managed to exclude all child relations, and
only have a single (dummy) root relation as a result of the
modifyTable plan. Maybe we should populate its target list with
pseudo-junk columns in create_modifytable_plan ?

For instance, they query does not error-out if we have at least one
another non-file-fdw partition:

create table p2 partition of pt for values in ( 2) ;

this is because we have this in create_modifytable_plan

```
/* Transfer resname/resjunk labeling, too, to keep executor happy */
apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
```

and we successfully found a junk column in the p2 partition.

The problem is, it works iff root->processed_tlist has at least one
relation which can give us junk columns. Should we add handling for
corner case here?
Another option is to remove this 'Transfer resname/resjunk labeling'
completely and rework planner-executer contracts somehow.

--
Best regards,
Kirill Reshke

#8Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kirill Reshke (#7)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Hi,

On Thu, Oct 30, 2025 at 3:44 PM Kirill Reshke <reshkekirill@gmail.com> wrote:

On Thu, 30 Oct 2025 at 10:31, I wrote:

I mean, we I believe we need to execute
CheckValidResultRel against all partitions in ExecInitModifyTable, at
least when no partition pruning has been performed

So, the problem is that we managed to exclude all child relations, and
only have a single (dummy) root relation as a result of the
modifyTable plan. Maybe we should populate its target list with
pseudo-junk columns in create_modifytable_plan ?

For instance, they query does not error-out if we have at least one
another non-file-fdw partition:

create table p2 partition of pt for values in ( 2) ;

this is because we have this in create_modifytable_plan

```
/* Transfer resname/resjunk labeling, too, to keep executor happy */
apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
```

and we successfully found a junk column in the p2 partition.

The problem is, it works iff root->processed_tlist has at least one
relation which can give us junk columns. Should we add handling for
corner case here?
Another option is to remove this 'Transfer resname/resjunk labeling'
completely and rework planner-executer contracts somehow.

I am not really sure if we should play with the planner code.

I suspect the real issue is that we’re assuming partitioned tables
always need a ctid, which wasn’t true before MERGE started using the
root ResultRelInfo. In fact, the old code already looked wrong -- it’s
been requiring a ctid even for partitioned tables where that was never
necessary. We can fix this by only requiring the junk ctid when we
actually operate through the root partitioned table, that is, for
MERGE. Like the attached.

--
Thanks, Amit Langote

Attachments:

partitioned-table-ctid-check.diffapplication/octet-stream; name=partitioned-table-ctid-check.diffDownload+15-3
#9Kirill Reshke
reshkekirill@gmail.com
In reply to: Amit Langote (#8)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, 30 Oct 2025 at 13:29, Amit Langote <amitlangote09@gmail.com> wrote:

Hi,

On Thu, Oct 30, 2025 at 3:44 PM Kirill Reshke <reshkekirill@gmail.com> wrote:

On Thu, 30 Oct 2025 at 10:31, I wrote:

I mean, we I believe we need to execute
CheckValidResultRel against all partitions in ExecInitModifyTable, at
least when no partition pruning has been performed

So, the problem is that we managed to exclude all child relations, and
only have a single (dummy) root relation as a result of the
modifyTable plan. Maybe we should populate its target list with
pseudo-junk columns in create_modifytable_plan ?

For instance, they query does not error-out if we have at least one
another non-file-fdw partition:

create table p2 partition of pt for values in ( 2) ;

this is because we have this in create_modifytable_plan

```
/* Transfer resname/resjunk labeling, too, to keep executor happy */
apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
```

and we successfully found a junk column in the p2 partition.

The problem is, it works iff root->processed_tlist has at least one
relation which can give us junk columns. Should we add handling for
corner case here?
Another option is to remove this 'Transfer resname/resjunk labeling'
completely and rework planner-executer contracts somehow.

I am not really sure if we should play with the planner code.

I suspect the real issue is that we’re assuming partitioned tables
always need a ctid, which wasn’t true before MERGE started using the
root ResultRelInfo. In fact, the old code already looked wrong -- it’s
been requiring a ctid even for partitioned tables where that was never
necessary. We can fix this by only requiring the junk ctid when we
actually operate through the root partitioned table, that is, for
MERGE. Like the attached.

--
Thanks, Amit Langote

Hi! Thanks for the patch. I can see your points, however I am unsure
if this is the most right thing to do.
As per ab5fcf2b04f9 commit message and
src/backend/optimizer/plan/planner.c comments, I am under impression
that the postgres-way of fixing that would allow for
ExecInitModifyTable to operate with a NULL result relation list.
And, in any case, I am still unsure if we should allow the 'DELETE'
statement from Alexander's repro to successfully execute, which yout
patch still does

--
Best regards,
Kirill Reshke

#10Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kirill Reshke (#9)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, Oct 30, 2025 at 5:55 PM Kirill Reshke <reshkekirill@gmail.com> wrote:

On Thu, 30 Oct 2025 at 13:29, Amit Langote <amitlangote09@gmail.com> wrote:

Hi,

On Thu, Oct 30, 2025 at 3:44 PM Kirill Reshke <reshkekirill@gmail.com> wrote:

On Thu, 30 Oct 2025 at 10:31, I wrote:

I mean, we I believe we need to execute
CheckValidResultRel against all partitions in ExecInitModifyTable, at
least when no partition pruning has been performed

So, the problem is that we managed to exclude all child relations, and
only have a single (dummy) root relation as a result of the
modifyTable plan. Maybe we should populate its target list with
pseudo-junk columns in create_modifytable_plan ?

For instance, they query does not error-out if we have at least one
another non-file-fdw partition:

create table p2 partition of pt for values in ( 2) ;

this is because we have this in create_modifytable_plan

```
/* Transfer resname/resjunk labeling, too, to keep executor happy */
apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
```

and we successfully found a junk column in the p2 partition.

The problem is, it works iff root->processed_tlist has at least one
relation which can give us junk columns. Should we add handling for
corner case here?
Another option is to remove this 'Transfer resname/resjunk labeling'
completely and rework planner-executer contracts somehow.

I am not really sure if we should play with the planner code.

I suspect the real issue is that we’re assuming partitioned tables
always need a ctid, which wasn’t true before MERGE started using the
root ResultRelInfo. In fact, the old code already looked wrong -- it’s
been requiring a ctid even for partitioned tables where that was never
necessary. We can fix this by only requiring the junk ctid when we
actually operate through the root partitioned table, that is, for
MERGE. Like the attached.

--
Thanks, Amit Langote

Hi! Thanks for the patch. I can see your points, however I am unsure
if this is the most right thing to do.
As per ab5fcf2b04f9 commit message and
src/backend/optimizer/plan/planner.c comments, I am under impression
that the postgres-way of fixing that would allow for
ExecInitModifyTable to operate with a NULL result relation list.

Isn't that what happens with my patch?

And, in any case, I am still unsure if we should allow the 'DELETE'
statement from Alexander's repro to successfully execute, which yout
patch still does

What behavior do you propose in that case? The WHERE false part makes
the plan a dummy ModifyTable on the root partitioned table pt (per
ab5fcf2b0 I guess), and there’s nothing left in the plan that can be
flagged at execution time; the error Alexander reported is a bug we're
trying to fix. Are you suggesting instead that the attempt to plan
DELETE on the file_fdw partition -- or any foreign table that doesn’t
support DELETE -- should be prevented?

--
Thanks, Amit Langote

#11Kirill Reshke
reshkekirill@gmail.com
In reply to: Amit Langote (#10)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, 30 Oct 2025, 14:18 Amit Langote, <amitlangote09@gmail.com> wrote:

On Thu, Oct 30, 2025 at 5:55 PM Kirill Reshke <reshkekirill@gmail.com>
wrote:

On Thu, 30 Oct 2025 at 13:29, Amit Langote <amitlangote09@gmail.com>

wrote:

Hi,

On Thu, Oct 30, 2025 at 3:44 PM Kirill Reshke <reshkekirill@gmail.com>

wrote:

On Thu, 30 Oct 2025 at 10:31, I wrote:

I mean, we I believe we need to execute
CheckValidResultRel against all partitions in ExecInitModifyTable,

at

least when no partition pruning has been performed

So, the problem is that we managed to exclude all child relations,

and

only have a single (dummy) root relation as a result of the
modifyTable plan. Maybe we should populate its target list with
pseudo-junk columns in create_modifytable_plan ?

For instance, they query does not error-out if we have at least one
another non-file-fdw partition:

create table p2 partition of pt for values in ( 2) ;

this is because we have this in create_modifytable_plan

```
/* Transfer resname/resjunk labeling, too, to keep executor happy */
apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
```

and we successfully found a junk column in the p2 partition.

The problem is, it works iff root->processed_tlist has at least one
relation which can give us junk columns. Should we add handling for
corner case here?
Another option is to remove this 'Transfer resname/resjunk labeling'
completely and rework planner-executer contracts somehow.

I am not really sure if we should play with the planner code.

I suspect the real issue is that we’re assuming partitioned tables
always need a ctid, which wasn’t true before MERGE started using the
root ResultRelInfo. In fact, the old code already looked wrong -- it’s
been requiring a ctid even for partitioned tables where that was never
necessary. We can fix this by only requiring the junk ctid when we
actually operate through the root partitioned table, that is, for
MERGE. Like the attached.

--
Thanks, Amit Langote

Hi! Thanks for the patch. I can see your points, however I am unsure
if this is the most right thing to do.
As per ab5fcf2b04f9 commit message and
src/backend/optimizer/plan/planner.c comments, I am under impression
that the postgres-way of fixing that would allow for
ExecInitModifyTable to operate with a NULL result relation list.

Isn't that what happens with my patch?

And, in any case, I am still unsure if we should allow the 'DELETE'
statement from Alexander's repro to successfully execute, which yout
patch still does

What behavior do you propose in that case? The WHERE false part makes
the plan a dummy ModifyTable on the root partitioned table pt (per
ab5fcf2b0 I guess), and there’s nothing left in the plan that can be
flagged at execution time; the error Alexander reported is a bug we're
trying to fix. Are you suggesting instead that the attempt to plan
DELETE on the file_fdw partition -- or any foreign table that doesn’t
support DELETE -- should be prevented?

--
Thanks, Amit Langote

Okay, after putting more thought on it, I think your fix is OK. WFM, LGTM

Show quoted text
#12Tender Wang
tndrwang@gmail.com
In reply to: Amit Langote (#8)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Amit Langote <amitlangote09@gmail.com> 于2025年10月30日周四 16:29写道:

I am not really sure if we should play with the planner code.

I suspect the real issue is that we’re assuming partitioned tables
always need a ctid, which wasn’t true before MERGE started using the
root ResultRelInfo. In fact, the old code already looked wrong -- it’s
been requiring a ctid even for partitioned tables where that was never
necessary. We can fix this by only requiring the junk ctid when we
actually operate through the root partitioned table, that is, for
MERGE. Like the attached.

With your patch, this issue didn't happen again.
But I still get a different output when I enable verbose in EXPLAIN,

Output: ctid (enable_partition_pruning = on)
vs
Output: NULL::oid(enable_partition_pruning = off)

From the user's perspective, it's a bit confusing.
I agree more with Tom’s opinion — we should throw an error like "cannot
delete from foreign table p1"
But the plan only had a dummy root relation; CheckValidResultRel() doesn't
work.
Some other code place may need to do something.

--
Thanks,
Tender Wang

#13Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tender Wang (#12)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, Oct 30, 2025 at 8:08 PM Tender Wang <tndrwang@gmail.com> wrote:

Amit Langote <amitlangote09@gmail.com> 于2025年10月30日周四 16:29写道:

I am not really sure if we should play with the planner code.

I suspect the real issue is that we’re assuming partitioned tables
always need a ctid, which wasn’t true before MERGE started using the
root ResultRelInfo. In fact, the old code already looked wrong -- it’s
been requiring a ctid even for partitioned tables where that was never
necessary. We can fix this by only requiring the junk ctid when we
actually operate through the root partitioned table, that is, for
MERGE. Like the attached.

With your patch, this issue didn't happen again.
But I still get a different output when I enable verbose in EXPLAIN,

Output: ctid (enable_partition_pruning = on)
vs
Output: NULL::oid(enable_partition_pruning = off)

From the user's perspective, it's a bit confusing.

Hmm, that's perhaps not ideal. That's the row identity var "tableoid"
added by expand_single_inheritance_child():

/*
* If we have any child target relations, assume they all need to
* generate a junk "tableoid" column. (If only one child survives
* pruning, we wouldn't really need this, but it's not worth
* thrashing about to avoid it.)
*/
rrvar = makeVar(childRTindex,
TableOidAttributeNumber,
OIDOID,
-1,
InvalidOid,
0);
add_row_identity_var(root, rrvar, childRTindex, "tableoid");

The WHERE false excludes the child that adds the above var, so you end
up with a NULL in the targetlist because of this part of
set_plan_refs():

/*
* The tlist of a childless Result could contain
* unresolved ROWID_VAR Vars, in case it's representing a
* target relation which is completely empty because of
* constraint exclusion. Replace any such Vars by null
* constants, as though they'd been resolved for a leaf
* scan node that doesn't support them. We could have
* fix_scan_expr do this, but since the case is only
* expected to occur here, it seems safer to special-case
* it here and keep the assertions that ROWID_VARs
* shouldn't be seen by fix_scan_expr.
*
* We also must handle the case where set operations have
* been short-circuited resulting in a dummy Result node.
* prepunion.c uses varno==0 for the set op targetlist.
* See generate_setop_tlist() and generate_setop_tlist().
* Here we rewrite these to use varno==1, which is the
* varno of the first set-op child. Without this, EXPLAIN
* will have trouble displaying targetlists of dummy set
* operations.
*/
foreach(l, splan->plan.targetlist)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
Var *var = (Var *) tle->expr;

if (var && IsA(var, Var))
{
if (var->varno == ROWID_VAR)
tle->expr = (Expr *) makeNullConst(var->vartype,

var->vartypmod,

var->varcollid);

I’m not sure how hard we should try to avoid that kind of confusion in
the EXPLAIN VERBOSE output.

I agree more with Tom’s opinion — we should throw an error like "cannot delete from foreign table p1"
But the plan only had a dummy root relation; CheckValidResultRel() doesn't work.
Some other code place may need to do something.

Yeah, I’m also not sure there’s an obvious place where we could detect
that earlier. Once pruning removes all real children, the planner ends
up with just a dummy root, and by that point there’s no surviving
foreign table ResultRelInfo to check or even any child relation data
structure in the planner.

--
Thanks, Amit Langote

#14Kirill Reshke
reshkekirill@gmail.com
In reply to: Tender Wang (#12)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Hi!

On Thu, 30 Oct 2025 at 16:08, Tender Wang <tndrwang@gmail.com> wrote:

From the user's perspective, it's a bit confusing.
I agree more with Tom’s opinion — we should throw an error like "cannot delete from foreign table p1"
But the plan only had a dummy root relation; CheckValidResultRel() doesn't work.
Some other code place may need to do something.

Tom wrote:

It's surely pretty accidental (and arguably not desirable)
if "DELETE FROM pt WHERE false" doesn't fail the same way.

I cannot prove to myself why failing here is actually desirable. Can
you elaborate?

--
Best regards,
Kirill Reshke

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kirill Reshke (#14)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Kirill Reshke <reshkekirill@gmail.com> writes:

Tom wrote:

It's surely pretty accidental (and arguably not desirable)
if "DELETE FROM pt WHERE false" doesn't fail the same way.

I cannot prove to myself why failing here is actually desirable. Can
you elaborate?

If we throw that failure in some cases but not others, we're exposing
implementation details.

The definition could have been "throw 'cannot delete from foreign
table' only if the query actually attempts to delete some specific
row from some foreign table", but it is not implemented that way.
Instead the error is thrown during query startup if the query has
a foreign table as a potential delete target. Thus, as things stand
today, you might or might not get the error depending on whether
the planner can prove that that partition won't be deleted from.
This is not a great user experience, because we don't (and won't)
make any hard promises about how smart the planner is.

An analogy perhaps is that whether you get a "permission denied"
error about some target table is not conditional on whether the
query actually attempts to delete any rows from it. We go out
of our way to make sure that that happens when required by spec,
even if the planner is able to prove that no delete will happen.

None of this is meant to justify throwing an internal error here;
that's clearly bad. I'm just saying that there would be little
wrong with fixing it by throwing "cannot delete" instead. The user
has no right to expect that that won't happen in a case like this.

regards, tom lane

#16Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#15)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, Oct 30, 2025 at 10:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kirill Reshke <reshkekirill@gmail.com> writes:

Tom wrote:

It's surely pretty accidental (and arguably not desirable)
if "DELETE FROM pt WHERE false" doesn't fail the same way.

I cannot prove to myself why failing here is actually desirable. Can
you elaborate?

If we throw that failure in some cases but not others, we're exposing
implementation details.

The definition could have been "throw 'cannot delete from foreign
table' only if the query actually attempts to delete some specific
row from some foreign table", but it is not implemented that way.
Instead the error is thrown during query startup if the query has
a foreign table as a potential delete target. Thus, as things stand
today, you might or might not get the error depending on whether
the planner can prove that that partition won't be deleted from.
This is not a great user experience, because we don't (and won't)
make any hard promises about how smart the planner is.

An analogy perhaps is that whether you get a "permission denied"
error about some target table is not conditional on whether the
query actually attempts to delete any rows from it. We go out
of our way to make sure that that happens when required by spec,
even if the planner is able to prove that no delete will happen.

None of this is meant to justify throwing an internal error here;
that's clearly bad. I'm just saying that there would be little
wrong with fixing it by throwing "cannot delete" instead. The user
has no right to expect that that won't happen in a case like this.

We might be able to throw the "cannot delete from foreign table" like this:

@@ -987,6 +987,16 @@ add_row_identity_columns(PlannerInfo *root, Index rtindex,

fdwroutine = GetFdwRoutineForRelation(target_relation, false);

+        if (fdwroutine->ExecForeignDelete == NULL)
+            ereport(ERROR,
+                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                     errmsg("cannot delete from foreign table \"%s\"",
+                            RelationGetRelationName(target_relation))));
+        if (fdwroutine->ExecForeignUpdate == NULL)
+            ereport(ERROR,
+                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                     errmsg("cannot update foreign table \"%s\"",
+                            RelationGetRelationName(target_relation))));
         if (fdwroutine->AddForeignUpdateTargets != NULL)
             fdwroutine->AddForeignUpdateTargets(root, rtindex,
                                                 target_rte, target_relation);

but I am not sure how consistent the following is after applying that:

postgres=# set enable_partition_pruning to off;
SET
postgres=# EXPLAIN verbose DELETE FROM pt WHERE false;
ERROR: cannot delete from foreign table "p1"
postgres=# set enable_partition_pruning to on;
SET

-- we don't even hit the foreign table in the planner
postgres=# EXPLAIN verbose DELETE FROM pt WHERE false;
QUERY PLAN
-------------------------------------------------------
Delete on public.pt (cost=0.00..0.00 rows=0 width=0)
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: ctid
Replaces: Scan on pt
One-Time Filter: false
(5 rows)

--
Thanks, Amit Langote

#17David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#15)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Fri, 31 Oct 2025 at 02:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The definition could have been "throw 'cannot delete from foreign
table' only if the query actually attempts to delete some specific
row from some foreign table", but it is not implemented that way.
Instead the error is thrown during query startup if the query has
a foreign table as a potential delete target. Thus, as things stand
today, you might or might not get the error depending on whether
the planner can prove that that partition won't be deleted from.
This is not a great user experience, because we don't (and won't)
make any hard promises about how smart the planner is.

It's a good point, but I doubt we could change this fact as I expect
there are people relying on pruned partitions being excluded from this
check. It seems reasonable that someone might want to do something
like archive ancient time-based partitioned table partitions into
file_fdw stored on a compressed filesystem so that they can still at
least query old data should they need to. If we were to precheck that
all partitions support an UPDATE/DELETE, then it could break workloads
that do updates on recent data in heap-based partitions. Things would
go bad for those people if they switched off partition pruning, but I
doubt that would be the only reason as that would also add a huge
amount of overhead to their SELECT statements.

In any case, the planner is now very efficient at not loading any
metadata for pruned partitions, so I don't see how we'd do this
without adding possibly large overhead to the planner. I'd say we're
well beyond the point of being able to change this now.

David

#18Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#17)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Fri, Oct 31, 2025 at 10:50 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 31 Oct 2025 at 02:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The definition could have been "throw 'cannot delete from foreign
table' only if the query actually attempts to delete some specific
row from some foreign table", but it is not implemented that way.
Instead the error is thrown during query startup if the query has
a foreign table as a potential delete target. Thus, as things stand
today, you might or might not get the error depending on whether
the planner can prove that that partition won't be deleted from.
This is not a great user experience, because we don't (and won't)
make any hard promises about how smart the planner is.

It's a good point, but I doubt we could change this fact as I expect
there are people relying on pruned partitions being excluded from this
check. It seems reasonable that someone might want to do something
like archive ancient time-based partitioned table partitions into
file_fdw stored on a compressed filesystem so that they can still at
least query old data should they need to. If we were to precheck that
all partitions support an UPDATE/DELETE, then it could break workloads
that do updates on recent data in heap-based partitions. Things would
go bad for those people if they switched off partition pruning, but I
doubt that would be the only reason as that would also add a huge
amount of overhead to their SELECT statements.

In any case, the planner is now very efficient at not loading any
metadata for pruned partitions, so I don't see how we'd do this
without adding possibly large overhead to the planner. I'd say we're
well beyond the point of being able to change this now.

I agree that we definitely shouldn’t load metadata for partitions that
are excluded from the plan, especially not just to slightly improve
user experience in this corner case.

I looked at a few options, but none seem non-invasive enough for
back-patching, apart from the first patch I posted. That one makes
ExecInitModifyTable() not require a ctid to be present to set the root
partitioned table’s ri_RowIdAttNo, except in the case of MERGE, which
seems to need it. The corner case that triggers the internal error for
UPDATE/DELETE doesn’t occur for MERGE now and likely won’t when
foreign tables eventually gain MERGE support; don't mark my words
though ;-).

Among those options, I considered the following block, which adds a
ctid for the partitioned root table when it’s the only target in the
query after partition pruning removes all child tables due to the
WHERE false condition in the problematic case:

/*
* Ordinarily, we expect that leaf result relation(s) will have added some
* ROWID_VAR Vars to the query. However, it's possible that constraint
* exclusion suppressed every leaf relation. The executor will get upset
* if the plan has no row identity columns at all, even though it will
* certainly process no rows. Handle this edge case by re-opening the top
* result relation and adding the row identity columns it would have used,
* as preprocess_targetlist() would have done if it weren't marked "inh".
* Then re-run build_base_rel_tlists() to ensure that the added columns
* get propagated to the relation's reltarget. (This is a bit ugly, but
* it seems better to confine the ugliness and extra cycles to this
* unusual corner case.)
*/
if (root->row_identity_vars == NIL)
{
Relation target_relation;

target_relation = table_open(target_rte->relid, NoLock);
add_row_identity_columns(root, result_relation,
target_rte, target_relation);
table_close(target_relation, NoLock);
build_base_rel_tlists(root, root->processed_tlist);
/* There are no ROWID_VAR Vars in this case, so we're done. */
return;
}

If enable_partition_pruning is off, root->row_identity_vars already
contains a RowIdentityVarInfo entry for the tableoid Var that was
added while processing the foreign-table child partition. Because of
that, the if (root->row_identity_vars == NIL) block doesn’t run in
this case, so it won’t add any row identity columns such as ctid for
the partitioned root table.

In theory, we could prevent the planner from adding tableoid in the
first place when the child table doesn’t support any row identity
column -- or worse, doesn’t support the UPDATE/DELETE/MERGE command at
all -- but doing so would require changing the order in which tableoid
appears in root->processed_tlist. That would be too invasive for a
back-patch.

So for back branches, I’d propose sticking with the smaller
executor-side fix and perhaps revisiting the planner behavior
separately if we ever want to refine handling of pruned partitions or
dummy roots. I understand, as was reported upthread, that the EXPLAIN
VERBOSE output isn’t very consistent with that patch even though the
internal error goes away. Making sense of the output differences
requires knowing that the targetlist population behavior differs
depending on whether enable_partition_pruning is on or off as I
described above.

--
Thanks, Amit Langote

#19Tender Wang
tndrwang@gmail.com
In reply to: Amit Langote (#18)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

Amit Langote <amitlangote09@gmail.com> 于2025年11月6日周四 18:00写道:

On Fri, Oct 31, 2025 at 10:50 AM David Rowley <dgrowleyml@gmail.com>
wrote:

On Fri, 31 Oct 2025 at 02:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The definition could have been "throw 'cannot delete from foreign
table' only if the query actually attempts to delete some specific
row from some foreign table", but it is not implemented that way.
Instead the error is thrown during query startup if the query has
a foreign table as a potential delete target. Thus, as things stand
today, you might or might not get the error depending on whether
the planner can prove that that partition won't be deleted from.
This is not a great user experience, because we don't (and won't)
make any hard promises about how smart the planner is.

It's a good point, but I doubt we could change this fact as I expect
there are people relying on pruned partitions being excluded from this
check. It seems reasonable that someone might want to do something
like archive ancient time-based partitioned table partitions into
file_fdw stored on a compressed filesystem so that they can still at
least query old data should they need to. If we were to precheck that
all partitions support an UPDATE/DELETE, then it could break workloads
that do updates on recent data in heap-based partitions. Things would
go bad for those people if they switched off partition pruning, but I
doubt that would be the only reason as that would also add a huge
amount of overhead to their SELECT statements.

In any case, the planner is now very efficient at not loading any
metadata for pruned partitions, so I don't see how we'd do this
without adding possibly large overhead to the planner. I'd say we're
well beyond the point of being able to change this now.

I agree that we definitely shouldn’t load metadata for partitions that
are excluded from the plan, especially not just to slightly improve
user experience in this corner case.

I looked at a few options, but none seem non-invasive enough for
back-patching, apart from the first patch I posted. That one makes
ExecInitModifyTable() not require a ctid to be present to set the root
partitioned table’s ri_RowIdAttNo, except in the case of MERGE, which
seems to need it. The corner case that triggers the internal error for
UPDATE/DELETE doesn’t occur for MERGE now and likely won’t when
foreign tables eventually gain MERGE support; don't mark my words
though ;-).

Among those options, I considered the following block, which adds a
ctid for the partitioned root table when it’s the only target in the
query after partition pruning removes all child tables due to the
WHERE false condition in the problematic case:

/*
* Ordinarily, we expect that leaf result relation(s) will have added
some
* ROWID_VAR Vars to the query. However, it's possible that constraint
* exclusion suppressed every leaf relation. The executor will get
upset
* if the plan has no row identity columns at all, even though it will
* certainly process no rows. Handle this edge case by re-opening the
top
* result relation and adding the row identity columns it would have
used,
* as preprocess_targetlist() would have done if it weren't marked
"inh".
* Then re-run build_base_rel_tlists() to ensure that the added columns
* get propagated to the relation's reltarget. (This is a bit ugly,
but
* it seems better to confine the ugliness and extra cycles to this
* unusual corner case.)
*/
if (root->row_identity_vars == NIL)
{
Relation target_relation;

target_relation = table_open(target_rte->relid, NoLock);
add_row_identity_columns(root, result_relation,
target_rte, target_relation);
table_close(target_relation, NoLock);
build_base_rel_tlists(root, root->processed_tlist);
/* There are no ROWID_VAR Vars in this case, so we're done. */
return;
}

If enable_partition_pruning is off, root->row_identity_vars already
contains a RowIdentityVarInfo entry for the tableoid Var that was
added while processing the foreign-table child partition. Because of
that, the if (root->row_identity_vars == NIL) block doesn’t run in
this case, so it won’t add any row identity columns such as ctid for
the partitioned root table.

In theory, we could prevent the planner from adding tableoid in the
first place when the child table doesn’t support any row identity
column -- or worse, doesn’t support the UPDATE/DELETE/MERGE command at
all -- but doing so would require changing the order in which tableoid
appears in root->processed_tlist. That would be too invasive for a
back-patch.

Yeah, it seems to need more work if we prevent the planner from adding
tableoid
in the first place.

So for back branches, I’d propose sticking with the smaller
executor-side fix and perhaps revisiting the planner behavior
separately if we ever want to refine handling of pruned partitions or
dummy roots. I understand, as was reported upthread, that the EXPLAIN
VERBOSE output isn’t very consistent with that patch even though the
internal error goes away. Making sense of the output differences
requires knowing that the targetlist population behavior differs
depending on whether enable_partition_pruning is on or off as I
described above.

The executor-side fix works for me and the test case should be added to
your patch.
Should we add some comments to explain the output difference in EXPLAIN
VERBOSE
if enable_partition_pruning is set to a different value?

--
Thanks,
Tender Wang

#20Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#18)
Re: BUG #19099: Conditional DELETE from partitioned table with non-updatable partition raises internal error

On Thu, Nov 6, 2025 at 7:00 PM Amit Langote <amitlangote09@gmail.com> wrote:

I looked at a few options, but none seem non-invasive enough for
back-patching, apart from the first patch I posted. That one makes
ExecInitModifyTable() not require a ctid to be present to set the root
partitioned table’s ri_RowIdAttNo, except in the case of MERGE, which
seems to need it. The corner case that triggers the internal error for
UPDATE/DELETE doesn’t occur for MERGE now and likely won’t when
foreign tables eventually gain MERGE support; don't mark my words
though ;-).

Well, OK, I just had not tried hard enough to see that the same error
happens for MERGE too.

With my patch applied:
EXPLAIN VERBOSE MERGE INTO pt t USING (VALUES (1, 'x'::text)) AS s(a,
b) ON false WHEN MATCHED THEN UPDATE SET b = s.b;
ERROR: could not find junk ctid column

I have another idea: we can simply recognize the corner condition that
throws this error in ExecInitModifyTable() by checking if
ModifyTable.resultRelations contains only the root partitioned table.
That can only happen for UPDATE, DELETE, or MERGE when all child
relations were excluded.

Patch doing that attached. Added test cases to file_fdw's suite.

--
Thanks, Amit Langote

Attachments:

v2-0001-Fix-bogus-ctid-requirement-for-dummy-root-partiti.patchapplication/octet-stream; name=v2-0001-Fix-bogus-ctid-requirement-for-dummy-root-partiti.patchDownload+136-2
#21Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tender Wang (#19)
#22Tender Wang
tndrwang@gmail.com
In reply to: Amit Langote (#21)
#23Kirill Reshke
reshkekirill@gmail.com
In reply to: Amit Langote (#20)
#24Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kirill Reshke (#23)
#25Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#24)
#26Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#18)
#27Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#26)
#28Tender Wang
tndrwang@gmail.com
In reply to: Amit Langote (#26)
#29Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tender Wang (#28)
#30Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#29)
#31Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#30)
#32Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#31)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#32)
#34Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#33)
#35Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#34)
#36Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#35)