Incorrect explain output for updates/delete operations with returning-list on partitioned tables
Hi PostgreSQL Community,
I have been working on partitioned tables recently, and I have noticed
something that doesn't seem correct with the EXPLAIN output of an
update/delete query with a returning list.
For example, consider two partitioned tables, "t1" and "t2," with
partitions "t11," "t12," and "t21," "t22," respectively. The table
definitions are as follows:
```sql
postgres=# \d+ t1
Partitioned table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Compression
| Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
a | integer | | | | plain |
| |
b | integer | | | | plain |
| |
c | integer | | | | plain |
| |
Partition key: RANGE (a)
Partitions: t11 FOR VALUES FROM (0) TO (1000),
t12 FOR VALUES FROM (1000) TO (10000)
postgres=# \d+ t2
Partitioned table "public.t2"
Column | Type | Collation | Nullable | Default | Storage | Compression
| Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
a | integer | | | | plain |
| |
b | integer | | | | plain |
| |
c | integer | | | | plain |
| |
Partition key: RANGE (a)
Partitions: t21 FOR VALUES FROM (0) TO (1000),
t22 FOR VALUES FROM (1000) TO (10000)
```
The EXPLAIN output for an update query with a returning list doesn't seem
correct to me. Here are the examples (the part that doesn't seem right is
highlighted in bold):
*Query1:*
```
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a
returning t1.c;
QUERY PLAN
-------------------------------------------------------------------------------------------
Update on public.t1 (cost=0.00..125187.88 rows=41616 width=14)
*Output: t1_1.c -----> something not right??*
Update on public.t11 t1_1
Update on public.t12 t1_2
-> Append (cost=0.00..125187.88 rows=41616 width=14)
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=14)
Output: 10, t1_1.tableoid, t1_1.ctid
Join Filter: (t1_1.a = t2_1.a)
-> Seq Scan on public.t11 t1_1 (cost=0.00..30.40 rows=2040
width=14)
Output: t1_1.a, t1_1.tableoid, t1_1.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=4)
Output: t2_1.a
-> Seq Scan on public.t21 t2_1 (cost=0.00..30.40
rows=2040 width=4)
Output: t2_1.a
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=14)
Output: 10, t1_2.tableoid, t1_2.ctid
Join Filter: (t1_2.a = t2_2.a)
-> Seq Scan on public.t12 t1_2 (cost=0.00..30.40 rows=2040
width=14)
Output: t1_2.a, t1_2.tableoid, t1_2.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=4)
Output: t2_2.a
-> Seq Scan on public.t22 t2_2 (cost=0.00..30.40
rows=2040 width=4)
Output: t2_2.a
(23 rows)
```
*Query2:*
*```*postgres=# explain verbose update t1 set b = 10 from t2 where t1.a =
t2.a returning t2.c;
QUERY PLAN
-------------------------------------------------------------------------------------------
Update on public.t1 (cost=0.00..125187.88 rows=41616 width=18)
*Output: t2.c*
Update on public.t11 t1_1
Update on public.t12 t1_2
-> Append (cost=0.00..125187.88 rows=41616 width=18)
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=18)
Output: 10, t2_1.c, t1_1.tableoid, t1_1.ctid
Join Filter: (t1_1.a = t2_1.a)
-> Seq Scan on public.t11 t1_1 (cost=0.00..30.40 rows=2040
width=14)
Output: t1_1.a, t1_1.tableoid, t1_1.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=8)
Output: t2_1.c, t2_1.a
-> Seq Scan on public.t21 t2_1 (cost=0.00..30.40
rows=2040 width=8)
Output: t2_1.c, t2_1.a
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=18)
Output: 10, t2_2.c, t1_2.tableoid, t1_2.ctid
Join Filter: (t1_2.a = t2_2.a)
-> Seq Scan on public.t12 t1_2 (cost=0.00..30.40 rows=2040
width=14)
Output: t1_2.a, t1_2.tableoid, t1_2.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=8)
Output: t2_2.c, t2_2.a
-> Seq Scan on public.t22 t2_2 (cost=0.00..30.40
rows=2040 width=8)
Output: t2_2.c, t2_2.a
(23 rows)
```
After further investigation into the code, I noticed following:
1. In the 'grouping_planner()' function, while generating paths for the
final relation (
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L1857),
we only take care of adjusting the append_rel_attributes in returningList
for resultRelation. Shouldn't we do that for other relations as well in
query? Example for *Query2* above, *adjust_appendrel_attrs_multilevel* is a
no-op.
2. After plan creation (
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/createplan.c#L351),
shouldn't we perform tlist labeling for the `returningList` as well? I
suspect this is resulting in incorrect output in *Query1*.
I suspect that similar issues might also be present for `withCheckOptions`,
`mergeActionList`, and `mergeJoinCondition`.
I would appreciate it if the community could provide insights or
clarifications regarding this observation.
Thank you for your time and consideration.
Regards
Saikiran Avula,
SDE, Amazon Web Services.
SAIKIRAN AVULA <avulasaikiranreddy@gmail.com> writes:
I have been working on partitioned tables recently, and I have noticed
something that doesn't seem correct with the EXPLAIN output of an
update/delete query with a returning list.
What do you think is not right exactly? The output has to use some
one of the correlation names for the partitioned table. I think
it generally chooses the one corresponding to the first Append arm,
but really any would be good enough for EXPLAIN's purposes.
1. In the 'grouping_planner()' function, while generating paths for the
final relation (
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L1857),
we only take care of adjusting the append_rel_attributes in returningList
for resultRelation. Shouldn't we do that for other relations as well in
query?
If the only difference is which way variables get labeled in EXPLAIN,
I'd be kind of disinclined to spend extra cycles. But in any case,
I rather suspect you'll find that this actively breaks things.
Whether we change the varno on a Var isn't really optional, and there
are cross-checks in setrefs.c to make sure things match up.
regards, tom lane
On Tue, 7 May 2024 at 09:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
SAIKIRAN AVULA <avulasaikiranreddy@gmail.com> writes:
I have been working on partitioned tables recently, and I have noticed
something that doesn't seem correct with the EXPLAIN output of an
update/delete query with a returning list.What do you think is not right exactly? The output has to use some
one of the correlation names for the partitioned table. I think
it generally chooses the one corresponding to the first Append arm,
but really any would be good enough for EXPLAIN's purposes.
Also looks harmless to me. But just a slight correction, you're
talking about the deparse Append condition that's in
set_deparse_plan(). Whereas the code that controls this for the
returningList is the following in nodeModifyTable.c:
/*
* Initialize result tuple slot and assign its rowtype using the first
* RETURNING list. We assume the rest will look the same.
*/
mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
David