Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

Started by Dmitry Fomin · 6 months ago · 15 messages · bugs
#1 Dmitry Fomin
fomin.list@gmail.com

Hello,
We’re seeing a backend segfault when an ON UPDATE CASCADE fires across partitions if the destination partition was created via CREATE TABLE … LIKE + ATTACH, so that its physical tuple descriptor differs from the parent’s due to dropped-column tombstones (attnum drift). Column names and types match by inspection, but the crash occurs during tuple materialization in RI trigger execution.

Minimal Reproducer (self-contained)

DROP SCHEMA IF EXISTS t CASCADE;
CREATE SCHEMA t;

-- Pipelines (partitioned)
CREATE TABLE t.pipelines (
partition_id int NOT NULL,
id bigint NOT NULL,
PRIMARY KEY (partition_id, id)
) PARTITION BY LIST (partition_id);
CREATE TABLE t.pipelines_102 PARTITION OF t.pipelines FOR VALUES IN (102);
CREATE TABLE t.pipelines_50 PARTITION OF t.pipelines FOR VALUES IN (50);

-- Stages (partitioned) with ON UPDATE CASCADE to pipelines.
-- Create a mid column and drop it to leave a tombstone gap in attnums.
CREATE TABLE t.stages (
partition_id int NOT NULL,
id bigint NOT NULL,
tmp_mid int, -- dropped below, leaves parent attnum gap
pipeline_id bigint NOT NULL,
name text,
status int,
PRIMARY KEY (partition_id, id),
FOREIGN KEY (partition_id, pipeline_id)
REFERENCES t.pipelines(partition_id, id)
ON UPDATE CASCADE ON DELETE CASCADE
) PARTITION BY LIST (partition_id);
CREATE TABLE t.stages_102 PARTITION OF t.stages FOR VALUES IN (102);
ALTER TABLE t.stages DROP COLUMN tmp_mid;

-- Miscreate destination stage partition via LIKE + ATTACH (no tombstone, different attnums).
CREATE TABLE t.stages_50_like (LIKE t.stages INCLUDING DEFAULTS);
ALTER TABLE t.stages ATTACH PARTITION t.stages_50_like FOR VALUES IN (50);

-- Builds (partitioned), cascades to both stages and pipelines.
CREATE TABLE t.builds (
partition_id int NOT NULL,
id bigint NOT NULL,
stage_id bigint NOT NULL,
commit_id bigint NOT NULL,
PRIMARY KEY (partition_id, id),
FOREIGN KEY (partition_id, stage_id)
REFERENCES t.stages(partition_id, id)
ON UPDATE CASCADE ON DELETE CASCADE,
FOREIGN KEY (partition_id, commit_id)
REFERENCES t.pipelines(partition_id, id)
ON UPDATE CASCADE ON DELETE CASCADE
) PARTITION BY LIST (partition_id);
CREATE TABLE t.builds_102 PARTITION OF t.builds FOR VALUES IN (102);
CREATE TABLE t.builds_50 PARTITION OF t.builds FOR VALUES IN (50);

-- Seed rows in source partition 102.
INSERT INTO t.pipelines_102(partition_id, id) VALUES (102, 1);
INSERT INTO t.stages_102 (partition_id, id, pipeline_id, name, status)
VALUES (102, 10, 1, 's', 0);
INSERT INTO t.builds_102 (partition_id, id, stage_id, commit_id)
VALUES (102, 100, 10, 1);

-- Crash repro: cascaded UPDATE across partitions
UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
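
To confirm the physical-descriptor mismatch before running the UPDATE, the attribute layouts of the two stage partitions can be compared. This is a diagnostic sketch against the schema above, not part of the repro:

```sql
-- Diagnostic only: compare physical attribute layouts of the two partitions.
-- t.stages_102 should show a dropped-column tombstone (attisdropped = true)
-- that shifts the attnums of the later columns; t.stages_50_like (made via
-- LIKE + ATTACH) has no tombstone, so the same column names sit at
-- different attnums in the two partitions.
SELECT attrelid::regclass AS partition, attnum, attname, attisdropped
FROM pg_attribute
WHERE attrelid IN ('t.stages_102'::regclass, 't.stages_50_like'::regclass)
  AND attnum > 0
ORDER BY attrelid::regclass::text, attnum;
```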

Postgres logs:

2025-10-16 09:23:08.535 UTC [18630] LOG: client backend (PID 18673) was terminated by signal 11: Segmentation fault
2025-10-16 09:23:08.535 UTC [18630] DETAIL: Failed process was running: UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;

Environment
PostgreSQL:
postgres=# SHOW server_version;
server_version
----------------
18.0
(1 row)

postgres=# SHOW server_version_num;
server_version_num
--------------------
180000
(1 row)

postgres=# SELECT version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 18.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), 64-bit
(1 row)

postgres=# SHOW shared_preload_libraries;
shared_preload_libraries
--------------------------

(1 row)

postgres=# SELECT extname, extversion FROM pg_extension ORDER BY 1;
extname | extversion
---------+------------
plpgsql | 1.0
(1 row)

OS/Kernel/Libc:
[root@postgres-source ~]# uname -a
Linux postgres-source 5.14.0-427.22.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jun 19 04:14:38 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[root@postgres-source ~]# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="9.3"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Oracle Linux Server 9.3"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:9:3:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://github.com/oracle/oracle-linux"

ORACLE_BUGZILLA_PRODUCT="Oracle Linux 9"
ORACLE_BUGZILLA_PRODUCT_VERSION=9.3
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=9.3
[root@postgres-source ~]# ldd --version
ldd (GNU libc) 2.34
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Backtrace is in the attachment.

The issue is reproducible at least on Postgres 16, 17, and 18.

Please let me know if I should provide any other information.


BR Dmitry

Attachments:

bt.txt (text/plain)
#2 David Rowley
dgrowleyml@gmail.com
In reply to: Dmitry Fomin (#1)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Fri, 17 Oct 2025 at 10:53, Dmitry Fomin <fomin.list@gmail.com> wrote:

-- Crash repro: cascaded UPDATE across partitions
UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;

server closed the connection unexpectedly

Thanks for the detailed report. It seems to have been caused by
ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
isn't set for this partition which causes ExecGetChildToRootMap() to
think no translation is required.

More to come...

David

#3 David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#2)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:

Thanks for the detailed report. It seems to have been caused by
ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
isn't set for this partition which causes ExecGetChildToRootMap() to
think no translation is required.

The problem is with the ResultRelInfo caching that's in
ExecGetTriggerResultRel(). The code there tries looking into the
estate's es_opened_result_relations, es_tuple_routing_result_relations
and es_trig_target_relations Lists to see if the ResultRelInfo was
created before. In the problem case the ResultRelInfo is found in
es_trig_target_relations and unfortunately it's been set up by some
code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
This means when ExecGetTriggerResultRel() is called again this time
passing the correct rootRelInfo, the cached one that has the NULL
ri_RootResultRelInfo is found and returned.

This results in the ExecGetChildToRootMap() code doing the wrong thing
because it sees a NULL ri_RootResultRelInfo and therefore does not
translate the slot into the format of the partitioned table.

I've attached a patch which fixes the problem. I'm just not sure if
it's the right fix for the problem. I suspect the real problem is down
to the fact that ExecGetTriggerResultRel() passes a NULL rootRelInfo
in the first place. I just don't see a good way to figure out what the
parent table should be so we know to create a parent ResultRelInfo as
the trigger that is firing is for the partition, not the partitioned
table. I don't see any way to figure out that the trigger is being
fired because it's cascading an update of its parent partitioned
table... At a guess, it feels like there might be some fields missing
in AfterTriggerShared to figure this out.

Any thoughts on this Amit?

David

Attachments:

bandaid_fix.patch (application/octet-stream, +10 −0)
#4 Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#3)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Fri, Oct 17, 2025 at 6:08 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:

Thanks for the detailed report. It seems to have been caused by
ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
isn't set for this partition which causes ExecGetChildToRootMap() to
think no translation is required.

The problem is with the ResultRelInfo caching that's in
ExecGetTriggerResultRel(). The code there tries looking into the
estate's es_opened_result_relations, es_tuple_routing_result_relations
and es_trig_target_relations Lists to see if the ResultRelInfo was
created before. In the problem case the ResultRelInfo is found in
es_trig_target_relations and unfortunately it's been set up by some
code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
This means when ExecGetTriggerResultRel() is called again this time
passing the correct rootRelInfo, the cached one that has the NULL
ri_RootResultRelInfo is found and returned.

This results in the ExecGetChildToRootMap() code doing the wrong thing
because it sees a NULL ri_RootResultRelInfo and therefore does not
translate the slot into the format of the partitioned table.

I've attached a patch which fixes the problem. I'm just not sure if
it's the right fix for the problem. I suspect the real problem is down
to the fact that ExecGetTriggerResultRel() passes a NULL rootRelInfo
in the first place. I just don't see a good way to figure out what the
parent table should be so we know to create a parent ResultRelInfo as
the trigger that is firing is for the partition, not the partitioned
table. I don't see any way to figure out that the trigger is being
fired because it's cascading an update of its parent partitioned
table... At a guess, it feels like there might be some fields missing
in AfterTriggerShared to figure this out.

Any thoughts on this Amit?

Thanks for the off-list heads-up, David.

I'll need to play around with Dmitry's test case a bit before I can be
sure, but I agree with your suspicion -- something's not right if a
result rel without a root rel set ends up in a path where tuple
conversion matters.

--
Thanks, Amit Langote

#5 Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#4)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Fri, Oct 17, 2025 at 7:36 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Fri, Oct 17, 2025 at 6:08 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:

Thanks for the detailed report. It seems to have been caused by
ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
isn't set for this partition which causes ExecGetChildToRootMap() to
think no translation is required.

The problem is with the ResultRelInfo caching that's in
ExecGetTriggerResultRel(). The code there tries looking into the
estate's es_opened_result_relations, es_tuple_routing_result_relations
and es_trig_target_relations Lists to see if the ResultRelInfo was
created before. In the problem case the ResultRelInfo is found in
es_trig_target_relations and unfortunately it's been set up by some
code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
This means when ExecGetTriggerResultRel() is called again this time
passing the correct rootRelInfo, the cached one that has the NULL
ri_RootResultRelInfo is found and returned.

This results in the ExecGetChildToRootMap() code doing the wrong thing
because it sees a NULL ri_RootResultRelInfo and therefore does not
translate the slot into the format of the partitioned table.

I've attached a patch which fixes the problem. I'm just not sure if
it's the right fix for the problem. I suspect the real problem is down
to the fact that ExecGetTriggerResultRel() passes a NULL rootRelInfo
in the first place. I just don't see a good way to figure out what the
parent table should be so we know to create a parent ResultRelInfo as
the trigger that is firing is for the partition, not the partitioned
table. I don't see any way to figure out that the trigger is being
fired because it's cascading an update of its parent partitioned
table... At a guess, it feels like there might be some fields missing
in AfterTriggerShared to figure this out.

Any thoughts on this Amit?

Thanks for the off-list heads-up, David.

I'll need to play around with Dmitry's test case a bit before I can be
sure, but I agree with your suspicion -- something's not right if a
result rel without a root rel set ends up in a path where tuple
conversion matters.

What seems to be happening is that trigger firings of the FK child
tables run under the parent table’s UPDATE EState, because the child
table’s trigger events are queued under SPI in ri_triggers.c. That
code doesn’t execute the child’s triggers directly -- it defers them
to run later under the parent’s context. As a result, the EState used
at execution time lacks child ResultRelInfos with a root rel set, and
new ones end up being created and added to es_trig_target_relations.

In this case, the first event that creates a rootless entry in
es_trig_target_relations is the INSERT side of the cross-partition
update on stages, fired for its FK into pipelines. The later UPDATE
event is for builds, whose FK references stages. When that UPDATE
runs, ExecGetTriggerResultRel() looks up the destination result rel
for stages and finds the rootless one that was built for the earlier
INSERT. Because its ri_RootResultRelInfo is NULL,
ExecGetChildToRootMap() skips tuple translation and crashes. That
seems to explain the “something off” we both suspected.

For the fix, I tried a slightly different approach from your patch.
Instead of updating the cached entry to always set
ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when
a rooted one is expected. Concretely, I added assertions on the two
primary lists and tightened the es_trig_target_relations lookup to
only return an entry when ri_RootResultRelInfo == rootRelInfo or the
caller’s rootRelInfo is NULL. That prevents the rootless destination
built for the INSERT on stages -> pipelines from being reused for the
later UPDATE from builds -> stages.

This seems a bit safer since it keeps cached entries immutable and
surfaces mismatches via asserts rather than mutating shared state.

Thoughts?

--
Thanks, Amit Langote

Attachments:

v2-0001-Fix-incorrect-reuse-of-ResultRelInfo-in-trigger-e.patch (application/octet-stream, +170 −2)
#6 David Rowley
dgrowleyml@gmail.com
In reply to: Amit Langote (#5)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Tue, 21 Oct 2025 at 02:48, Amit Langote <amitlangote09@gmail.com> wrote:

For the fix, I tried a slightly different approach from your patch.
Instead of updating the cached entry to always set
ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when
a rooted one is expected. Concretely, I added assertions on the two
primary lists and tightened the es_trig_target_relations lookup to
only return an entry when ri_RootResultRelInfo == rootRelInfo or the
caller’s rootRelInfo is NULL. That prevents the rootless destination
built for the INSERT on stages -> pipelines from being reused for the
later UPDATE from builds -> stages.

  foreach(l, estate->es_trig_target_relations)
  {
  rInfo = (ResultRelInfo *) lfirst(l);
- if (RelationGetRelid(rInfo->ri_RelationDesc) == relid)
+ if (RelationGetRelid(rInfo->ri_RelationDesc) == relid &&
+ (rInfo->ri_RootResultRelInfo == rootRelInfo ||
+ rootRelInfo == NULL))

I don't understand the || rootRelInfo == NULL part. This would break
if we first created the ResultRelInfo with a parent then asked for
another one with no parent.

This seems a bit safer since it keeps cached entries immutable and
surfaces mismatches via asserts rather than mutating shared state.

I think creating separate ResultRelInfos is probably a safer bet.
That's what I ended up doing in [1] after posting my initial patch
(which was intended to highlight the problem area.)

David

[1]: https://github.com/postgres/postgres/compare/master...david-rowley:postgres:fix_resultrelinfo_caching

#7 Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#6)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Tue, Oct 21, 2025 at 4:28 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 21 Oct 2025 at 02:48, Amit Langote <amitlangote09@gmail.com> wrote:

For the fix, I tried a slightly different approach from your patch.
Instead of updating the cached entry to always set
ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when
a rooted one is expected. Concretely, I added assertions on the two
primary lists and tightened the es_trig_target_relations lookup to
only return an entry when ri_RootResultRelInfo == rootRelInfo or the
caller’s rootRelInfo is NULL. That prevents the rootless destination
built for the INSERT on stages -> pipelines from being reused for the
later UPDATE from builds -> stages.

foreach(l, estate->es_trig_target_relations)
{
rInfo = (ResultRelInfo *) lfirst(l);
- if (RelationGetRelid(rInfo->ri_RelationDesc) == relid)
+ if (RelationGetRelid(rInfo->ri_RelationDesc) == relid &&
+ (rInfo->ri_RootResultRelInfo == rootRelInfo ||
+ rootRelInfo == NULL))

I don't understand the || rootRelInfo == NULL part. This would break
if we first created the ResultRelInfo with a parent then asked for
another one with no parent.

You are right, the || rootRelInfo == NULL check in
es_trig_target_relations was too permissive, it could indeed return a
rooted entry when the caller explicitly passed NULL. I’ve updated the
patch so that es_trig_target_relations now requires strict equality of
ri_RootResultRelInfo.

For es_opened_result_relations and es_tuple_routing_result_relations,
I kept the looser asserts. These lookups can be made with a NULL
rootRelInfo, as in the initial call within afterTriggerInvokeEvents()
that isn’t assuming a child context. In that case, returning a rooted
entry is harmless since it’s only used to fetch trigger metadata, not
for tuple translation. I didn’t turn those asserts into runtime
conditions, because doing so would prevent legitimate reuse in those
safe contexts.

This seems a bit safer since it keeps cached entries immutable and
surfaces mismatches via asserts rather than mutating shared state.

I think creating separate ResultRelInfos is probably a safer bet.
That's what I ended up doing in [1] after posting my initial patch
(which was intended to highlight the problem area.)

[1] https://github.com/postgres/postgres/compare/master...david-rowley:postgres:fix_resultrelinfo_caching

Ah, good to know. I borrowed a bit from your comment in that commit
to update the function header, so it now briefly explains the
rootRelInfo matching rules and the distinction between the lists.

Updated patch attached.

--
Thanks, Amit Langote

Attachments:

v3-0001-Fix-incorrect-reuse-of-ResultRelInfo-in-trigger-e.patch (application/octet-stream, +176 −2)
#8 David Rowley
dgrowleyml@gmail.com
In reply to: Amit Langote (#7)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Tue, 21 Oct 2025 at 14:28, Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Oct 21, 2025 at 4:28 AM David Rowley <dgrowleyml@gmail.com> wrote:

I don't understand the || rootRelInfo == NULL part. This would break
if we first created the ResultRelInfo with a parent then asked for
another one with no parent.

You are right, the || rootRelInfo == NULL check in
es_trig_target_relations was too permissive, it could indeed return a
rooted entry when the caller explicitly passed NULL. I’ve updated the
patch so that es_trig_target_relations now requires strict equality of
ri_RootResultRelInfo.

Now I don't understand why you left the Asserts with the rootRelInfo
== NULL check. If the ri_RootResultRelInfo changes compared to the
cached one in the es_opened_result_relations or
es_tuple_routing_result_relations then we're susceptible to the very
same bug we're aiming to fix here.

As far as I see it, there are two options:

1) Use Assert(rInfo->ri_RootResultRelInfo == rootRelInfo) for the
es_opened_result_relations and es_tuple_routing_result_relations; or
2) Treat all 3 lists the same and put "&& rInfo->ri_RootResultRelInfo
== rootRelInfo" into the "if" check.

I was a bit uncertain if #1 can be done as I wasn't sure if we could
have the ResultRelInfo we need for a partition movement originating
from a trigger already in the es_opened_result_relations list, but I
played around for a bit and I came up with the following which will
Assert fail if we did the Asserts for #1.

begin;

create table pt (a int primary key, b int references pt (a) on update
cascade on delete cascade) partition by list(a);
create table pt1 partition of pt for values in(1);
create table pt2 partition of pt for values in(2);

insert into pt values(1,null);

update pt set b = 1;
update pt set b = 2;

rollback;

I'm pretty certain that #2 is the correct answer.

I've attached a v4 patch which does #2, adds tests and fixes the
outdated header comment in ExecGetTriggerResultRel().

David

Attachments:

v4-0001-Fix-incorrect-logic-for-caching-ResultRelInfos-fo.patch (application/octet-stream, +120 −8)
#9 Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#8)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 21 Oct 2025 at 14:28, Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Oct 21, 2025 at 4:28 AM David Rowley <dgrowleyml@gmail.com> wrote:

I don't understand the || rootRelInfo == NULL part. This would break
if we first created the ResultRelInfo with a parent then asked for
another one with no parent.

You are right, the || rootRelInfo == NULL check in
es_trig_target_relations was too permissive, it could indeed return a
rooted entry when the caller explicitly passed NULL. I’ve updated the
patch so that es_trig_target_relations now requires strict equality of
ri_RootResultRelInfo.

Now I don't understand why you left the Asserts with the rootRelInfo
== NULL check. If the ri_RootResultRelInfo changes compared to the
cached one in the es_opened_result_relations or
es_tuple_routing_result_relations then we're susceptible to the very
same bug we're aiming to fix here.

Hmm, I had assumed non-NULL mismatches could not occur within one
EState, but with multiple ModifyTableStates (e.g., a CTE update plus
the main update) the same child relid can be opened under different
parent ResultRelInfos. For example:

WITH cte AS (UPDATE parted_cp_update_bug_2_p1 SET a = a)
UPDATE parted_cp_update_bug_1 SET a = 2 WHERE a = 1;

In that case a cached entry’s ri_RootResultRelInfo can differ from the
rootRelInfo passed in, so returning a relid match is unsafe (crashes
under v3!). I agree the assert is not appropriate even ignoring the
NULL clause. IOW, I agree with folding ri_RootResultRelInfo ==
rootRelInfo into the condition across all three lists (your #2).

I've attached a v4 patch which does #2, adds tests and fixes the
outdated header comment in ExecGetTriggerResultRel().

Not sure if you missed it, but I had added tests in v2/v3 mirroring
Dmitry's case.

In the attached v5, I’ve updated the test case to use the no-op CTE I
mentioned above, trimmed the test case code a bit, and modified it to
use a new schema like the surrounding tests. I also updated the
comment in ExecGetTriggerResultRel() as you did, and moved the
additional-check explanation into the header comment.

--
Thanks, Amit Langote

Attachments:

v5-0001-Fix-incorrect-reuse-of-ResultRelInfo-in-trigger-e.patch (application/octet-stream, +147 −8)
#10 David Rowley
dgrowleyml@gmail.com
In reply to: Amit Langote (#9)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Tue, 21 Oct 2025 at 22:12, Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:

I've attached a v4 patch which does #2, adds tests and fixes the
outdated header comment in ExecGetTriggerResultRel().

Not sure if you missed it, but I had added tests in v2/v3 mirroring
Dmitry's case.

I did miss that. You didn't mention it in the email and I didn't look
down far enough to see them.

In the attached v5, I’ve updated the test case to use the no-op CTE I
mentioned above, trimmed the test case code a bit, and modified it to
use a new schema like the surrounding tests. I also updated the
comment in ExecGetTriggerResultRel() as you did, and moved the
additional-check explanation into the header comment.

Thanks. The CTE test seems worthwhile since it can reuse the same
tables and results in using the es_opened_result_relations List. I've
included it. I also tidied up the tests a bit more. I looked at your
v5 test case and saw it hadn't gone through the same simplification
process as I put my version through (making tables non-partitioned
when they don't need to be to trigger the issue), so what's in v6 is
my version.

I've attached v6. If you have other proposed changes, would you be
able to send them as a diff based on my patch? We seem to have got
some split-brain and the patches have diverged a little and that
resulted in some of my v4 changes being lost.

I also checked to see if there were any changes around what's shown in
the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
on if it's possible something could be different there now for some
other cases that cause additional ResultRelInfos to be added to the
lists.

David

Attachments:

v6-0001-Fix-incorrect-logic-for-caching-ResultRelInfos-fo.patch (application/octet-stream, +116 −8)
#11 Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#10)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Tue, Oct 21, 2025 at 8:22 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 21 Oct 2025 at 22:12, Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:

I've attached a v4 patch which does #2, adds tests and fixes the
outdated header comment in ExecGetTriggerResultRel().

Not sure if you missed it, but I had added tests in v2/v3 mirroring
Dmitry's case.

I did miss that. You didn't mention it in the email and I didn't look
down far enough to see them.

Ok, sorry about that.

In the attached v5, I’ve updated the test case to use the no-op CTE I
mentioned above, trimmed the test case code a bit, and modified it to
use a new schema like the surrounding tests. I also updated the
comment in ExecGetTriggerResultRel() as you did, and moved the
additional-check explanation into the header comment.

Thanks. The CTE test seems worthwhile since it can reuse the same
tables and results in using the es_opened_result_relations List. I've
included it. I also tidied up the tests a bit more. I looked at your
v5 test case and saw it hadn't gone through the same simplification
process as I put my version through (making tables non-partitioned
when they don't need to be to trigger the issue), so what's in v6 is
my version.

Ah, yes, the 3rd table doesn't need to be partitioned.

I've attached v6. If you have other proposed changes, would you be
able to send them as a diff based on my patch? We seem to have got
some split-brain

Sorry about that.

and the patches have diverged a little and that
resulted in some of my v4 changes being lost.

Besides test suite being entirely different in my v5, the only change
from your v4 was rewording and moving the new comment in
ExecGetTriggerResultRel() into its header comment:

@@ -1337,6 +1337,12 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
  * also provides a way for EXPLAIN ANALYZE to report the runtimes of such
  * triggers.)  So we make additional ResultRelInfo's as needed, and save them
  * in es_trig_target_relations.
+ *
+ * When reusing cached entries from any of these lists, require an exact
+ * match of ri_RootResultRelInfo and the rootRelInfo passed by the caller.
+ * This avoids returning an entry created for a different parent context
+ * and ensures child->parent translation is neither skipped nor applied
+ * incorrectly.  If no exact match exists, build a new ResultRelInfo.
  */
 ResultRelInfo *
 ExecGetTriggerResultRel(EState *estate, Oid relid,
@@ -1347,14 +1353,6 @@ ExecGetTriggerResultRel(EState *estate, Oid relid,
     Relation    rel;
     MemoryContext oldcontext;

- /*
- * Before creating a new ResultRelInfo, check if we've already made and
- * cached one for this relation. We must ensure that the given
- * 'rootRelInfo' matches the one stored in the cached ResultRelInfo as
- * trigger handling for partitions can result in mixed requirements for
- * what ri_RootResultRelInfo is set to.
- */
-

Please take it if you think that's a good change. That said, I don't
have any issue with your v6, code or tests.

I also checked to see if there were any changes around what's shown in
the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
on if it's possible something could be different there now for some
other cases that cause additional ResultRelInfos to be added to the
lists.

Good point.

I made report_triggers() print which list each ResultRelInfo came from
(with the attached). It looks like the entries that could have been
affected are always in es_trig_target_relations anyway, so the change
in ExecGetTriggerResultRel() doesn’t alter what EXPLAIN shows under
“Triggers.” I was still a bit surprised, since I expected there might
now be an extra entry after we stopped reusing a child entry with a
mismatching ri_RootResultRelInfo.

I do see child relations listed under “Triggers,” but only when they
have trigger events of their own, not when they’re processed as child
relations of a root after a CP_UPDATE event. In those cases,
afterTriggerInvokeEvents() only passes the instrumentation for the
root rel to AfterTriggerExecute(), so the child entries don’t have
their ri_TrigInstrument updated and don’t appear in the EXPLAIN
output.
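
For anyone wanting to inspect that Triggers output themselves, a minimal sketch against Dmitry's repro schema (assuming it is already loaded) would be:

```sql
-- Sketch only: show the Triggers section for the cascading UPDATE.
-- EXPLAIN ANALYZE actually executes the statement, so wrap it in a
-- transaction and roll back to keep the seed data intact.
BEGIN;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;
-- The RI triggers fired during the statement are listed after the plan,
-- one "Trigger for constraint ..." line per constraint with call counts.
ROLLBACK;
```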

--
Thanks, Amit Langote

Attachments:

report_triggers_listname.patch (application/octet-stream, +8 −5)
#12 David Rowley
dgrowleyml@gmail.com
In reply to: Amit Langote (#11)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Wed, 22 Oct 2025 at 02:49, Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Oct 21, 2025 at 8:22 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 21 Oct 2025 at 22:12, Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:

I've attached a v4 patch which does #2, adds tests and fixes the
outdated header comment in ExecGetTriggerResultRel().

Not sure if you missed it, but I had added tests in v2/v3 mirroring
Dmitry's case.

I did miss that. You didn't mention it in the email and I didn't look
down far enough to see them.

Ok, sorry about that.

In the attached v5, I’ve updated the test case to use the no-op CTE I
mentioned above, trimmed the test case code a bit, and modified it to
use a new schema like the surrounding tests. I also updated the
comment in ExecGetTriggerResultRel() as you did, and moved the
additional-check explanation into the header comment.

Thanks. The CTE test seems worthwhile since it can reuse the same
tables and results in using the es_opened_result_relations List. I've
included it. I also tidied up the tests a bit more. I looked at your
v5 test case and saw it hadn't gone through the same simplification
process as I put my version through (making tables non-partitioned
when they don't need to be to trigger the issue), so what's in v6 is
my version.

Ah, yes, the 3rd table doesn't need to be partitioned.

I've attached v6. If you have other proposed changes, would you be
able to send them as a diff based on my patch? We seem to have got
some split-brain

Sorry about that.

and the patches have diverged a little and that
resulted in some of my v4 changes being lost.

Besides the test suite being entirely different in my v5, the only
change from your v4 was rewording the new comment in
ExecGetTriggerResultRel() and moving it into its header comment:

@@ -1337,6 +1337,12 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
* also provides a way for EXPLAIN ANALYZE to report the runtimes of such
* triggers.)  So we make additional ResultRelInfo's as needed, and save them
* in es_trig_target_relations.
+ *
+ * When reusing cached entries from any of these lists, require an exact
+ * match of ri_RootResultRelInfo and the rootRelInfo passed by the caller.
+ * This avoids returning an entry created for a different parent context
+ * and ensures child->parent translation is neither skipped nor applied
+ * incorrectly.  If no exact match exists, build a new ResultRelInfo.
*/
ResultRelInfo *
ExecGetTriggerResultRel(EState *estate, Oid relid,
@@ -1347,14 +1353,6 @@ ExecGetTriggerResultRel(EState *estate, Oid relid,
Relation    rel;
MemoryContext oldcontext;

- /*
- * Before creating a new ResultRelInfo, check if we've already made and
- * cached one for this relation. We must ensure that the given
- * 'rootRelInfo' matches the one stored in the cached ResultRelInfo as
- * trigger handling for partitions can result in mixed requirements for
- * what ri_RootResultRelInfo is set to.
- */
-

Please take it if you think that's a good change. That said, I don't
have any issue with your v6, code or tests.

The reason I like the comment being within the function body is that I
didn't see any need to tell the caller to care about this. As far as
the caller is concerned, the function should return a correctly set up
ResultRelInfo. The inner details of how the caching works feel like
implementation details, and comments about them feel better placed
inside the function body.

I also checked to see if there were any changes around what's shown in
the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
on whether something could be different there now for some other
cases that cause additional ResultRelInfos to be added to the lists.

Good point.

I made report_triggers() print which list each ResultRelInfo came from
(with the attached). It looks like the entries that could have been
affected are always in es_trig_target_relations anyway, so the change
in ExecGetTriggerResultRel() doesn’t alter what EXPLAIN shows under
“Triggers.” I was still a bit surprised, since I expected there might
now be an extra entry after we stopped reusing a child entry with a
mismatching ri_RootResultRelInfo.

I do see child relations listed under “Triggers,” but only when they
have trigger events of their own, not when they’re processed as child
relations of a root after a CP_UPDATE event. In those cases,
afterTriggerInvokeEvents() only passes the instrumentation for the
root rel to AfterTriggerExecute(), so the child entries don’t have
their ri_TrigInstrument updated and don’t appear in the EXPLAIN
output.

That's good to know, so it sounds like we're safe from listing any
additional ResultRelInfos there.

I'm happy to go with v6.

David

#13Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#12)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Wed, Oct 22, 2025 at 5:29 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 22 Oct 2025 at 02:49, Amit Langote <amitlangote09@gmail.com> wrote:

Besides the test suite being entirely different in my v5, the only
change from your v4 was rewording the new comment in
ExecGetTriggerResultRel() and moving it into its header comment:

@@ -1337,6 +1337,12 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
* also provides a way for EXPLAIN ANALYZE to report the runtimes of such
* triggers.)  So we make additional ResultRelInfo's as needed, and save them
* in es_trig_target_relations.
+ *
+ * When reusing cached entries from any of these lists, require an exact
+ * match of ri_RootResultRelInfo and the rootRelInfo passed by the caller.
+ * This avoids returning an entry created for a different parent context
+ * and ensures child->parent translation is neither skipped nor applied
+ * incorrectly.  If no exact match exists, build a new ResultRelInfo.
*/
ResultRelInfo *
ExecGetTriggerResultRel(EState *estate, Oid relid,
@@ -1347,14 +1353,6 @@ ExecGetTriggerResultRel(EState *estate, Oid relid,
Relation    rel;
MemoryContext oldcontext;

- /*
- * Before creating a new ResultRelInfo, check if we've already made and
- * cached one for this relation. We must ensure that the given
- * 'rootRelInfo' matches the one stored in the cached ResultRelInfo as
- * trigger handling for partitions can result in mixed requirements for
- * what ri_RootResultRelInfo is set to.
- */
-

Please take it if you think that's a good change. That said, I don't
have any issue with your v6, code or tests.

The reason I like the comment being within the function body is that I
didn't see any need to tell the caller to care about this. As far as
the caller is concerned, the function should return a correctly set up
ResultRelInfo. The inner details of how the caching works feel like
implementation details, and comments about them feel better placed
inside the function body.

Makes sense.

I also checked to see if there were any changes around what's shown in
the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
on whether something could be different there now for some other
cases that cause additional ResultRelInfos to be added to the lists.

Good point.

I made report_triggers() print which list each ResultRelInfo came from
(with the attached). It looks like the entries that could have been
affected are always in es_trig_target_relations anyway, so the change
in ExecGetTriggerResultRel() doesn’t alter what EXPLAIN shows under
“Triggers.” I was still a bit surprised, since I expected there might
now be an extra entry after we stopped reusing a child entry with a
mismatching ri_RootResultRelInfo.

I do see child relations listed under “Triggers,” but only when they
have trigger events of their own, not when they’re processed as child
relations of a root after a CP_UPDATE event. In those cases,
afterTriggerInvokeEvents() only passes the instrumentation for the
root rel to AfterTriggerExecute(), so the child entries don’t have
their ri_TrigInstrument updated and don’t appear in the EXPLAIN
output.

That's good to know, so it sounds like we're safe from listing any
additional ResultRelInfos there.

I think so.

I'm happy to go with v6.

It looks good to me as well.

I also thought about whether the test case could be simplified further
but didn’t come up with anything that still reproduces the failure
cleanly.

I then tried to understand why we end up with an INSERT event on
stages_50_like (queued due to the FK from stages into pipelines)
during the cascaded update on its parent, which creates the rootless
ResultRelInfo in es_trig_target_relations. That same entry was later
reused for the parent’s CP_UPDATE event (queued due to the FK from
builds into stages), leading to the crash. I was curious why we queue
that INSERT event instead of a CP_UPDATE on the parent, since the
latter would have avoided the problem. When I hacked the code to do
that, the crash disappeared -- but that’s not a valid fix, since
FK-side triggers can’t be used with partitioned tables; they assume
heap relations, so queuing CP_UPDATE events for FK-side triggers
doesn’t work in general.

Anyway, I just thought I'd share that here for the record, since it
helped me understand how that rootless ResultRelInfo ends up being
created in the first place.

--
Thanks, Amit Langote

#14David Rowley
dgrowleyml@gmail.com
In reply to: Amit Langote (#13)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Fri, 24 Oct 2025 at 01:08, Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Oct 22, 2025 at 5:29 AM David Rowley <dgrowleyml@gmail.com> wrote:

I'm happy to go with v6.

It looks good to me as well.

Thanks for checking. Pushed.

David

#15Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#14)
Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

On Sun, Oct 26, 2025 at 7:04 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 24 Oct 2025 at 01:08, Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Oct 22, 2025 at 5:29 AM David Rowley <dgrowleyml@gmail.com> wrote:

I'm happy to go with v6.

It looks good to me as well.

Thanks for checking. Pushed.

Thanks David.

--
Thanks, Amit Langote