PG19 FK fast path: OOB write and missed FK checks during batched

Started by Nikolay Samokhvalov, 13 days ago, 14 messages, hackers
#1Nikolay Samokhvalov
samokhvalov@gmail.com

Hi hackers,

The new FK existence-check fast path in ri_triggers.c (ri_FastPath*) runs
user-defined code in the middle of a deferred batch flush, which yields at
least three defects reachable by an unprivileged table owner. Present in
master and verified in REL_19_BETA1.

I identified these issues during recent security research with LLMs. While
they have clear security implications (OOB write, integrity bypass), I am
reporting them here because they are isolated to 19beta1 and absent in PG18
and earlier. I don't have patches, only reproducers.

Mechanism:

For an INSERT/UPDATE on the referencing side the fast path buffers rows in
a transaction-lived cache (ri_fastpath_cache, keyed by pg_constraint OID)
and probes the PK index in groups, flushing when a per-constraint buffer
reaches RI_FASTPATH_BATCH_SIZE (64) or when the trigger-firing pass ends
(ri_FastPathEndBatch, an AfterTriggerBatchCallback). For a cross-type FK
the flush calls the column's cast function (ri_FastPathFlushArray, the
FunctionCall3 at line 3069) and the equality operator -- arbitrary user
code, mid-flush. Line numbers below are from a REL_19_BETA1 build (commit
4b0bf07).

Unprivileged vehicle (defects 1 and 3). No superuser, no contrib: a role
creates a type it owns and an IMPLICIT cast from it to the PK type with a
PL/pgSQL function, which ri_HashCompareOp wires into the fast path's cast
slot. The examples below use a composite type, the default btree opclass,
an ordinary single-column FK, and no GUC (the fast path is unconditional
for non-partitioned, non-temporal FKs, per ri_fastpath_is_applicable).

1) ri_FastPathBatchAdd (line 2859): out-of-bounds write on re-entry

The write precedes the bound check, and batch_count is reset to 0 only at
end of flush (ri_FastPathBatchFlush, line 2971), so it is 64 throughout a
full-batch flush:

fpentry->batch[fpentry->batch_count] = ExecCopySlotHeapTuple(newslot);
fpentry->batch_count++;
if (fpentry->batch_count >= RI_FASTPATH_BATCH_SIZE)
    ri_FastPathBatchFlush(fpentry, fk_rel, riinfo);

There is no re-entrancy guard and ri_FastPathGetEntry returns the same
entry, so user code that does DML on the same table during a full-batch
flush re-enters with batch_count == 64 and writes batch[64], one past the
array, overwriting the adjacent batch_count field (struct layout, lines
250-251). A single re-entrant row only stomps batch_count, which is then
reset to 0 before reuse; the crash manifests once the re-entrant insert is
itself large enough to fill and flush a batch, so the stomped batch_count
is used as an array index (batch[garbage]) and as nvals in
memset(matched, 0, nvals * sizeof(bool)) (line 3054).
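The ordering problem can be modelled outside PostgreSQL. What follows is a minimal sketch, not the ri_triggers.c code: ModelEntry, model_add, model_flush and model_run are hypothetical names, and the "cast" is reduced to a row value that triggers one re-entrant add. The model records the index a re-entrant add would write to instead of performing the stray write, so it shows the off-by-one without reproducing the memory corruption.

```c
#include <assert.h>

#define MODEL_BATCH_SIZE 64     /* mirrors RI_FASTPATH_BATCH_SIZE (64) */

/* Hypothetical stand-in for the fast-path cache entry. */
typedef struct ModelEntry
{
    int batch[MODEL_BATCH_SIZE];
    int batch_count;
    int reentry_row;            /* row whose "cast" re-enters, or -1 */
    int worst_index;            /* highest index any add tried to write */
} ModelEntry;

static void model_add(ModelEntry *e, int row);

/* Flush a full batch; batch_count is reset only at the very end. */
static void
model_flush(ModelEntry *e)
{
    assert(e->batch_count >= MODEL_BATCH_SIZE);
    for (int i = 0; i < MODEL_BATCH_SIZE; i++)
    {
        if (e->reentry_row != -1 && e->batch[i] == e->reentry_row)
        {
            e->reentry_row = -1;    /* re-enter only once */
            model_add(e, 1001);     /* batch_count is still 64 here */
        }
    }
    e->batch_count = 0;
}

/* Same order as the quoted snippet: write, increment, then bound check. */
static void
model_add(ModelEntry *e, int row)
{
    int idx = e->batch_count;

    if (idx > e->worst_index)
        e->worst_index = idx;
    if (idx < MODEL_BATCH_SIZE)     /* the model skips the stray write */
        e->batch[idx] = row;
    e->batch_count++;
    if (e->batch_count >= MODEL_BATCH_SIZE)
        model_flush(e);
}

/* Insert rows 1..64 and report the highest index an add attempted. */
static int
model_run(int reentry_row)
{
    ModelEntry e = {.batch_count = 0, .reentry_row = reentry_row,
                    .worst_index = -1};

    for (int row = 1; row <= MODEL_BATCH_SIZE; row++)
        model_add(&e, row);
    return e.worst_index;
}
```

Calling model_run(64) makes the 64th row's flush re-enter the add, and the attempted index is MODEL_BATCH_SIZE itself; model_run(-1) disables the re-entry and the attempted indexes stay within the array.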

Reproduction (non-superuser; reliable SIGSEGV on --enable-cassert -O0;
under -O2 the out-of-bounds write is of undefined effect):

create table parent(id int primary key);
insert into parent select g from generate_series(1,2000) g;

create type vch as (v int);
create function vcast(vch) returns int language plpgsql as $$
begin
  if $1.v = 64 then
    insert into child select row(g)::vch from generate_series(1001,1064) g;
  end if;
  return $1.v;
end$$;
create cast (vch as int) with function vcast(vch) as implicit;

create table child(a vch);
alter table child add constraint child_fkey
  foreign key (a) references parent(id);

insert into child select row(g)::vch from generate_series(1,64) g; -- crash

-- gdb: crash at ri_FastPathBatchAdd line 2866 with batch_count holding a
-- stomped HeapTuple pointer's low bits, i.e. batch[64] overwrote
-- batch_count; backend SIGSEGVs and the cluster restarts.

2) ri_FastPathSubXactCallback (line 4208): batch dropped on subxact abort

On SUBXACT_EVENT_ABORT_SUB the callback discards the whole cache:

ri_fastpath_cache = NULL;
ri_fastpath_callback_registered = false;

But batch[] holds outstanding rows of the enclosing transaction, not the
aborting subxact. An internal subxact abort during after-trigger firing
(PL/pgSQL BEGIN ... EXCEPTION) drops the buffered rows unflushed; their FK
checks never run and orphans commit behind a constraint that still reports
itself valid. No cast needed:

create table pk(id int primary key);
create table fk(a int, tag text);
insert into pk select g from generate_series(1,10) g;
alter table fk add constraint fk_a_fkey foreign key (a) references pk(id);

create function abort_subxact() returns trigger language plpgsql as $$
begin
  if NEW.tag = 'boom' then
    begin perform 1/0; exception when others then null; end;
  end if;
  return NEW;
end$$;

create trigger fk_after after insert on fk
  for each row execute function abort_subxact();

insert into fk values (999,'bad'),(0,'boom'),(1,'ok'),(2,'ok'),(3,'ok');
-- INSERT 0 5, no error

select f.a from fk f left join pk p on f.a=p.id where p.id is null;
--   a
-- -----
--  999
--    0   (orphans)

-- the constraint still reports itself valid, and re-validation passes
-- while the orphans remain:
select convalidated from pg_constraint where conname = 'fk_a_fkey';
--  convalidated
-- --------------
--  t

alter table fk validate constraint fk_a_fkey;
-- ALTER TABLE (succeeds; does not re-scan committed rows)

select f.a from fk f left join pk p on f.a=p.id where p.id is null;
-- 999, 0 (orphans still present)

Controls (no EXCEPTION; between-statement SAVEPOINT; DEFERRABLE INITIALLY
DEFERRED) all behave correctly (FK violation raised, no orphans). The whole
statement's buffered batch is discarded, not just the aborting row's check.
The abort path also emits "WARNING: resource was not closed" (relation /
index / TupleDesc), a resource leak consistent with the missing flush.
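The effect on the five-row INSERT can be replayed in a small model. This is a sketch under stated assumptions, not PostgreSQL code: SubxCache and the subx_* functions are hypothetical, and the model assumes each row's FK check is buffered before that row's user trigger runs, which is consistent with the two orphans (999 and 0) reported above.

```c
#include <assert.h>
#include <stdbool.h>

#define SUBX_MAX_PENDING 8

/* Hypothetical stand-in for the transaction-lived cache of buffered checks. */
typedef struct SubxCache
{
    int pending[SUBX_MAX_PENDING];
    int npending;
    int checks_run;
} SubxCache;

/* Buffer one row's FK check instead of running it right away. */
static void
subx_buffer(SubxCache *c, int key)
{
    assert(c->npending < SUBX_MAX_PENDING);
    c->pending[c->npending++] = key;
}

/* Subxact abort: either drop everything (reported behaviour) or keep it. */
static void
subx_abort(SubxCache *c, bool discard_all)
{
    if (discard_all)
        c->npending = 0;        /* parent-xact rows vanish unchecked */
}

/* End of the trigger-firing pass: every still-buffered row gets its check. */
static void
subx_end_batch(SubxCache *c)
{
    c->checks_run += c->npending;
    c->npending = 0;
}

/* Replay the five-row INSERT: the second row's trigger aborts a subxact. */
static int
subx_replay(bool discard_all)
{
    static const int keys[] = {999, 0, 1, 2, 3};
    SubxCache c = {.npending = 0, .checks_run = 0};

    for (int i = 0; i < 5; i++)
    {
        subx_buffer(&c, keys[i]);
        if (i == 1)
            subx_abort(&c, discard_all);
    }
    subx_end_batch(&c);
    return c.checks_run;
}
```

With discard_all set, only the three rows buffered after the abort are ever checked; with it cleared, all five are. The model does not say how a fix should keep the parent rows, only what is lost when the whole cache is dropped.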

3) ri_FastPathEndBatch (line 4133): cross-table re-entry drops a check

EndBatch flushes by iterating the cache with hash_seq_search (line 4143).
If flush-time user code INSERTs into a different fast-path FK table,
ri_FastPathGetEntry adds a new cache entry mid-scan; it can land in a
bucket hash_seq_search already passed and is never reached.
ri_FastPathTeardown (line 4165) then hash_destroys the cache (line 4188)
without flushing entries that still have batch_count > 0, so that buffered
check is discarded. This survives a per-entry guard for [1] (different
entry, not a re-entry of the busy one):

create table parent(id int primary key);
insert into parent select g from generate_series(1,64) g;

create table child2(a int);
alter table child2 add constraint child2_fkey
  foreign key (a) references parent(id);

create type vch as (v int);
create function vcast(vch) returns int language plpgsql as $$
begin
  if $1.v = 1 then
    insert into child2 values (999999);  -- orphan into a *different* FK
  end if;
  return $1.v;
end$$;
create cast (vch as int) with function vcast(vch) as implicit;

create table child(a vch);
alter table child add constraint child_fkey
  foreign key (a) references parent(id);

insert into child values (row(1)::vch);  -- flushed at ri_FastPathEndBatch

select a from child2 where a not in (select id from parent);  -- => 999999

-- control: INSERT INTO child2 VALUES (999999); -- correctly raises FK error
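The missed-entry effect does not depend on dynahash internals and can be shown with a toy table. This is a simplified sketch with hypothetical names (ScanTable, scan_pass, scan_run): one entry per bucket, a single front-to-back pass, and flush-time "user code" that buffers one row in another bucket. Real hash_seq_search behaviour is more involved; the sketch only illustrates why a one-pass scan can leave a non-empty entry behind and why looping until nothing is pending does not.

```c
#include <assert.h>
#include <stdbool.h>

#define SCAN_NBUCKETS 8

/* Hypothetical one-entry-per-bucket table; dynahash is far more involved. */
typedef struct ScanTable
{
    bool used[SCAN_NBUCKETS];
    int  pending[SCAN_NBUCKETS];    /* buffered, not yet checked rows */
} ScanTable;

/*
 * One sequential pass.  Flushing the entry in bucket "trigger" runs user
 * code that buffers one row in bucket "target" (a different FK's entry).
 */
static void
scan_pass(ScanTable *t, int trigger, int target, bool *fired)
{
    for (int b = 0; b < SCAN_NBUCKETS; b++)
    {
        if (!t->used[b] || t->pending[b] == 0)
            continue;
        t->pending[b] = 0;          /* "flush" this entry */
        if (b == trigger && !*fired)
        {
            *fired = true;
            t->used[target] = true;
            t->pending[target]++;
        }
    }
}

/* Count rows that are still buffered and therefore unchecked. */
static int
scan_unflushed(const ScanTable *t)
{
    int n = 0;

    for (int b = 0; b < SCAN_NBUCKETS; b++)
        n += t->used[b] ? t->pending[b] : 0;
    return n;
}

/* Returns rows left unchecked after a single pass or a loop-until-empty. */
static int
scan_run(int trigger, int target, bool loop_until_empty)
{
    ScanTable t = {{false}, {0}};
    bool fired = false;

    assert(trigger >= 0 && trigger < SCAN_NBUCKETS);
    assert(target >= 0 && target < SCAN_NBUCKETS && target != trigger);
    t.used[trigger] = true;
    t.pending[trigger] = 1;

    do
    {
        scan_pass(&t, trigger, target, &fired);
    } while (loop_until_empty && scan_unflushed(&t) > 0);

    return scan_unflushed(&t);
}
```

In this model a new entry that lands behind the scan position (bucket 2 while the scan is at bucket 5) is left with one unchecked row after a single pass, whereas one that lands ahead of it (bucket 7) is still reached.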

Root cause / thoughts:

All three stem from invoking user cast/operator code inside a deferred
batch flush: while a per-entry batch is half-updated [1], while a
cache-wide hash_seq_search is in progress and teardown drops non-empty
entries [3], and against a subxact-abort invalidation that cannot tell
parent-xact rows from aborted-subxact rows [2].

- [1] Bound-check before the write in ri_FastPathBatchAdd, and add a
  "flushing" flag to RI_FastPathEntry, rejecting re-entrant modification of
  a busy entry (a nested per-row probe is unsafe: the flush may hold
  PK-index buffer locks).

- [3] Loop-flush in ri_FastPathEndBatch until no entry has batch_count > 0,
  and/or flush non-empty entries in ri_FastPathTeardown before
  hash_destroy.

- [2] Do not discard outstanding parent-xact rows on
  SUBXACT_EVENT_ABORT_SUB; track the buffering subxact, or flush
  immediate-constraint batches at subxact boundaries.

- Unifying: a global "in fast-path flush" guard routing any re-entrant FK
  check to the immediate per-row path, and reconsidering running user code
  mid-flush at all.

Nik

#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Nikolay Samokhvalov (#1)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Sat, Jun 6, 2026 at 17:31 Nikolay Samokhvalov <nik@postgres.ai> wrote:

[...]

Thanks for the detailed report and reproducers. I’ve started looking into
this.

- thanks, Amit

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#2)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Sat, Jun 6, 2026 at 6:13 PM Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for the detailed report and reproducers. I’ve started looking into this.

Continuing to look. Appended this to the open items list:

https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items#Open_Issues

--
Thanks, Amit Langote

#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#3)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Mon, Jun 8, 2026 at 5:18 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Sat, Jun 6, 2026 at 6:13 PM Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for the detailed report and reproducers. I’ve started looking into this.

Continuing to look. Appended this to the open items list:

https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items#Open_Issues

Thanks again, Nik, for the thorough analysis and the reproducers --
they made all three easy to confirm and pin down. Patches attached:
0001 for defect 1, 0002 for defects 2 and 3.

0001 (defect 1): check and flush before writing the row rather than
after, and add a per-entry "flushing" flag so a re-entrant add on the
same entry during a flush takes the per-row path instead of touching
the mid-flush batch. The flag is cleared in a PG_FINALLY, which also
resets batch_count, so the entry stays reusable if a flush error is
caught by a savepoint.

0002 (defects 2 and 3): rather than track subxact membership per row,
confine batching to the top transaction level -- in RI_FKey_check,
when GetCurrentTransactionNestLevel() > 1, use the per-row path. I
went this way because per-entry subxact tracking isn't enough (one
entry's batch can mix rows from several levels, since the cache is
keyed by constraint), and flushing at subxact boundaries doesn't work
for deferred constraints. Once the cache only ever holds top-level
rows, a subxact abort has nothing of its own to discard, so
ri_FastPathSubXactCallback goes away -- that's what fixes your defect
2 reproducer. For defect 3, which is still reachable at the top level,
the same patch adds a cache-wide flag set while ri_FastPathEndBatch
iterates, so a re-entrant check during the scan takes the per-row path
instead of inserting into the cache being scanned.
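The routing described for 0001 and 0002 can be summarised as a single decision. This is only a sketch of the description above, not code from the patches: RoutePath and route_fk_check are hypothetical names, and the real patches may test these conditions in different places (the nest-level test is described as living in RI_FKey_check).

```c
#include <assert.h>
#include <stdbool.h>

typedef enum RoutePath
{
    ROUTE_BATCHED,      /* buffer the row and probe the PK index in groups */
    ROUTE_PER_ROW       /* check this row immediately, no buffering */
} RoutePath;

/*
 * nest_level : what GetCurrentTransactionNestLevel() would report
 * cache_busy : cache-wide flag, set while the end-of-batch scan iterates
 * entry_busy : per-entry "flushing" flag, set while that entry is flushed
 */
static RoutePath
route_fk_check(int nest_level, bool cache_busy, bool entry_busy)
{
    assert(nest_level >= 1);

    if (nest_level > 1)
        return ROUTE_PER_ROW;   /* inside a subtransaction: never batch */
    if (cache_busy)
        return ROUTE_PER_ROW;   /* do not insert into a cache being scanned */
    if (entry_busy)
        return ROUTE_PER_ROW;   /* do not touch a batch that is mid-flush */
    return ROUTE_BATCHED;
}
```

Only a top-level check that arrives while nothing is being flushed ends up in the batch; every re-entrant or subtransaction case falls back to the per-row path.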

The per-row path still bypasses SPI, so these stay well ahead of the
pre-19 check in terms of performance. I'd like to recover batching
across subtransactions properly in v20 but didn't want to rush it now.

On defect 3, can you check whether your reproducer still commits the
orphan with 0002 applied, or whether (like on my build) it now raises
the violation? I'd like to be sure the bucket-placement variation you
hit is actually covered. And of course any review of the patches is
welcome.

--
Thanks, Amit Langote

Attachments:

v1-0001-Fix-out-of-bounds-write-in-RI-fast-path-batch-on-.patch (application/octet-stream, +147 -14)
v1-0002-Confine-RI-fast-path-batching-to-the-top-transact.patch (application/octet-stream, +97 -28)
#5Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Amit Langote (#4)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Tue, Jun 9, 2026 at 6:31 AM Amit Langote <amitlangote09@gmail.com> wrote:

[...]

Hi Amit,

Thanks for the quick fixes.

I checked v1-0001 + v1-0002 against current master (e18b0cb7) with an
assertion/debug build.

- Both apply cleanly to master (in sequence)
- Defect 1 same-FK re-entry no longer crashes; the original shape completes
  and leaves the expected rows
- Defect 2 subtransaction-abort case now raises the FK violation instead of
  committing orphans
- For your defect 3 question: with 0002 applied, my reproducer no longer
  commits the child2 orphan. It raises:

    ERROR: insert or update on table "child2" violates foreign key
    constraint "child2_fkey"
    DETAIL: Key (a)=(999999) is not present in table "parent".

After the error, child2_orphans = 0 and child2 is empty in my run.

I also ran the regression suite in that tree; foreign_key passed, and the
full run reported all 245 tests passed.

So v1 looks good to me for the three reported cases.

Thanks!

Nik

#6Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Nikolay Samokhvalov (#5)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Wed, Jun 10, 2026 at 5:16 PM Nikolay Samokhvalov <nik@postgres.ai> wrote:

[...]

Thanks for checking. I will review them a bit more closely before
committing by Friday. Other reviews are welcome.

--
Thanks, Amit Langote

#7Ayush Tiwari
ayushtiwari.slg01@gmail.com
In reply to: Amit Langote (#6)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

Hi,

On Wed, 10 Jun 2026 at 14:02, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for checking. I will review them a bit more closely before
committing by Friday. Other reviews are welcome.

Thanks for the patch!

I read through v1-0001 and v1-0002 and tried them locally. I had a couple of
things I wanted to ask about.

1. The per-entry "flushing" flag and test coverage. If I'm reading the two
patches together correctly, with both applied the 64-row re-entry test in
0001 reaches the flush through ri_FastPathEndBatch(), where 0002's
cache-wide ri_fastpath_flushing guard already routes the re-entrant check
to the per-row path before it gets back into ri_FastPathBatchAdd(). Does
that mean the per-entry flag from 0001 isn't really exercised by that test
once 0002 is in? As far as I can tell you'd need the flush to fire from
ri_FastPathBatchAdd() itself (a 65th row) to reach it. I tried a 65-row
variant (same FK, re-entrant DML from the cast during the full-batch
flush), including a case where the re-entrant row was an orphan, and it
seemed to do the right thing; the per-row fallback still raised the
violation. Would it be worth switching the test to 65 rows, or adding that
variant, so the per-entry guard is covered too? Or am I missing a path
where the committed test already hits it?

2. Resetting ri_fastpath_flushing. I noticed it's cleared only in the
PG_FINALLY of ri_FastPathEndBatch(), which does seem to cover the cases I
could think of. Since ri_FastPathXactCallback already NULLs
ri_fastpath_cache and clears ri_fastpath_callback_registered at transaction
end, I wondered whether it might be worth clearing ri_fastpath_flushing
there too, just as cheap insurance against some future path that leaves it
set across transactions, though maybe that's unnecessary given the
PG_FINALLY.

Other than the above queries, the patch looks good to me.

Regards,
Ayush

#8Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ayush Tiwari (#7)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

Hi Ayush,

Thanks for the review.

On Wed, Jun 10, 2026 at 7:09 PM Ayush Tiwari
<ayushtiwari.slg01@gmail.com> wrote:

On Wed, 10 Jun 2026 at 14:02, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for checking. I will review them a bit more closely before
committing by Friday. Other reviews are welcome.

Thanks for the patch!

I read through v1-0001 and v1-0002 and tried them locally. I had a couple of
things I wanted to ask about.

1. The per-entry "flushing" flag and test coverage. If I'm reading the two
patches together correctly, with both applied the 64-row re-entry test in 0001
reaches the flush through ri_FastPathEndBatch(), where 0002's cache-wide
ri_fastpath_flushing guard already routes the re-entrant check to the per-row
path before it gets back into ri_FastPathBatchAdd(). Does that mean the
per-entry flag from 0001 isn't really exercised by that test once 0002 is in?
As far as I can tell you'd need the flush to fire from ri_FastPathBatchAdd()
itself (a 65th row) to reach it. I tried a 65-row variant (same FK, re-entrant
DML from the cast during the full-batch flush), including a case where the
re-entrant row was an orphan, and it seemed to do the right thing; the
per-row fallback still raised the violation. Would it be worth switching the
test to 65 rows, or adding that variant, so the per-entry guard is covered too?
Or am I missing a path where the committed test already hits it?

You're right. With 0002 applied, the 64-row test reaches the flush
through ri_FastPathEndBatch(), where the cache-wide
ri_fastpath_flushing guard catches the re-entry before it returns to
ri_FastPathBatchAdd(), so the per-entry flag is no longer exercised by
that test. To hit the per-entry flag the flush has to fire from
ri_FastPathBatchAdd() itself, which the 64-row case no longer does
once the add and flush are reordered.

Rather than bump the test to 65 rows, I'd prefer to keep the flush
firing from ri_FastPathBatchAdd() at 64 by not reordering the add and
flush, and prevent the OOB write by bounds-checking the write instead,
as done in the attached updated 0001. A re-entrant add then can't
overrun the array regardless of the flag, the per-entry flushing guard
still routes the re-entry to the per-row path, and a 64-row statement
flushes from ri_FastPathBatchAdd() on the 64th row, so the existing
test exercises the per-entry guard.
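The shape described for the updated 0001 can be modelled as follows. This is a sketch of the description only, not the patch: GuardEntry, guard_add, guard_flush and guard_run are hypothetical names, the "cast" is reduced to a row value that triggers one re-entrant add, and what the real code does when the bounds check fails is not stated in this thread, so the model simply refuses the add.

```c
#include <assert.h>
#include <stdbool.h>

#define GUARD_BATCH_SIZE 64     /* mirrors RI_FASTPATH_BATCH_SIZE (64) */

/* Hypothetical stand-in for the cache entry plus a few counters. */
typedef struct GuardEntry
{
    int  batch[GUARD_BATCH_SIZE];
    int  batch_count;
    bool flushing;              /* per-entry guard */
    int  reentry_row;           /* row whose "cast" re-enters, or -1 */
    int  batched_checks;
    int  per_row_checks;
} GuardEntry;

static bool guard_add(GuardEntry *e, int row);

/* Flush the batch with the guard set; the "cast" may call guard_add. */
static void
guard_flush(GuardEntry *e)
{
    int n = e->batch_count;

    e->flushing = true;
    for (int i = 0; i < n; i++)
    {
        if (e->reentry_row != -1 && e->batch[i] == e->reentry_row)
        {
            e->reentry_row = -1;
            (void) guard_add(e, 1001);  /* re-entrant add during the flush */
        }
        e->batched_checks++;
    }
    e->batch_count = 0;
    e->flushing = false;
}

/* Add, then flush when full: the original order, plus two safeguards. */
static bool
guard_add(GuardEntry *e, int row)
{
    if (e->flushing)
    {
        e->per_row_checks++;    /* busy entry: check this row on its own */
        return true;
    }
    if (e->batch_count >= GUARD_BATCH_SIZE)
        return false;           /* bounds check: refuse rather than overrun */
    e->batch[e->batch_count++] = row;
    if (e->batch_count >= GUARD_BATCH_SIZE)
        guard_flush(e);         /* still fires on the 64th row */
    return true;
}

/* Insert rows 1..nrows and return the final state for inspection. */
static GuardEntry
guard_run(int nrows, int reentry_row)
{
    GuardEntry e = {.batch_count = 0, .flushing = false,
                    .reentry_row = reentry_row};

    for (int row = 1; row <= nrows; row++)
    {
        bool ok = guard_add(&e, row);

        assert(ok);
        (void) ok;
    }
    return e;
}
```

With 64 rows and the re-entry tied to the 64th, the flush is triggered from guard_add itself, the re-entrant row is counted on the per-row path, and the batch is empty afterwards, which is the coverage property argued for above.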

2. Resetting ri_fastpath_flushing. I noticed it's cleared only in the
PG_FINALLY of ri_FastPathEndBatch(), which does seem to cover the cases I could
think of. Since ri_FastPathXactCallback already NULLs ri_fastpath_cache and
clears ri_fastpath_callback_registered at transaction end, I wondered whether
it might be worth clearing ri_fastpath_flushing there too, just as cheap
insurance against some future path that leaves it set across transactions,
though maybe that's unnecessary given the PG_FINALLY.

Agreed, it's cheap and matches the existing resets there, so I've
added it to ri_FastPathXactCallback() in v2-0002.
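
As an illustration only, the idea of resetting all backend-local fast-path state in one place at transaction end could be sketched as below. The demo_* variables, the DemoXactEvent enum and demo_xact_callback() are hypothetical stand-ins, not the real ri_triggers.c definitions or the real XactCallback signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the static state named in the thread. */
static void *demo_cache = NULL;					/* cf. ri_fastpath_cache */
static bool demo_callback_registered = false;	/* cf. ri_fastpath_callback_registered */
static bool demo_flushing = false;				/* cf. ri_fastpath_flushing */

typedef enum DemoXactEvent
{
	DEMO_XACT_COMMIT,
	DEMO_XACT_ABORT
} DemoXactEvent;

/*
 * Runs at transaction end. Clearing the flushing flag here as well means a
 * flag left set by some unforeseen path cannot leak into the next transaction.
 */
static void
demo_xact_callback(DemoXactEvent event)
{
	(void) event;				/* same cleanup for commit and abort */
	demo_cache = NULL;
	demo_callback_registered = false;
	demo_flushing = false;
}
```

The reset is redundant whenever the PG_FINALLY block has already run, which is why it is described as cheap insurance rather than a fix.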

Other than the above queries, the patch looks good to me.

Updated patches attached.

--
Thanks, Amit Langote

Attachments:

v2-0001-Fix-out-of-bounds-write-in-RI-fast-path-batch-on-.patch (application/octet-stream, +155 -18)
v2-0002-Confine-RI-fast-path-batching-to-the-top-transact.patch (application/octet-stream, +103 -27)

#9Ayush Tiwari
ayushtiwari.slg01@gmail.com
In reply to: Amit Langote (#8)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

Hi,

On Wed, 10 Jun 2026 at 17:47, Amit Langote <amitlangote09@gmail.com> wrote:

Hi Ayush,

Thanks for the review.

On Wed, Jun 10, 2026 at 7:09 PM Ayush Tiwari
<ayushtiwari.slg01@gmail.com> wrote:

On Wed, 10 Jun 2026 at 14:02, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for checking. I will review them a bit more closely before
committing by Friday. Other reviews are welcome.

Thanks for the patch!

I read through v1-0001 and v1-0002 and tried them locally. I had a couple of
things I wanted to ask about.

1. The per-entry "flushing" flag and test coverage. If I'm reading the two
patches together correctly, with both applied the 64-row re-entry test in 0001
reaches the flush through ri_FastPathEndBatch(), where 0002's cache-wide
ri_fastpath_flushing guard already routes the re-entrant check to the per-row
path before it gets back into ri_FastPathBatchAdd(). Does that mean the
per-entry flag from 0001 isn't really exercised by that test once 0002 is in?
As far as I can tell you'd need the flush to fire from ri_FastPathBatchAdd()
itself (a 65th row) to reach it. I tried a 65-row variant (same FK, re-entrant
DML from the cast during the full-batch flush), including a case where the
re-entrant row was an orphan, and it seemed to do the right thing; the
per-row fallback still raised the violation. Would it be worth switching the
test to 65 rows, or adding that variant, so the per-entry guard is covered too?
Or am I missing a path where the committed test already hits it?

You're right. With 0002 applied, the 64-row test reaches the flush
through ri_FastPathEndBatch(), where the cache-wide
ri_fastpath_flushing guard catches the re-entry before it returns to
ri_FastPathBatchAdd(), so the per-entry flag is no longer exercised by
that test. To hit the per-entry flag the flush has to fire from
ri_FastPathBatchAdd() itself, which the 64-row case no longer does
once the add and flush are reordered.

Rather than bump the test to 65 rows, I'd prefer to keep the flush
firing from ri_FastPathBatchAdd() at 64 by not reordering the add and
flush, and prevent the OOB write by bounds-checking the write instead,
as done in the attached updated 0001. A re-entrant add then can't
overrun the array regardless of the flag, the per-entry flushing guard
still routes the re-entry to the per-row path, and a 64-row statement
flushes from ri_FastPathBatchAdd() on the 64th row, so the existing
test exercises the per-entry guard.

Makes sense, it is better.

2. Resetting ri_fastpath_flushing. I noticed it's cleared only in the
PG_FINALLY of ri_FastPathEndBatch(), which does seem to cover the cases I could
think of. Since ri_FastPathXactCallback already NULLs ri_fastpath_cache and
clears ri_fastpath_callback_registered at transaction end, I wondered whether
it might be worth clearing ri_fastpath_flushing there too, just as cheap
insurance against some future path that leaves it set across transactions,
though maybe that's unnecessary given the PG_FINALLY.

Agreed, it's cheap and matches the existing resets there, so I've
added it to ri_FastPathXactCallback() in v2-0002.

Other than the above queries, the patch looks good to me.

Updated patches attached.

Thanks for the updated patches!

Both patches, lgtm.

Regards,
Ayush

#10Junwang Zhao
zhjwpku@gmail.com
In reply to: Amit Langote (#8)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

Hi Amit,

On Wed, Jun 10, 2026 at 8:17 PM Amit Langote <amitlangote09@gmail.com> wrote:

Hi Ayush,

Thanks for the review.

On Wed, Jun 10, 2026 at 7:09 PM Ayush Tiwari
<ayushtiwari.slg01@gmail.com> wrote:

On Wed, 10 Jun 2026 at 14:02, Amit Langote <amitlangote09@gmail.com> wrote

Thanks for checking. I will review them a bit more closely before
committing by Friday. Other reviews are welcome.

Thanks for the patch!

I read through v1-0001 and v1-0002 and tried them locally. I had a couple of
things I wanted to ask about.

1. The per-entry "flushing" flag and test coverage. If I'm reading the two
patches together correctly, with both applied the 64-row re-entry test in 0001
reaches the flush through ri_FastPathEndBatch(), where 0002's cache-wide
ri_fastpath_flushing guard already routes the re-entrant check to the per-row
path before it gets back into ri_FastPathBatchAdd(). Does that mean the
per-entry flag from 0001 isn't really exercised by that test once 0002 is in?
As far as I can tell you'd need the flush to fire from ri_FastPathBatchAdd()
itself (a 65th row) to reach it. I tried a 65-row variant (same FK, re-entrant
DML from the cast during the full-batch flush), including a case where the
re-entrant row was an orphan, and it seemed to do the right thing; the
per-row fallback still raised the violation. Would it be worth switching the
test to 65 rows, or adding that variant, so the per-entry guard is covered too?
Or am I missing a path where the committed test already hits it?

You're right. With 0002 applied, the 64-row test reaches the flush
through ri_FastPathEndBatch(), where the cache-wide
ri_fastpath_flushing guard catches the re-entry before it returns to
ri_FastPathBatchAdd(), so the per-entry flag is no longer exercised by
that test. To hit the per-entry flag the flush has to fire from
ri_FastPathBatchAdd() itself, which the 64-row case no longer does
once the add and flush are reordered.

Rather than bump the test to 65 rows, I'd prefer to keep the flush
firing from ri_FastPathBatchAdd() at 64 by not reordering the add and
flush, and prevent the OOB write by bounds-checking the write instead,
as done in the attached updated 0001. A re-entrant add then can't
overrun the array regardless of the flag, the per-entry flushing guard
still routes the re-entry to the per-row path, and a 64-row statement
flushes from ri_FastPathBatchAdd() on the 64th row, so the existing
test exercises the per-entry guard.

2. Resetting ri_fastpath_flushing. I noticed it's cleared only in the
PG_FINALLY of ri_FastPathEndBatch(), which does seem to cover the cases I could
think of. Since ri_FastPathXactCallback already NULLs ri_fastpath_cache and
clears ri_fastpath_callback_registered at transaction end, I wondered whether
it might be worth clearing ri_fastpath_flushing there too, just as cheap
insurance against some future path that leaves it set across transactions,
though maybe that's unnecessary given the PG_FINALLY.

Agreed, it's cheap and matches the existing resets there, so I've
added it to ri_FastPathXactCallback() in v2-0002.

Other than the above queries, the patch looks good to me.

Updated patches attached.

I only reviewed and applied patch 0001 on my local machine, and it
successfully fixed the crash.

One minor comment:

+ if (fpentry->flushing)
+ {
+ ri_FastPathCheck(riinfo, fk_rel, newslot);
+ return;
+ }

Would it be worth wrapping the condition with unlikely()? It seems
this branch is expected to be false in most cases, not a strong
opinion though.
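
For context, unlikely() is a branch-prediction hint and does not change what the code does. A minimal standalone sketch of how such a macro is commonly defined on GCC-compatible compilers follows; demo_unlikely() and demo_route() are hypothetical names for illustration, and the fallback branch for other compilers is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Tell the compiler the condition is expected to be false, so the common
 * path can be laid out as the fall-through. Without the builtin, the macro
 * degrades to a plain truth test.
 */
#if defined(__GNUC__) || defined(__clang__)
#define demo_unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define demo_unlikely(x) ((x) != 0)
#endif

/* Returns 1 for the rare per-row route, 0 for the common buffered route. */
static int
demo_route(bool flushing)
{
	if (demo_unlikely(flushing))
		return 1;				/* rare: re-entry during a flush */
	return 0;					/* common: buffer the row */
}
```

The result is the same with or without the hint; only the generated code layout may differ.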

--
Regards
Junwang Zhao

#11Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Junwang Zhao (#10)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Thu, Jun 11, 2026 at 5:18 PM Junwang Zhao <zhjwpku@gmail.com> wrote:

I only reviewed and applied patch 0001 on my local machine, and it
successfully fixed the crash.

One minor comment:

+ if (fpentry->flushing)
+ {
+ ri_FastPathCheck(riinfo, fk_rel, newslot);
+ return;
+ }

Would it be worth wrapping the condition with unlikely()? It seems
this branch is expected to be false in most cases, not a strong
opinion though.

Good idea. Will do.

Are you planning to look at 0002?

--
Thanks, Amit Langote

#12Junwang Zhao
zhjwpku@gmail.com
In reply to: Amit Langote (#11)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

Hi Amit,

On Thu, Jun 11, 2026 at 5:05 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Thu, Jun 11, 2026 at 5:18 PM Junwang Zhao <zhjwpku@gmail.com> wrote:

I only reviewed and applied patch 0001 on my local machine, and it
successfully fixed the crash.

One minor comment:

+ if (fpentry->flushing)
+ {
+ ri_FastPathCheck(riinfo, fk_rel, newslot);
+ return;
+ }

Would it be worth wrapping the condition with unlikely()? It seems
this branch is expected to be false in most cases, not a strong
opinion though.

Good idea. Will do.

Are you planning to look at 0002?

I just applied 0002 and ran the regression successfully.

I have one trivial comment: subXact abort doesn't NULL
ri_fastpath_cache, so I think the following comment on
RI_FastPathEntry should be updated accordingly by removing
`SubXactCallback`.

* ri_FastPathEndBatch(); on abort, ResourceOwner releases the cached
* relations and the XactCallback/SubXactCallback NULL the static cache pointer
* to prevent any subsequent access.

--
Regards
Junwang Zhao

#13Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Junwang Zhao (#12)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Thu, Jun 11, 2026 at 6:51 PM Junwang Zhao <zhjwpku@gmail.com> wrote:

On Thu, Jun 11, 2026 at 5:05 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Thu, Jun 11, 2026 at 5:18 PM Junwang Zhao <zhjwpku@gmail.com> wrote:

I only reviewed and applied patch 0001 on my local machine, and it
successfully fixed the crash.

One minor comment:

+ if (fpentry->flushing)
+ {
+ ri_FastPathCheck(riinfo, fk_rel, newslot);
+ return;
+ }

Would it be worth wrapping the condition with unlikely()? It seems
this branch is expected to be false in most cases, not a strong
opinion though.

Good idea. Will do.

Are you planning to look at 0002?

I just applied 0002 and ran the regression successfully.

I have one trivial comment: subXact abort doesn't NULL
ri_fastpath_cache, so I think the following comment on
RI_FastPathEntry should be updated accordingly by removing
`SubXactCallback`.

* ri_FastPathEndBatch(); on abort, ResourceOwner releases the cached
* relations and the XactCallback/SubXactCallback NULL the static cache pointer
* to prevent any subsequent access.

Thanks for the review. Yes, I missed that.

I've updated the patches to address your comments and did some other polishing.

--
Thanks, Amit Langote

Attachments:

v3-0002-Confine-RI-fast-path-batching-to-the-top-transact.patch (application/x-patch, +105 -29)
v3-0001-Fix-out-of-bounds-write-in-RI-fast-path-batch-on-.patch (application/x-patch, +165 -18)

#14Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#13)
Re: PG19 FK fast path: OOB write and missed FK checks during batched

On Thu, Jun 11, 2026 at 7:47 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Thu, Jun 11, 2026 at 6:51 PM Junwang Zhao <zhjwpku@gmail.com> wrote:

On Thu, Jun 11, 2026 at 5:05 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Thu, Jun 11, 2026 at 5:18 PM Junwang Zhao <zhjwpku@gmail.com> wrote:

I only reviewed and applied patch 0001 on my local machine, and it
successfully fixed the crash.

One minor comment:

+ if (fpentry->flushing)
+ {
+ ri_FastPathCheck(riinfo, fk_rel, newslot);
+ return;
+ }

Would it be worth wrapping the condition with unlikely()? It seems
this branch is expected to be false in most cases, not a strong
opinion though.

Good idea. Will do.

Are you planning to look at 0002?

I just applied 0002 and ran the regression successfully.

I have one trivial comment: subXact abort doesn't NULL
ri_fastpath_cache, so I think the following comment on
RI_FastPathEntry should be updated accordingly by removing
`SubXactCallback`.

* ri_FastPathEndBatch(); on abort, ResourceOwner releases the cached
* relations and the XactCallback/SubXactCallback NULL the static cache pointer
* to prevent any subsequent access.

Thanks for the review. Yes, I missed that.

I've updated the patches to address your comments and did some other polishing.

I've pushed these now. Thank you everyone.

--
Thanks, Amit Langote