REPACK CONCURRENTLY fails on tables with generated columns

Started by Ewan Young12 days ago7 messageshackers
Jump to latest
#1Ewan Young
kdbase.hack@gmail.com

Hi,

REPACK (CONCURRENTLY) aborts with an internal error on any table that has
a STORED generated column, if a concurrent UPDATE that requires index
maintenance is applied during the catch-up phase:

ERROR: no generation expression found for column number 3 of table
"pg_temp_16396"
Plain (non-concurrent) REPACK on such a table works fine, and so does
REPACK (CONCURRENTLY) as long as no qualifying concurrent change is
applied -- so the problem is specific to the concurrent-change apply path.

The attached patch adds an isolation test, but here is the manual
sequence (server built with --enable-injection-points):

CREATE EXTENSION injection_points;
CREATE TABLE t (i int PRIMARY KEY, v int,
g int GENERATED ALWAYS AS (v * 10) STORED);
CREATE INDEX ON t (v); -- makes UPDATE of v non-HOT
INSERT INTO t(i, v) VALUES (1, 1);

-- session 1:
SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
REPACK (CONCURRENTLY) t; -- blocks at the injection point

-- session 2, once session 1 is waiting:
UPDATE t SET v = v + 1 WHERE i = 1;
SELECT injection_points_wakeup('repack-concurrently-before-lock');

-- session 1 then fails with the ERROR above.
Without injection points this is a race: the concurrent UPDATE has to be
decoded and applied during catch-up, and it has to be a non-HOT update
(one that goes through index maintenance). It is reliably hit on a busy
table with a generated column.

The transient heap built by make_new_heap() is intentionally created
without the old table's defaults and constraints, so it has no generation
expressions for its generated columns, even though the tuple descriptor
still has attgenerated set.

When apply_concurrent_update() replays a non-HOT update, it calls
ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
generation expression of each generated column via build_column_default()
and errors out when it finds none on the transient heap.

The apply path does not need to recompute generated columns at all: the
decoded tuple already carries the correct value, and it is only inserted.
Note also that ExecGetUpdatedCols() already returns an empty set for this
ResultRelInfo, because it is not part of any range table -- so the
indexUnchanged determination here is already approximate.

Regards,
Ewan Young

Attachments:

v1-0001-Fix-REPACK-CONCURRENTLY-on-tables-with-generated-.patchapplication/octet-stream; name=v1-0001-Fix-REPACK-CONCURRENTLY-on-tables-with-generated-.patchDownload+117-1
#2Antonin Houska
ah@cybertec.at
In reply to: Ewan Young (#1)
Re: REPACK CONCURRENTLY fails on tables with generated columns

Ewan Young <kdbase.hack@gmail.com> wrote:

REPACK (CONCURRENTLY) aborts with an internal error on any table that has
a STORED generated column, if a concurrent UPDATE that requires index
maintenance is applied during the catch-up phase:

ERROR: no generation expression found for column number 3 of table
"pg_temp_16396"
Plain (non-concurrent) REPACK on such a table works fine, and so does
REPACK (CONCURRENTLY) as long as no qualifying concurrent change is
applied -- so the problem is specific to the concurrent-change apply path.

Thanks for the report!

The attached patch adds an isolation test, but here is the manual
sequence (server built with --enable-injection-points):

CREATE EXTENSION injection_points;
CREATE TABLE t (i int PRIMARY KEY, v int,
g int GENERATED ALWAYS AS (v * 10) STORED);
CREATE INDEX ON t (v); -- makes UPDATE of v non-HOT
INSERT INTO t(i, v) VALUES (1, 1);

-- session 1:
SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
REPACK (CONCURRENTLY) t; -- blocks at the injection point

-- session 2, once session 1 is waiting:
UPDATE t SET v = v + 1 WHERE i = 1;
SELECT injection_points_wakeup('repack-concurrently-before-lock');

-- session 1 then fails with the ERROR above.

I confirm I can reproduce it. I'll post a fix next week.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Antonin Houska (#2)
Re: REPACK CONCURRENTLY fails on tables with generated columns

On 2026-Jun-12, Antonin Houska wrote:

I confirm I can reproduce it. I'll post a fix next week.

Didn't you like Ewan's proposed patch?

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Update: super-fast reaction on the Postgres bugs mailing list. The report
was acknowledged [...], and a fix is under discussion.
The wonders of open-source !"
https://twitter.com/gunnarmorling/status/1596080409259003906

#4Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Ewan Young (#1)
Re: REPACK CONCURRENTLY fails on tables with generated columns

Hi Ewan,

On Fri, Jun 12, 2026 at 2:10 PM Ewan Young <kdbase.hack@gmail.com> wrote:

Hi,

REPACK (CONCURRENTLY) aborts with an internal error on any table that has
a STORED generated column, if a concurrent UPDATE that requires index
maintenance is applied during the catch-up phase:

ERROR: no generation expression found for column number 3 of table
"pg_temp_16396"
Plain (non-concurrent) REPACK on such a table works fine, and so does
REPACK (CONCURRENTLY) as long as no qualifying concurrent change is
applied -- so the problem is specific to the concurrent-change apply path.

The attached patch adds an isolation test, but here is the manual
sequence (server built with --enable-injection-points):

CREATE EXTENSION injection_points;
CREATE TABLE t (i int PRIMARY KEY, v int,
g int GENERATED ALWAYS AS (v * 10) STORED);
CREATE INDEX ON t (v); -- makes UPDATE of v non-HOT
INSERT INTO t(i, v) VALUES (1, 1);

-- session 1:
SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
REPACK (CONCURRENTLY) t; -- blocks at the injection point

-- session 2, once session 1 is waiting:
UPDATE t SET v = v + 1 WHERE i = 1;
SELECT injection_points_wakeup('repack-concurrently-before-lock');

-- session 1 then fails with the ERROR above.
Without injection points this is a race: the concurrent UPDATE has to be
decoded and applied during catch-up, and it has to be a non-HOT update
(one that goes through index maintenance). It is reliably hit on a busy
table with a generated column.

I was able to reproduce this.

The transient heap built by make_new_heap() is intentionally created
without the old table's defaults and constraints, so it has no generation
expressions for its generated columns, even though the tuple descriptor
still has attgenerated set.

When apply_concurrent_update() replays a non-HOT update, it calls
ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
generation expression of each generated column via build_column_default()
and errors out when it finds none on the transient heap.

The apply path does not need to recompute generated columns at all: the
decoded tuple already carries the correct value, and it is only inserted.
Note also that ExecGetUpdatedCols() already returns an empty set for this
ResultRelInfo, because it is not part of any range table -- so the
indexUnchanged determination here is already approximate.

makes sense, i have reviewed the patch, it LGTM.

--
Thanks :)
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

#5Ewan Young
kdbase.hack@gmail.com
In reply to: Srinath Reddy Sadipiralla (#4)
Re: REPACK CONCURRENTLY fails on tables with generated columns

Hi Srinath,

Thanks a lot for reproducing it and for the careful review.

On Sun, Jun 14, 2026 at 11:57 AM Srinath Reddy Sadipiralla
<srinath2133@gmail.com> wrote:

Show quoted text

Hi Ewan,

On Fri, Jun 12, 2026 at 2:10 PM Ewan Young <kdbase.hack@gmail.com> wrote:

Hi,

REPACK (CONCURRENTLY) aborts with an internal error on any table that has
a STORED generated column, if a concurrent UPDATE that requires index
maintenance is applied during the catch-up phase:

ERROR: no generation expression found for column number 3 of table
"pg_temp_16396"
Plain (non-concurrent) REPACK on such a table works fine, and so does
REPACK (CONCURRENTLY) as long as no qualifying concurrent change is
applied -- so the problem is specific to the concurrent-change apply path.

The attached patch adds an isolation test, but here is the manual
sequence (server built with --enable-injection-points):

CREATE EXTENSION injection_points;
CREATE TABLE t (i int PRIMARY KEY, v int,
g int GENERATED ALWAYS AS (v * 10) STORED);
CREATE INDEX ON t (v); -- makes UPDATE of v non-HOT
INSERT INTO t(i, v) VALUES (1, 1);

-- session 1:
SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
REPACK (CONCURRENTLY) t; -- blocks at the injection point

-- session 2, once session 1 is waiting:
UPDATE t SET v = v + 1 WHERE i = 1;
SELECT injection_points_wakeup('repack-concurrently-before-lock');

-- session 1 then fails with the ERROR above.
Without injection points this is a race: the concurrent UPDATE has to be
decoded and applied during catch-up, and it has to be a non-HOT update
(one that goes through index maintenance). It is reliably hit on a busy
table with a generated column.

I was able to reproduce this.

The transient heap built by make_new_heap() is intentionally created
without the old table's defaults and constraints, so it has no generation
expressions for its generated columns, even though the tuple descriptor
still has attgenerated set.

When apply_concurrent_update() replays a non-HOT update, it calls
ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
generation expression of each generated column via build_column_default()
and errors out when it finds none on the transient heap.

The apply path does not need to recompute generated columns at all: the
decoded tuple already carries the correct value, and it is only inserted.
Note also that ExecGetUpdatedCols() already returns an empty set for this
ResultRelInfo, because it is not part of any range table -- so the
indexUnchanged determination here is already approximate.

makes sense, i have reviewed the patch, it LGTM.

--
Thanks :)
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

#6Antonin Houska
ah@cybertec.at
In reply to: Ewan Young (#1)
Re: REPACK CONCURRENTLY fails on tables with generated columns

Ewan Young <kdbase.hack@gmail.com> wrote:

The transient heap built by make_new_heap() is intentionally created
without the old table's defaults and constraints, so it has no generation
expressions for its generated columns, even though the tuple descriptor
still has attgenerated set.

When apply_concurrent_update() replays a non-HOT update, it calls
ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
generation expression of each generated column via build_column_default()
and errors out when it finds none on the transient heap.

The apply path does not need to recompute generated columns at all: the
decoded tuple already carries the correct value, and it is only inserted.
Note also that ExecGetUpdatedCols() already returns an empty set for this
ResultRelInfo, because it is not part of any range table -- so the
indexUnchanged determination here is already approximate.

I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() returns an
empty set is an omission rather than deliberate choice. Assuming we fix that,
the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy the
related pg_attrdef entries, as I suggest in this patch.

Another question is how serious problem it is that ExecGetUpdatedCols()
returns empty set. AFAICS, "indexUnchanged" does not affect correctness - it's
is only a hint that helps the btree AM decide whent the bottom-up deletion and
de-duplication techniques should (not) be used. I'm not sure it's easy to
update the set for individual UPDATEs: the UPDATE commands REPACK replays
originate from different SQL queries and the logical decoding does not
transfer this information.

Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return a
set containing all the attributes rather than empty set. That is, avoid using
the btree optimizations altogether rather than allow them them when not
appropriate. However, per index_unchanged_by_update(), if ExecGetUpdatedCols()
tells that all columns are updated, the result of ExecGetExtraUpdatedCols()
does not matter. Nevertheless, I'd still slightly prefer copying the
pg_attrdef entries to hacking the executor.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachments:

0001-Copy-the-relevant-pg_attrdef-catalog-entries-for-the.patchtext/x-diffDownload+99-2
#7Ewan Young
kdbase.hack@gmail.com
In reply to: Antonin Houska (#6)
Re: REPACK CONCURRENTLY fails on tables with generated columns

Hi Antonin,

On Mon, Jun 22, 2026 at 7:12 PM Antonin Houska <ah@cybertec.at> wrote:

Ewan Young <kdbase.hack@gmail.com> wrote:

The transient heap built by make_new_heap() is intentionally created
without the old table's defaults and constraints, so it has no generation
expressions for its generated columns, even though the tuple descriptor
still has attgenerated set.

When apply_concurrent_update() replays a non-HOT update, it calls
ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
generation expression of each generated column via build_column_default()
and errors out when it finds none on the transient heap.

The apply path does not need to recompute generated columns at all: the
decoded tuple already carries the correct value, and it is only inserted.
Note also that ExecGetUpdatedCols() already returns an empty set for this
ResultRelInfo, because it is not part of any range table -- so the
indexUnchanged determination here is already approximate.

I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() returns an
empty set is an omission rather than deliberate choice. Assuming we fix that,
the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy the
related pg_attrdef entries, as I suggest in this patch.

Another question is how serious problem it is that ExecGetUpdatedCols()
returns empty set. AFAICS, "indexUnchanged" does not affect correctness - it's
is only a hint that helps the btree AM decide whent the bottom-up deletion and
de-duplication techniques should (not) be used. I'm not sure it's easy to
update the set for individual UPDATEs: the UPDATE commands REPACK replays
originate from different SQL queries and the logical decoding does not
transfer this information.

Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return a
set containing all the attributes rather than empty set. That is, avoid using
the btree optimizations altogether rather than allow them them when not
appropriate. However, per index_unchanged_by_update(), if ExecGetUpdatedCols()
tells that all columns are updated, the result of ExecGetExtraUpdatedCols()
does not matter. Nevertheless, I'd still slightly prefer copying the
pg_attrdef entries to hacking the executor.

Agreed, thanks for the correction. Relying on the empty ExecGetUpdatedCols()
set was the weak point of my version -- it's an omission, not something to
build a second approximation on. Copying the pg_attrdef entries fixes the
inconsistency at the root, so let's go with your approach.

I applied the patch and ran it through an injection-point reproducer
(cassert). Without the fix the bug reproduces (ERROR: no generation
expression found for column number 3 ...); with it, REPACK CONCURRENTLY
succeeds under a concurrent non-HOT UPDATE for a STORED generated column, an
index directly on the generated column, and a VIRTUAL column, with correct
values afterwards. Your repack.spec change passes.

The approach is right and I've confirmed it fixes the bug, so +1 from me in
this direction.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Regards,
Ewan Young