Large expressions in indexes can't be stored (non-TOASTable)
Hi,
I ran into an issue (previously discussed[1]/messages/by-id/84ddff04-f122-784b-b6c5-3536804495f8@joeconway.com; quoting Andres out of
context that not addressing it then would "[a]ll but guarantee that
we'll have this discussion again"[2]/messages/by-id/20180720000356.5zkhvfpsqswngyob@alap3.anarazel.de) when trying to build a very large
expression index that did not fit within the page boundary. The
real-world use case was related to a vector search technique where I
wanted to use binary quantization based on the relationship between a
constant vector (the average at a point-in-time across the entire data
set) and the target vector[3]https://github.com/pgvector/pgvector[4]https://jkatz05.com/post/postgres/pgvector-scalar-binary-quantization/. An example:
CREATE INDEX ON embeddings
USING hnsw((quantization_func(embedding, $VECTOR)) bit_hamming_ops);
However, I ran into the issue in[1]/messages/by-id/84ddff04-f122-784b-b6c5-3536804495f8@joeconway.com, where pg_index was identified as
catalog that is missing a toast table, even though `indexprs` is marked
for extended storage. Providing a very simple reproducer in psql below:
----
CREATE TABLE def (id int);
SELECT array_agg(n) b FROM generate_series(1,10_000) n \gset
CREATE OR REPLACE FUNCTION vec_quantizer (a int, b int[]) RETURNS bool
AS $$ SELECT true $$ LANGUAGE SQL IMMUTABLE;
CREATE INDEX ON def (vec_quantizer(id, :'b'));
ERROR: row is too big: size 29448, maximum size 8160
---
This can come up with vector searches as vectors can be quite large -
the case I was testing involved a 1536-dim floating point vector (~6KB),
and the node parse tree pushed past the page boundary by about 2KB.
One could argue that pgvector or an extension can build in capabilities
to handle quantization internally without requiring the user to provide
a source vector (pgvectorscale does this). However, this also limits
flexibility to users, as they may want to bring their own quantization
functions to vector searches, e.g., as different quantization techniques
emerge, or if a particular technique is more suitable for a person's
dataset.
Thanks,
Jonathan
[1]: /messages/by-id/84ddff04-f122-784b-b6c5-3536804495f8@joeconway.com
/messages/by-id/84ddff04-f122-784b-b6c5-3536804495f8@joeconway.com
[2]: /messages/by-id/20180720000356.5zkhvfpsqswngyob@alap3.anarazel.de
/messages/by-id/20180720000356.5zkhvfpsqswngyob@alap3.anarazel.de
[3]: https://github.com/pgvector/pgvector
[4]: https://jkatz05.com/post/postgres/pgvector-scalar-binary-quantization/
On Tue, Sep 03, 2024 at 12:35:42PM -0400, Jonathan S. Katz wrote:
However, I ran into the issue in[1], where pg_index was identified as
catalog that is missing a toast table, even though `indexprs` is marked for
extended storage. Providing a very simple reproducer in psql below:
Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST
tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
pg_largeobject_metadata. I've attached a short patch to add one for
pg_index, which resolves the issue cited here. This passes "check-world"
and didn't fail for a few ad hoc tests (e.g., VACUUM FULL on pg_index). I
haven't spent too much time investigating possible circularity issues, but
I'll note that none of the system indexes presently use the indexprs and
indpred columns.
If we do want to proceed with adding a TOAST table to pg_index, IMHO it
would be better to do it sooner than later so that it has plenty of time to
bake.
--
nathan
Attachments:
v1-0001-add-toast-table-to-pg_index.patchtext/plain; charset=us-asciiDownload
From 301aeeb346c3499e4555e4db1f4fc124f9cbab62 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 4 Sep 2024 13:28:27 -0500
Subject: [PATCH v1 1/1] add toast table to pg_index
---
src/include/catalog/pg_index.h | 2 ++
src/test/regress/expected/misc_sanity.out | 4 +---
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/include/catalog/pg_index.h b/src/include/catalog/pg_index.h
index 3462572eb5..6c89639a9e 100644
--- a/src/include/catalog/pg_index.h
+++ b/src/include/catalog/pg_index.h
@@ -69,6 +69,8 @@ CATALOG(pg_index,2610,IndexRelationId) BKI_SCHEMA_MACRO
*/
typedef FormData_pg_index *Form_pg_index;
+DECLARE_TOAST_WITH_MACRO(pg_index, 8149, 8150, PgIndexToastTable, PgIndexToastIndex);
+
DECLARE_INDEX(pg_index_indrelid_index, 2678, IndexIndrelidIndexId, pg_index, btree(indrelid oid_ops));
DECLARE_UNIQUE_INDEX_PKEY(pg_index_indexrelid_index, 2679, IndexRelidIndexId, pg_index, btree(indexrelid oid_ops));
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index ad88cbd5c4..2152c65810 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -56,11 +56,9 @@ ORDER BY 1, 2;
pg_class | relacl | aclitem[]
pg_class | reloptions | text[]
pg_class | relpartbound | pg_node_tree
- pg_index | indexprs | pg_node_tree
- pg_index | indpred | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(11 rows)
+(9 rows)
-- system catalogs without primary keys
--
--
2.39.3 (Apple Git-146)
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST
tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
pg_largeobject_metadata. I've attached a short patch to add one for
pg_index, which resolves the issue cited here. This passes "check-world"
and didn't fail for a few ad hoc tests (e.g., VACUUM FULL on pg_index). I
haven't spent too much time investigating possible circularity issues, but
I'll note that none of the system indexes presently use the indexprs and
indpred columns.
Yeah, the possibility of circularity seems like the main hazard, but
I agree it's unlikely that the entries for system indexes could ever
need out-of-line storage. There are many other things that would have
to be improved before a system index could use indexprs or indpred.
regards, tom lane
On 9/4/24 3:08 PM, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST
tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
pg_largeobject_metadata. I've attached a short patch to add one for
pg_index, which resolves the issue cited here. This passes "check-world"
and didn't fail for a few ad hoc tests (e.g., VACUUM FULL on pg_index). I
haven't spent too much time investigating possible circularity issues, but
I'll note that none of the system indexes presently use the indexprs and
indpred columns.Yeah, the possibility of circularity seems like the main hazard, but
I agree it's unlikely that the entries for system indexes could ever
need out-of-line storage. There are many other things that would have
to be improved before a system index could use indexprs or indpred.
Agreed on the unlikeliness of that, certainly in the short-to-mid term.
The impetus driving this is dealing with a data type that can be quite
large, and it's unlikely system catalogs will be dealing with anything
of that nature, or requiring very long expressions that couldn't be
encapsulated in a different way.
Just to be fair, in the case I presented there's an argument that what
I'm trying to do is fairly inefficient for an expression, given I'm
passing around an additional several KB payload into the query. However,
we'd likely have to do that anyway for this problem space, and the
overall performance hit is negligible compared to the search relevancy
boost.
I'm working on a much more robust test, but using a known 10MM 768-dim
dataset and two C-based quantization functions (one using the
expression), I got a 3% relevancy boost with a 2% reduction in latency
and throughput. On some other known datasets, I was able to improve
relevancy 40% or more, though given they were initially returning with
0% relevancy in some cases, it's not fair to compare performance numbers.
There are other ways to solve the problem as well, but allowing for the
larger expression gives users more choices in how they can approach it.
Jonathan
On Wed, Sep 04, 2024 at 03:20:33PM -0400, Jonathan S. Katz wrote:
On 9/4/24 3:08 PM, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST
tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
pg_largeobject_metadata. I've attached a short patch to add one for
pg_index, which resolves the issue cited here. This passes "check-world"
and didn't fail for a few ad hoc tests (e.g., VACUUM FULL on pg_index). I
haven't spent too much time investigating possible circularity issues, but
I'll note that none of the system indexes presently use the indexprs and
indpred columns.Yeah, the possibility of circularity seems like the main hazard, but
I agree it's unlikely that the entries for system indexes could ever
need out-of-line storage. There are many other things that would have
to be improved before a system index could use indexprs or indpred.Agreed on the unlikeliness of that, certainly in the short-to-mid term. The
impetus driving this is dealing with a data type that can be quite large,
and it's unlikely system catalogs will be dealing with anything of that
nature, or requiring very long expressions that couldn't be encapsulated in
a different way.
Any objections to committing this? I've still been unable to identify any
breakage, and adding it now would give us ~1 year of testing before it'd be
available in a GA release. Perhaps we should at least add something to
misc_sanity.sql that verifies no system indexes are using pg_index's TOAST
table.
--
nathan
Nathan Bossart <nathandbossart@gmail.com> writes:
On Wed, Sep 04, 2024 at 03:20:33PM -0400, Jonathan S. Katz wrote:
On 9/4/24 3:08 PM, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST
tables: pg_attribute, pg_class, pg_index, pg_largeobject, and
pg_largeobject_metadata. I've attached a short patch to add one for
pg_index, which resolves the issue cited here.
Any objections to committing this?
Nope.
regards, tom lane
On Wed, Sep 18, 2024 at 10:54:56AM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Any objections to committing this?
Nope.
Committed. I waffled on whether to add a test for system indexes that used
pg_index's varlena columns, but I ended up leaving it out. I've attached
it here in case anyone thinks we should add it.
--
nathan
Attachments:
test.patchtext/plain; charset=us-asciiDownload
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index 45e7492877..2af66cf648 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -60,6 +60,31 @@ ORDER BY 1, 2;
pg_largeobject_metadata | lomacl | aclitem[]
(9 rows)
+-- **************** pg_index ****************
+-- Look for system indexes that use varlena columns in pg_index. There
+-- shouldn't be any to avoid circularity hazards with pg_index's TOAST table.
+--
+-- The first query returns the varlena columns in pg_index. Each should be
+-- listed in the second query's WHERE clause.
+SELECT attname
+FROM pg_attribute
+WHERE attstorage != 'p' AND
+ attrelid = 'pg_catalog.pg_index'::regclass
+ORDER BY 1;
+ attname
+----------
+ indexprs
+ indpred
+(2 rows)
+
+SELECT relname
+FROM pg_class c JOIN pg_index i ON indexrelid = c.oid
+WHERE c.oid < 16384 AND
+ (i.indexprs IS NOT NULL OR i.indpred IS NOT NULL);
+ relname
+---------
+(0 rows)
+
-- system catalogs without primary keys
--
-- Current exceptions:
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 16f3a7c0c1..3e87c50b84 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -53,6 +53,26 @@ WHERE c.oid < 16384 AND
ORDER BY 1, 2;
+-- **************** pg_index ****************
+
+-- Look for system indexes that use varlena columns in pg_index. There
+-- shouldn't be any to avoid circularity hazards with pg_index's TOAST table.
+--
+-- The first query returns the varlena columns in pg_index. Each should be
+-- listed in the second query's WHERE clause.
+
+SELECT attname
+FROM pg_attribute
+WHERE attstorage != 'p' AND
+ attrelid = 'pg_catalog.pg_index'::regclass
+ORDER BY 1;
+
+SELECT relname
+FROM pg_class c JOIN pg_index i ON indexrelid = c.oid
+WHERE c.oid < 16384 AND
+ (i.indexprs IS NOT NULL OR i.indpred IS NOT NULL);
+
+
-- system catalogs without primary keys
--
-- Current exceptions:
Hello Nathan,
18.09.2024 22:52, Nathan Bossart wrote:
Committed. I waffled on whether to add a test for system indexes that used
pg_index's varlena columns, but I ended up leaving it out. I've attached
it here in case anyone thinks we should add it.
I've discovered that Jonathan's initial script:
CREATE TABLE def (id int);
SELECT array_agg(n) b FROM generate_series(1,10_000) n \gset
CREATE OR REPLACE FUNCTION vec_quantizer (a int, b int[]) RETURNS bool
AS $$ SELECT true $$ LANGUAGE SQL IMMUTABLE;
CREATE INDEX ON def (vec_quantizer(id, :'b'));
completed with:
DROP INDEX CONCURRENTLY def_vec_quantizer_idx;
triggers an assertion failure:
TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: "toast_internals.c", Line: 668, PID: 3723372
with the following stack trace:
ExceptionalCondition at assert.c:52:13
init_toast_snapshot at toast_internals.c:670:2
toast_delete_datum at toast_internals.c:429:60
toast_tuple_cleanup at toast_helper.c:303:30
heap_toast_insert_or_update at heaptoast.c:335:9
heap_update at heapam.c:3752:14
simple_heap_update at heapam.c:4210:11
CatalogTupleUpdate at indexing.c:324:2
index_set_state_flags at index.c:3522:2
index_concurrently_set_dead at index.c:1848:2
index_drop at index.c:2286:3
doDeletion at dependency.c:1362:5
deleteOneObject at dependency.c:1279:12
deleteObjectsInList at dependency.c:229:3
performMultipleDeletions at dependency.c:393:2
RemoveRelations at tablecmds.c:1594:2
ExecDropStmt at utility.c:2008:4
...
This class of assert failures is not new, see e. g., bugs #13809, #18127,
but this concrete instance (with index_set_state_flags()) emerged with
b52c4fc3c and may be worth fixing while on it...
Best regards,
Alexander
On Thu, Sep 19, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote:
I've discovered that Jonathan's initial script:
CREATE TABLE def (id int);
SELECT array_agg(n) b FROM generate_series(1,10_000) n \gset
CREATE OR REPLACE FUNCTION vec_quantizer (a int, b int[]) RETURNS bool
AS $$ SELECT true $$ LANGUAGE SQL IMMUTABLE;
CREATE INDEX ON def (vec_quantizer(id, :'b'));completed with:
DROP INDEX CONCURRENTLY def_vec_quantizer_idx;triggers an assertion failure:
TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: "toast_internals.c", Line: 668, PID: 3723372
Ha, that was fast. The attached patch seems to fix the assertion failures.
It's probably worth checking if any of the adjacent code paths are
affected, too.
--
nathan
Attachments:
fix_assert.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index b2b3ecb524..2e378ef4ef 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2255,6 +2255,7 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* Now we must wait until no running transaction could be using the
@@ -2283,8 +2284,10 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
* Again, commit the transaction to make the pg_index update visible
* to other sessions.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* Wait till every transaction that saw the old index state has
@@ -2387,6 +2390,8 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
{
UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
UnlockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);
+
+ PopActiveSnapshot();
}
}
On Thu, Sep 19, 2024 at 01:36:36PM -0500, Nathan Bossart wrote:
+ PushActiveSnapshot(GetTransactionSnapshot());
/* * Now we must wait until no running transaction could be using the @@ -2283,8 +2284,10 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) * Again, commit the transaction to make the pg_index update visible * to other sessions. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); + PushActiveSnapshot(GetTransactionSnapshot());/*
* Wait till every transaction that saw the old index state has
@@ -2387,6 +2390,8 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
{
UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
UnlockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);
+
+ PopActiveSnapshot();
}
}
Perhaps the reason why these snapshots are pushed should be documented
with a comment?
--
Michael
Hello Nathan,
19.09.2024 21:36, Nathan Bossart wrote:
On Thu, Sep 19, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote:
completed with:
DROP INDEX CONCURRENTLY def_vec_quantizer_idx;triggers an assertion failure:
TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: "toast_internals.c", Line: 668, PID: 3723372Ha, that was fast. The attached patch seems to fix the assertion failures.
It's probably worth checking if any of the adjacent code paths are
affected, too.
Thank you for your attention to that issue!
I've found another two paths to reach that condition:
CREATE INDEX CONCURRENTLY ON def (vec_quantizer(id, :'b'));
ERROR: cannot fetch toast data without an active snapshot
REINDEX INDEX CONCURRENTLY def_vec_quantizer_idx;
(or REINDEX TABLE CONCURRENTLY def;)
TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: "toast_internals.c", Line: 668, PID: 2934502
ExceptionalCondition at assert.c:52:13
init_toast_snapshot at toast_internals.c:670:2
toast_delete_datum at toast_internals.c:429:60
toast_tuple_cleanup at toast_helper.c:303:30
heap_toast_insert_or_update at heaptoast.c:335:9
heap_update at heapam.c:3752:14
simple_heap_update at heapam.c:4210:11
CatalogTupleUpdate at indexing.c:324:2
index_concurrently_swap at index.c:1649:2
ReindexRelationConcurrently at indexcmds.c:4270:3
ReindexIndex at indexcmds.c:2962:1
ExecReindex at indexcmds.c:2884:4
ProcessUtilitySlow at utility.c:1570:22
...
Perhaps it would make sense to check all CatalogTupleUpdate(pg_index, ...)
calls (I've found 10 such instances, but haven't checked them yet).
Best regards,
Alexander
On Fri, Sep 20, 2024 at 08:16:24AM +0900, Michael Paquier wrote:
Perhaps the reason why these snapshots are pushed should be documented
with a comment?
Definitely. I'll add those once we are more confident that we've
identified all the bugs.
--
nathan
On Fri, Sep 20, 2024 at 07:00:00AM +0300, Alexander Lakhin wrote:
I've found another two paths to reach that condition:
CREATE INDEX CONCURRENTLY ON def (vec_quantizer(id, :'b'));
ERROR:� cannot fetch toast data without an active snapshotREINDEX INDEX CONCURRENTLY def_vec_quantizer_idx;
(or REINDEX TABLE CONCURRENTLY def;)
TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: "toast_internals.c", Line: 668, PID: 2934502
Here's a (probably naive) attempt at fixing these, too. I'll give each
path a closer look once it feels like we've identified all the bugs.
Perhaps it would make sense to check all CatalogTupleUpdate(pg_index, ...)
calls (I've found 10 such instances, but haven't checked them yet).
Indeed.
--
nathan
Attachments:
v2-0001-fix-failed-assertions-due-to-pg_index-s-TOAST-tab.patchtext/plain; charset=us-asciiDownload
From b7432c42c0cea9c1aadba7c72f9ce2ba6e6407d2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 20 Sep 2024 11:48:52 -0500
Subject: [PATCH v2 1/1] fix failed assertions due to pg_index's TOAST table
---
src/backend/catalog/index.c | 5 +++++
src/backend/commands/indexcmds.c | 9 +++++++++
2 files changed, 14 insertions(+)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index b2b3ecb524..2e378ef4ef 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2255,6 +2255,7 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* Now we must wait until no running transaction could be using the
@@ -2283,8 +2284,10 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
* Again, commit the transaction to make the pg_index update visible
* to other sessions.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* Wait till every transaction that saw the old index state has
@@ -2387,6 +2390,8 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
{
UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
UnlockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);
+
+ PopActiveSnapshot();
}
}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f99c2d2dee..36318c81ea 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1798,11 +1798,15 @@ DefineIndex(Oid tableId,
PROGRESS_CREATEIDX_PHASE_WAIT_3);
WaitForOlderSnapshots(limitXmin, true);
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Index can now be marked valid -- update its pg_index entry
*/
index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID);
+ PopActiveSnapshot();
+
/*
* The pg_index update will cause backends (including this one) to update
* relcache entries for the index itself, but we should also send a
@@ -4236,6 +4240,8 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
*/
set_indexsafe_procflags();
+ PushActiveSnapshot(GetTransactionSnapshot());
+
forboth(lc, indexIds, lc2, newIndexIds)
{
ReindexIndexInfo *oldidx = lfirst(lc);
@@ -4280,8 +4286,10 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
}
/* Commit this transaction and make index swaps visible */
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* While we could set PROC_IN_SAFE_IC if all indexes qualified, there's no
@@ -4316,6 +4324,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
}
/* Commit this transaction to make the updates visible. */
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
--
2.39.5 (Apple Git-154)
Hello Nathan,
20.09.2024 19:51, Nathan Bossart wrote:
Here's a (probably naive) attempt at fixing these, too. I'll give each
path a closer look once it feels like we've identified all the bugs.
Thank you for the updated patch!
I tested it with two code modifications (1st is to make each created
expression index TOASTed (by prepending 1M of spaces to the indexeprs
value) and 2nd to make each created index an expression index (by
modifying index_elem_options in gram.y) — both modifications are kludgy so
I don't dare to publish them) and found no other snapshot-related issues
during `make check-world`.
Best regards,
Alexander
On Mon, Sep 23, 2024 at 04:00:00PM +0300, Alexander Lakhin wrote:
I tested it with two code modifications (1st is to make each created
expression index TOASTed (by prepending 1M of spaces to the indexeprs
value) and 2nd to make each created index an expression index (by
modifying index_elem_options in gram.y) - both modifications are kludgy so
I don't dare to publish them) and found no other snapshot-related issues
during `make check-world`.
Thanks. Here is an updated patch with tests and comments. I've also moved
the calls to PushActiveSnapshot()/PopActiveSnapshot() to surround only the
section of code where the snapshot is needed. Besides being more similar
in style to other fixes I found, I think this is safer because much of this
code is cautious to avoid deadlocks. For example, DefineIndex() has the
following comment:
/*
* The snapshot subsystem could still contain registered snapshots that
* are holding back our process's advertised xmin; in particular, if
* default_transaction_isolation = serializable, there is a transaction
* snapshot that is still active. The CatalogSnapshot is likewise a
* hazard. To ensure no deadlocks, we must commit and start yet another
* transaction, and do our wait before any snapshot has been taken in it.
*/
I carefully inspected all the code paths this patch touches, and I think
I've got all the details right, but I would be grateful if someone else
could take a look.
--
nathan
Attachments:
v3-0001-Ensure-we-have-a-snapshot-when-updating-pg_index-.patchtext/plain; charset=us-asciiDownload
From 3be62fac910e930f5635193ac60d1536b17e0d6d Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 23 Sep 2024 10:33:58 -0500
Subject: [PATCH v3 1/1] Ensure we have a snapshot when updating pg_index
entries.
Creating, reindexing, and dropping an index concurrently could
entail accessing pg_index's TOAST table, which was recently added
in commit b52c4fc3c0. These code paths start and commit their own
transactions internally, but they do not always set an active
snapshot. This rightfully leads to assertion failures and ERRORs
when trying to access pg_index's TOAST table, such as the
following:
ERROR: cannot fetch toast data without an active snapshot
To fix, push an active snapshot just before each section of code
that might require accessing pg_index's TOAST table, and pop it
shortly afterwards.
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/a97d7401-e7c9-f771-6a00-037379f0a8bb%40gmail.com
---
src/backend/catalog/index.c | 21 +++++++++++++++++++++
src/backend/commands/indexcmds.c | 24 ++++++++++++++++++++++++
src/test/regress/expected/indexing.out | 15 +++++++++++++++
src/test/regress/sql/indexing.sql | 16 ++++++++++++++++
4 files changed, 76 insertions(+)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index b2b3ecb524..09d79b26b8 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2276,9 +2276,17 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
*/
WaitForLockers(heaplocktag, AccessExclusiveLock, true);
+ /*
+ * Updating pg_index might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Finish invalidation of index and mark it as dead */
index_concurrently_set_dead(heapId, indexId);
+ PopActiveSnapshot();
+
/*
* Again, commit the transaction to make the pg_index update visible
* to other sessions.
@@ -2326,6 +2334,16 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
RelationForgetRelation(indexId);
+ /*
+ * Updating pg_index might involve TOAST table access, so ensure we have a
+ * valid snapshot. We only expect to get here without an active snapshot
+ * in the concurrent path.
+ */
+ if (concurrent)
+ PushActiveSnapshot(GetTransactionSnapshot());
+ else
+ Assert(ActiveSnapshotSet());
+
/*
* fix INDEX relation, and check for expressional index
*/
@@ -2343,6 +2361,9 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
ReleaseSysCache(tuple);
table_close(indexRelation, RowExclusiveLock);
+ if (concurrent)
+ PopActiveSnapshot();
+
/*
* if it has any expression columns, we might have stored statistics about
* them.
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f99c2d2dee..7d41cb73ab 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1798,11 +1798,19 @@ DefineIndex(Oid tableId,
PROGRESS_CREATEIDX_PHASE_WAIT_3);
WaitForOlderSnapshots(limitXmin, true);
+ /*
+ * Updating pg_index might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Index can now be marked valid -- update its pg_index entry
*/
index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID);
+ PopActiveSnapshot();
+
/*
* The pg_index update will cause backends (including this one) to update
* relcache entries for the index itself, but we should also send a
@@ -4256,12 +4264,20 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Updating pg_index might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
@@ -4312,7 +4328,15 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
*/
CHECK_FOR_INTERRUPTS();
+ /*
+ * Updating pg_index might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
index_concurrently_set_dead(oldidx->tableId, oldidx->indexId);
+
+ PopActiveSnapshot();
}
/* Commit this transaction to make the updates visible. */
diff --git a/src/test/regress/expected/indexing.out b/src/test/regress/expected/indexing.out
index f25723da92..49f92fe81f 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -1640,3 +1640,18 @@ select indexrelid::regclass, indisvalid, indisreplident,
(3 rows)
drop table parted_replica_tab;
+-- indexing commands work with TOASTed values in pg_index
+create table test_pg_index_toast_table (a int);
+create or replace function test_pg_index_toast_func (a int, b int[])
+ returns bool as $$ select true $$ language sql immutable;
+select array_agg(n) b from generate_series(1, 10000) n \gset
+create index concurrently test_pg_index_toast_index
+ on test_pg_index_toast_table (test_pg_index_toast_func(a, :'b'));
+reindex index concurrently test_pg_index_toast_index;
+drop index concurrently test_pg_index_toast_index;
+create index test_pg_index_toast_index
+ on test_pg_index_toast_table (test_pg_index_toast_func(a, :'b'));
+reindex index test_pg_index_toast_index;
+drop index test_pg_index_toast_index;
+drop function test_pg_index_toast_func;
+drop table test_pg_index_toast_table;
diff --git a/src/test/regress/sql/indexing.sql b/src/test/regress/sql/indexing.sql
index 5f1f4b80c9..8c07d49d46 100644
--- a/src/test/regress/sql/indexing.sql
+++ b/src/test/regress/sql/indexing.sql
@@ -919,3 +919,19 @@ select indexrelid::regclass, indisvalid, indisreplident,
where indexrelid::regclass::text like 'parted_replica%'
order by indexrelid::regclass::text collate "C";
drop table parted_replica_tab;
+
+-- indexing commands work with TOASTed values in pg_index
+create table test_pg_index_toast_table (a int);
+create or replace function test_pg_index_toast_func (a int, b int[])
+ returns bool as $$ select true $$ language sql immutable;
+select array_agg(n) b from generate_series(1, 10000) n \gset
+create index concurrently test_pg_index_toast_index
+ on test_pg_index_toast_table (test_pg_index_toast_func(a, :'b'));
+reindex index concurrently test_pg_index_toast_index;
+drop index concurrently test_pg_index_toast_index;
+create index test_pg_index_toast_index
+ on test_pg_index_toast_table (test_pg_index_toast_func(a, :'b'));
+reindex index test_pg_index_toast_index;
+drop index test_pg_index_toast_index;
+drop function test_pg_index_toast_func;
+drop table test_pg_index_toast_table;
--
2.39.5 (Apple Git-154)
On Mon, Sep 23, 2024 at 10:50:21AM -0500, Nathan Bossart wrote:
I carefully inspected all the code paths this patch touches, and I think
I've got all the details right, but I would be grateful if someone else
could take a look.
No objections from here with putting the snapshots pops and pushes
outside the inner routines of reindex/drop concurrently, meaning that
ReindexRelationConcurrently(), DefineIndex() and index_drop() are fine
to do these operations.
Looking at the patch, we could just add an assertion based on
ActiveSnapshotSet() in index_set_state_flags().
Actually, thinking more... Could it be better to have some more
sanity checks in the stack outside the toast code for catalogs with
toast tables? For example, I could imagine adding a check in
CatalogTupleUpdate() so as all catalog accessed that have a toast
relation require an active snapshot. That would make checks more
aggressive, because we would not need any toast data in a catalog to
make sure that there is a snapshot set. This strikes me as something
we could do better to improve the detection of failures like the one
reported by Alexander when updating catalog tuples as this can be
triggered each time we do a CatalogTupleUpdate() when dirtying a
catalog tuple. The idea is then to have something before the
HaveRegisteredOrActiveSnapshot() in the toast internals, for catalogs,
and we would not require toast data to detect problems.
--
Michael
On Tue, Sep 24, 2024 at 01:21:45PM +0900, Michael Paquier wrote:
On Mon, Sep 23, 2024 at 10:50:21AM -0500, Nathan Bossart wrote:
I carefully inspected all the code paths this patch touches, and I think
I've got all the details right, but I would be grateful if someone else
could take a look.No objections from here with putting the snapshots pops and pushes
outside the inner routines of reindex/drop concurrently, meaning that
ReindexRelationConcurrently(), DefineIndex() and index_drop() are fine
to do these operations.
Great. I plan to push 0001 shortly.
Actually, thinking more... Could it be better to have some more
sanity checks in the stack outside the toast code for catalogs with
toast tables? For example, I could imagine adding a check in
CatalogTupleUpdate() so as all catalog accessed that have a toast
relation require an active snapshot. That would make checks more
aggressive, because we would not need any toast data in a catalog to
make sure that there is a snapshot set. This strikes me as something
we could do better to improve the detection of failures like the one
reported by Alexander when updating catalog tuples as this can be
triggered each time we do a CatalogTupleUpdate() when dirtying a
catalog tuple. The idea is then to have something before the
HaveRegisteredOrActiveSnapshot() in the toast internals, for catalogs,
and we would not require toast data to detect problems.
I gave this a try and, unsurprisingly, found a bunch of other problems. I
hastily hacked together the attached patch that should fix all of them, but
I'd still like to comb through the code a bit more. The three catalogs
with problems are pg_replication_origin, pg_subscription, and
pg_constraint. pg_contraint has had a TOAST table for a while, and I don't
think it's unheard of for conbin to be large, so this one is probably worth
fixing. pg_subscription hasn't had its TOAST table for quite as long, but
presumably subpublications could be large enough to require out-of-line
storage. pg_replication_origin, however, only has one varlena column:
roname. Three out of the seven problem areas involve
pg_replication_origin, but AFAICT that'd only ever be a problem if the name
of your replication origin requires out-of-line storage. So... maybe we
should just remove pg_replication_origin's TOAST table instead...
--
nathan
Attachments:
toast_snapshot.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index d0d1abda58..b8be124555 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -22,6 +22,7 @@
#include "catalog/index.h"
#include "catalog/indexing.h"
#include "executor/executor.h"
+#include "miscadmin.h"
#include "utils/rel.h"
@@ -234,6 +235,14 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
{
CatalogIndexState indstate;
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
CatalogTupleCheckConstraints(heapRel, tup);
indstate = CatalogOpenIndexes(heapRel);
@@ -256,6 +265,14 @@ void
CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
CatalogIndexState indstate)
{
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
CatalogTupleCheckConstraints(heapRel, tup);
simple_heap_insert(heapRel, tup);
@@ -273,6 +290,14 @@ void
CatalogTuplesMultiInsertWithInfo(Relation heapRel, TupleTableSlot **slot,
int ntuples, CatalogIndexState indstate)
{
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
/* Nothing to do */
if (ntuples <= 0)
return;
@@ -315,6 +340,14 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
CatalogIndexState indstate;
TU_UpdateIndexes updateIndexes = TU_All;
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
CatalogTupleCheckConstraints(heapRel, tup);
indstate = CatalogOpenIndexes(heapRel);
@@ -339,6 +372,14 @@ CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
{
TU_UpdateIndexes updateIndexes = TU_All;
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
CatalogTupleCheckConstraints(heapRel, tup);
simple_heap_update(heapRel, otid, tup, &updateIndexes);
@@ -364,5 +405,13 @@ CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
void
CatalogTupleDelete(Relation heapRel, ItemPointer tid)
{
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
simple_heap_delete(heapRel, tid);
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d27e6cf345..06ac1d39d2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19292,9 +19292,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ if (concurrent)
+ PushActiveSnapshot(GetTransactionSnapshot());
+ else
+ Assert(HaveRegisteredOrActiveSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ if (concurrent)
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e03e761392..ed7d187bfe 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -387,6 +389,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +487,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -500,6 +506,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1456,7 +1464,9 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 925dff9cc4..502033f3d8 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4604,8 +4604,10 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4821,7 +4823,9 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4925,6 +4929,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4983,6 +4989,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
On Tue, Sep 24, 2024 at 02:26:08PM -0500, Nathan Bossart wrote:
I gave this a try and, unsurprisingly, found a bunch of other problems. I
hastily hacked together the attached patch that should fix all of them, but
I'd still like to comb through the code a bit more. The three catalogs
with problems are pg_replication_origin, pg_subscription, and
pg_constraint.
Regression tests don't blow up after this patch and the reindex parts.
pg_contraint has had a TOAST table for a while, and I don't
think it's unheard of for conbin to be large, so this one is probably worth
fixing.
Ahh. That's the tablecmds.c part for the partition detach.
pg_subscription hasn't had its TOAST table for quite as long, but
presumably subpublications could be large enough to require out-of-line
storage. pg_replication_origin, however, only has one varlena column:
roname. Three out of the seven problem areas involve
pg_replication_origin, but AFAICT that'd only ever be a problem if the name
of your replication origin requires out-of-line storage. So... maybe we
should just remove pg_replication_origin's TOAST table instead...
I'd rather keep it, FWIW. Contrary to pg_authid it does not imply
problems at the same scale because we would have access to the toast
relation in all the code paths with logical workers or table syncs.
The other one was at early authentication stages.
+ /*
+ * If we might need TOAST access, make sure the caller has set up a valid
+ * snapshot.
+ */
+ Assert(HaveRegisteredOrActiveSnapshot() ||
+ !OidIsValid(heapRel->rd_rel->reltoastrelid) ||
+ !IsNormalProcessingMode());
+
I didn't catch that we could just reuse the opened Relation in these
paths and check for reltoastrelid. Nice.
It sounds to me that we should be much more proactive in detecting
these failures and add something like that on HEAD. That's cheap
enough. As the checks are the same for all these code paths, perhaps
just hide them behind a local macro to reduce the duplication?
Not the responsibility of this patch, but the business with
clear_subscription_skip_lsn() with its conditional transaction start
feels messy. This comes down to the way handles work for 2PC and the
streaming, which may or may not be in a transaction depending on the
state of the upper caller. Your patch looks right in the way
snapshots are set, as far as I've checked.
--
Michael
On Tue, Sep 24, 2024 at 02:26:08PM -0500, Nathan Bossart wrote:
On Tue, Sep 24, 2024 at 01:21:45PM +0900, Michael Paquier wrote:
On Mon, Sep 23, 2024 at 10:50:21AM -0500, Nathan Bossart wrote:
I carefully inspected all the code paths this patch touches, and I think
I've got all the details right, but I would be grateful if someone else
could take a look.No objections from here with putting the snapshots pops and pushes
outside the inner routines of reindex/drop concurrently, meaning that
ReindexRelationConcurrently(), DefineIndex() and index_drop() are fine
to do these operations.Great. I plan to push 0001 shortly.
Committed this one.
--
nathan
On Wed, Sep 25, 2024 at 01:05:26PM +0900, Michael Paquier wrote:
On Tue, Sep 24, 2024 at 02:26:08PM -0500, Nathan Bossart wrote:
So... maybe we
should just remove pg_replication_origin's TOAST table instead...I'd rather keep it, FWIW. Contrary to pg_authid it does not imply
problems at the same scale because we would have access to the toast
relation in all the code paths with logical workers or table syncs.
The other one was at early authentication stages.
Okay.
It sounds to me that we should be much more proactive in detecting
these failures and add something like that on HEAD. That's cheap
enough. As the checks are the same for all these code paths, perhaps
just hide them behind a local macro to reduce the duplication?
In v2, I moved the assertions to a new function called by the heapam.c
routines. I was hoping to move them to the tableam.h routines, but several
callers (in particular, the catalog/indexing.c ones that are causing
problems) call the heap ones directly. I've also included a 0001 patch
that introduces a RelationGetToastRelid() macro because I got tired of
typing "rel->rd_rel->reltoastrelid".
0002 could probably use some more commentary, but otherwise I think it is
in decent shape. You (Michael) seem to be implying that I should
back-patch the actual fixes and only apply the new assertions to v18. Am I
understanding you correctly?
--
nathan
Attachments:
v2-0001-add-RelationGetToastRelid-macro.patchtext/plain; charset=us-asciiDownload
From 3eb1c0997e18d9e7a56cc3e974076e58613a1bb9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 8 Oct 2024 11:28:58 -0500
Subject: [PATCH v2 1/2] add RelationGetToastRelid macro
---
contrib/amcheck/verify_heapam.c | 6 +++---
src/backend/access/common/toast_internals.c | 2 +-
src/backend/access/heap/heaptoast.c | 6 +++---
src/backend/catalog/heap.c | 2 +-
src/backend/catalog/index.c | 2 +-
src/backend/catalog/toasting.c | 2 +-
src/backend/commands/cluster.c | 19 ++++++++++---------
src/backend/commands/indexcmds.c | 4 ++--
src/backend/commands/tablecmds.c | 8 ++++----
src/backend/commands/vacuum.c | 2 +-
.../replication/logical/reorderbuffer.c | 4 ++--
src/backend/utils/adt/dbsize.c | 4 ++--
src/include/utils/rel.h | 6 ++++++
13 files changed, 37 insertions(+), 30 deletions(-)
diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index f2526ed63a..78316daacd 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -367,12 +367,12 @@ verify_heapam(PG_FUNCTION_ARGS)
}
/* Optionally open the toast relation, if any. */
- if (ctx.rel->rd_rel->reltoastrelid && check_toast)
+ if (RelationGetToastRelid(ctx.rel) && check_toast)
{
int offset;
/* Main relation has associated toast relation */
- ctx.toast_rel = table_open(ctx.rel->rd_rel->reltoastrelid,
+ ctx.toast_rel = table_open(RelationGetToastRelid(ctx.rel),
AccessShareLock);
offset = toast_open_indexes(ctx.toast_rel,
AccessShareLock,
@@ -1721,7 +1721,7 @@ check_tuple_attribute(HeapCheckContext *ctx)
}
/* The relation better have a toast table */
- if (!ctx->rel->rd_rel->reltoastrelid)
+ if (!OidIsValid(RelationGetToastRelid(ctx->rel)))
{
report_corruption(ctx,
psprintf("toast value %u is external but relation has no toast relation",
diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index 90d0654e62..39d93d12f2 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -151,7 +151,7 @@ toast_save_datum(Relation rel, Datum value,
* uniqueness of the OID we assign to the toasted item, even though it has
* additional columns besides OID.
*/
- toastrel = table_open(rel->rd_rel->reltoastrelid, RowExclusiveLock);
+ toastrel = table_open(RelationGetToastRelid(rel), RowExclusiveLock);
toasttupDesc = toastrel->rd_att;
/* Open all the toast indexes and look for the valid one */
diff --git a/src/backend/access/heap/heaptoast.c b/src/backend/access/heap/heaptoast.c
index a420e16530..cf6f3dff53 100644
--- a/src/backend/access/heap/heaptoast.c
+++ b/src/backend/access/heap/heaptoast.c
@@ -213,7 +213,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
* XXX maybe the threshold should be less than maxDataLen?
*/
if (toast_attr[biggest_attno].tai_size > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
toast_tuple_externalize(&ttc, biggest_attno, options);
}
@@ -224,7 +224,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
*/
while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
{
int biggest_attno;
@@ -259,7 +259,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
{
int biggest_attno;
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 78e59384d1..966cf39f34 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -3075,7 +3075,7 @@ heap_truncate_one_rel(Relation rel)
RelationTruncateIndexes(rel);
/* If there is a toast table, truncate that too */
- toastrelid = rel->rd_rel->reltoastrelid;
+ toastrelid = RelationGetToastRelid(rel);
if (OidIsValid(toastrelid))
{
Relation toastrel = table_open(toastrelid, AccessExclusiveLock);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 6084dfa97c..7f07fda372 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3930,7 +3930,7 @@ reindex_relation(const ReindexStmt *stmt, Oid relid, int flags,
get_namespace_name(RelationGetNamespace(rel)),
RelationGetRelationName(rel));
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
/*
* Get the list of index OIDs for this relation. (We trust the relcache
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index ad3082c62a..cf47e0960e 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -149,7 +149,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
/*
* Is it already toasted?
*/
- if (rel->rd_rel->reltoastrelid != InvalidOid)
+ if (OidIsValid(RelationGetToastRelid(rel)))
return false;
/*
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 78f96789b0..20da5201b6 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -780,7 +780,7 @@ make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, Oid NewAccessMethod,
* Note that NewHeapCreateToastTable ends with CommandCounterIncrement, so
* that the TOAST table will be visible for insertion.
*/
- toastid = OldHeap->rd_rel->reltoastrelid;
+ toastid = RelationGetToastRelid(OldHeap);
if (OidIsValid(toastid))
{
/* keep the existing toast table's reloptions, if any */
@@ -870,8 +870,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* We don't need to open the toast relation here, just lock it. The lock
* will be held till end of transaction.
*/
- if (OldHeap->rd_rel->reltoastrelid)
- LockRelationOid(OldHeap->rd_rel->reltoastrelid, AccessExclusiveLock);
+ if (RelationGetToastRelid(OldHeap))
+ LockRelationOid(RelationGetToastRelid(OldHeap), AccessExclusiveLock);
/*
* If both tables have TOAST tables, perform toast swap by content. It is
@@ -880,7 +880,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* swap by links. This is okay because swap by content is only essential
* for system catalogs, and we don't support schema changes for them.
*/
- if (OldHeap->rd_rel->reltoastrelid && NewHeap->rd_rel->reltoastrelid)
+ if (OidIsValid(RelationGetToastRelid(OldHeap)) &&
+ OidIsValid(RelationGetToastRelid(NewHeap)))
{
*pSwapToastByContent = true;
@@ -902,7 +903,7 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* work for values copied over from the old toast table, but not for
* any values that we toast which were previously not toasted.)
*/
- NewHeap->rd_toastoid = OldHeap->rd_rel->reltoastrelid;
+ NewHeap->rd_toastoid = RelationGetToastRelid(OldHeap);
}
else
*pSwapToastByContent = false;
@@ -1581,19 +1582,19 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
Relation newrel;
newrel = table_open(OIDOldHeap, NoLock);
- if (OidIsValid(newrel->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(newrel)))
{
Oid toastidx;
char NewToastName[NAMEDATALEN];
/* Get the associated valid index to be renamed */
- toastidx = toast_get_valid_index(newrel->rd_rel->reltoastrelid,
+ toastidx = toast_get_valid_index(RelationGetToastRelid(newrel),
NoLock);
/* rename the toast table ... */
snprintf(NewToastName, NAMEDATALEN, "pg_toast_%u",
OIDOldHeap);
- RenameRelationInternal(newrel->rd_rel->reltoastrelid,
+ RenameRelationInternal(RelationGetToastRelid(newrel),
NewToastName, true, false);
/* ... and its valid index too. */
@@ -1609,7 +1610,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
* that is updated as part of RenameRelationInternal.
*/
CommandCounterIncrement();
- ResetRelRewrite(newrel->rd_rel->reltoastrelid);
+ ResetRelRewrite(RelationGetToastRelid(newrel));
}
relation_close(newrel, NoLock);
}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e33ad81529..b7ddced7f0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3680,9 +3680,9 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
}
/* Also add the toast indexes */
- if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(heapRelation)))
{
- Oid toastOid = heapRelation->rd_rel->reltoastrelid;
+ Oid toastOid = RelationGetToastRelid(heapRelation);
Relation toastRelation = table_open(toastOid,
ShareUpdateExclusiveLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index af8c05b91f..1788fbcf60 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2139,7 +2139,7 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* The same for the toast table, if any.
*/
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
if (OidIsValid(toast_relid))
{
Relation toastrel = relation_open(toast_relid,
@@ -15160,10 +15160,10 @@ ATExecSetRelOptions(Relation rel, List *defList, AlterTableType operation,
ReleaseSysCache(tuple);
/* repeat the whole exercise for the toast table, if there's one */
- if (OidIsValid(rel->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(rel)))
{
Relation toastrel;
- Oid toastid = rel->rd_rel->reltoastrelid;
+ Oid toastid = RelationGetToastRelid(rel);
toastrel = table_open(toastid, lockmode);
@@ -15252,7 +15252,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
return;
}
- reltoastrelid = rel->rd_rel->reltoastrelid;
+ reltoastrelid = RelationGetToastRelid(rel);
/* Fetch the list of indexes on toast relation if necessary */
if (OidIsValid(reltoastrelid))
{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ac8f5d9c25..08bd489aed 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2187,7 +2187,7 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
if ((params->options & VACOPT_PROCESS_TOAST) != 0 &&
((params->options & VACOPT_FULL) == 0 ||
(params->options & VACOPT_PROCESS_MAIN) == 0))
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
else
toast_relid = InvalidOid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 22bcf171ff..c9d9590108 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4822,10 +4822,10 @@ ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
desc = RelationGetDescr(relation);
- toast_rel = RelationIdGetRelation(relation->rd_rel->reltoastrelid);
+ toast_rel = RelationIdGetRelation(RelationGetToastRelid(relation));
if (!RelationIsValid(toast_rel))
elog(ERROR, "could not open toast relation with OID %u (base relation \"%s\")",
- relation->rd_rel->reltoastrelid, RelationGetRelationName(relation));
+ RelationGetToastRelid(relation), RelationGetRelationName(relation));
toast_desc = RelationGetDescr(toast_rel);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index e63e99c141..a3597be9bc 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -436,8 +436,8 @@ calculate_table_size(Relation rel)
/*
* Size of toast relation
*/
- if (OidIsValid(rel->rd_rel->reltoastrelid))
- size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
+ if (OidIsValid(RelationGetToastRelid(rel)))
+ size += calculate_toast_table_size(RelationGetToastRelid(rel));
return size;
}
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8700204953..6631b37d97 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -504,6 +504,12 @@ typedef struct ViewOptions
*/
#define RelationGetRelid(relation) ((relation)->rd_id)
+/*
+ * RelationGetToastRelid
+ * Returns the OID of the relation's TOAST table (or InvalidOid if none)
+ */
+#define RelationGetToastRelid(relation) ((relation)->rd_rel->reltoastrelid)
+
/*
* RelationGetNumberOfAttributes
* Returns the total number of attributes in a relation.
--
2.39.5 (Apple Git-154)
v2-0002-ensure-we-have-a-snapshot-if-we-might-need-toast-.patchtext/plain; charset=us-asciiDownload
From 5f4f8053ee4e11276540d9cd48e149c8f02a7588 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 8 Oct 2024 13:39:30 -0500
Subject: [PATCH v2 2/2] ensure we have a snapshot if we might need toast
access
---
src/backend/access/heap/heapam.c | 25 +++++++++++++++++++++
src/backend/commands/tablecmds.c | 8 +++++++
src/backend/replication/logical/tablesync.c | 10 +++++++++
src/backend/replication/logical/worker.c | 8 +++++++
4 files changed, 51 insertions(+)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index da5e656a08..d51525a85c 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -210,6 +210,23 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(RelationGetToastRelid(rel)))
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -1995,6 +2012,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2272,6 +2291,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
Assert(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2694,6 +2715,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3189,6 +3212,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1788fbcf60..2e468d1d7d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19281,9 +19281,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ if (concurrent)
+ PushActiveSnapshot(GetTransactionSnapshot());
+ else
+ Assert(HaveRegisteredOrActiveSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ if (concurrent)
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e03e761392..ed7d187bfe 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -387,6 +389,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +487,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -500,6 +506,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1456,7 +1464,9 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 925dff9cc4..502033f3d8 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4604,8 +4604,10 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4821,7 +4823,9 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+ PushActiveSnapshot(GetTransactionSnapshot());
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4925,6 +4929,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4983,6 +4989,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
On Tue, Oct 08, 2024 at 01:50:52PM -0500, Nathan Bossart wrote:
0002 could probably use some more commentary, but otherwise I think it is
in decent shape. You (Michael) seem to be implying that I should
back-patch the actual fixes and only apply the new assertions to v18. Am I
understanding you correctly?
Here is a reorganized patch set. 0001 would be back-patched, but the
others would only be applied to v18.
--
nathan
Attachments:
v3-0001-Ensure-we-have-a-snapshot-when-updating-various-s.patchtext/plain; charset=us-asciiDownload
From 6792eeb656e8fc707d85dc07595e2183853d2ce5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 14 Oct 2024 14:43:26 -0500
Subject: [PATCH v3 1/3] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: FIXME
---
src/backend/commands/tablecmds.c | 13 +++++++++++
src/backend/replication/logical/tablesync.c | 21 ++++++++++++++++++
src/backend/replication/logical/worker.c | 24 +++++++++++++++++++++
3 files changed, 58 insertions(+)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1ccc80087c..c6f82e93b9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19283,9 +19283,22 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot. We only expect to get here without a snapshot
+ * in the concurrent path.
+ */
+ if (concurrent)
+ PushActiveSnapshot(GetTransactionSnapshot());
+ else
+ Assert(HaveRegisteredOrActiveSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ if (concurrent)
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e03e761392..d3fbb18438 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ /*
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1455,8 +1471,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 925dff9cc4..804e5dbb75 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4604,8 +4604,16 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4821,7 +4829,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4925,6 +4941,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4983,6 +5005,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v3-0002-Introduce-RelationGetToastRelid-macro.patchtext/plain; charset=us-asciiDownload
From 4294f5f3c81ec4365ec5bd8bfa80069496732dc5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 14 Oct 2024 14:50:59 -0500
Subject: [PATCH v3 2/3] Introduce RelationGetToastRelid macro.
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
contrib/amcheck/verify_heapam.c | 6 +++---
src/backend/access/common/toast_internals.c | 2 +-
src/backend/access/heap/heaptoast.c | 6 +++---
src/backend/catalog/heap.c | 2 +-
src/backend/catalog/index.c | 2 +-
src/backend/catalog/toasting.c | 2 +-
src/backend/commands/cluster.c | 19 ++++++++++---------
src/backend/commands/indexcmds.c | 4 ++--
src/backend/commands/tablecmds.c | 8 ++++----
src/backend/commands/vacuum.c | 2 +-
.../replication/logical/reorderbuffer.c | 4 ++--
src/backend/utils/adt/dbsize.c | 4 ++--
src/include/utils/rel.h | 6 ++++++
13 files changed, 37 insertions(+), 30 deletions(-)
diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index f2526ed63a..78316daacd 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -367,12 +367,12 @@ verify_heapam(PG_FUNCTION_ARGS)
}
/* Optionally open the toast relation, if any. */
- if (ctx.rel->rd_rel->reltoastrelid && check_toast)
+ if (RelationGetToastRelid(ctx.rel) && check_toast)
{
int offset;
/* Main relation has associated toast relation */
- ctx.toast_rel = table_open(ctx.rel->rd_rel->reltoastrelid,
+ ctx.toast_rel = table_open(RelationGetToastRelid(ctx.rel),
AccessShareLock);
offset = toast_open_indexes(ctx.toast_rel,
AccessShareLock,
@@ -1721,7 +1721,7 @@ check_tuple_attribute(HeapCheckContext *ctx)
}
/* The relation better have a toast table */
- if (!ctx->rel->rd_rel->reltoastrelid)
+ if (!OidIsValid(RelationGetToastRelid(ctx->rel)))
{
report_corruption(ctx,
psprintf("toast value %u is external but relation has no toast relation",
diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index 90d0654e62..39d93d12f2 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -151,7 +151,7 @@ toast_save_datum(Relation rel, Datum value,
* uniqueness of the OID we assign to the toasted item, even though it has
* additional columns besides OID.
*/
- toastrel = table_open(rel->rd_rel->reltoastrelid, RowExclusiveLock);
+ toastrel = table_open(RelationGetToastRelid(rel), RowExclusiveLock);
toasttupDesc = toastrel->rd_att;
/* Open all the toast indexes and look for the valid one */
diff --git a/src/backend/access/heap/heaptoast.c b/src/backend/access/heap/heaptoast.c
index a420e16530..cf6f3dff53 100644
--- a/src/backend/access/heap/heaptoast.c
+++ b/src/backend/access/heap/heaptoast.c
@@ -213,7 +213,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
* XXX maybe the threshold should be less than maxDataLen?
*/
if (toast_attr[biggest_attno].tai_size > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
toast_tuple_externalize(&ttc, biggest_attno, options);
}
@@ -224,7 +224,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
*/
while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
{
int biggest_attno;
@@ -259,7 +259,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
{
int biggest_attno;
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 0078a12f26..1b912488d6 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -3073,7 +3073,7 @@ heap_truncate_one_rel(Relation rel)
RelationTruncateIndexes(rel);
/* If there is a toast table, truncate that too */
- toastrelid = rel->rd_rel->reltoastrelid;
+ toastrelid = RelationGetToastRelid(rel);
if (OidIsValid(toastrelid))
{
Relation toastrel = table_open(toastrelid, AccessExclusiveLock);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 12822d0b14..3eab421a16 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3930,7 +3930,7 @@ reindex_relation(const ReindexStmt *stmt, Oid relid, int flags,
get_namespace_name(RelationGetNamespace(rel)),
RelationGetRelationName(rel));
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
/*
* Get the list of index OIDs for this relation. (We trust the relcache
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index ad3082c62a..cf47e0960e 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -149,7 +149,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
/*
* Is it already toasted?
*/
- if (rel->rd_rel->reltoastrelid != InvalidOid)
+ if (OidIsValid(RelationGetToastRelid(rel)))
return false;
/*
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 78f96789b0..20da5201b6 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -780,7 +780,7 @@ make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, Oid NewAccessMethod,
* Note that NewHeapCreateToastTable ends with CommandCounterIncrement, so
* that the TOAST table will be visible for insertion.
*/
- toastid = OldHeap->rd_rel->reltoastrelid;
+ toastid = RelationGetToastRelid(OldHeap);
if (OidIsValid(toastid))
{
/* keep the existing toast table's reloptions, if any */
@@ -870,8 +870,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* We don't need to open the toast relation here, just lock it. The lock
* will be held till end of transaction.
*/
- if (OldHeap->rd_rel->reltoastrelid)
- LockRelationOid(OldHeap->rd_rel->reltoastrelid, AccessExclusiveLock);
+ if (RelationGetToastRelid(OldHeap))
+ LockRelationOid(RelationGetToastRelid(OldHeap), AccessExclusiveLock);
/*
* If both tables have TOAST tables, perform toast swap by content. It is
@@ -880,7 +880,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* swap by links. This is okay because swap by content is only essential
* for system catalogs, and we don't support schema changes for them.
*/
- if (OldHeap->rd_rel->reltoastrelid && NewHeap->rd_rel->reltoastrelid)
+ if (OidIsValid(RelationGetToastRelid(OldHeap)) &&
+ OidIsValid(RelationGetToastRelid(NewHeap)))
{
*pSwapToastByContent = true;
@@ -902,7 +903,7 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* work for values copied over from the old toast table, but not for
* any values that we toast which were previously not toasted.)
*/
- NewHeap->rd_toastoid = OldHeap->rd_rel->reltoastrelid;
+ NewHeap->rd_toastoid = RelationGetToastRelid(OldHeap);
}
else
*pSwapToastByContent = false;
@@ -1581,19 +1582,19 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
Relation newrel;
newrel = table_open(OIDOldHeap, NoLock);
- if (OidIsValid(newrel->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(newrel)))
{
Oid toastidx;
char NewToastName[NAMEDATALEN];
/* Get the associated valid index to be renamed */
- toastidx = toast_get_valid_index(newrel->rd_rel->reltoastrelid,
+ toastidx = toast_get_valid_index(RelationGetToastRelid(newrel),
NoLock);
/* rename the toast table ... */
snprintf(NewToastName, NAMEDATALEN, "pg_toast_%u",
OIDOldHeap);
- RenameRelationInternal(newrel->rd_rel->reltoastrelid,
+ RenameRelationInternal(RelationGetToastRelid(newrel),
NewToastName, true, false);
/* ... and its valid index too. */
@@ -1609,7 +1610,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
* that is updated as part of RenameRelationInternal.
*/
CommandCounterIncrement();
- ResetRelRewrite(newrel->rd_rel->reltoastrelid);
+ ResetRelRewrite(RelationGetToastRelid(newrel));
}
relation_close(newrel, NoLock);
}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e33ad81529..b7ddced7f0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3680,9 +3680,9 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
}
/* Also add the toast indexes */
- if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(heapRelation)))
{
- Oid toastOid = heapRelation->rd_rel->reltoastrelid;
+ Oid toastOid = RelationGetToastRelid(heapRelation);
Relation toastRelation = table_open(toastOid,
ShareUpdateExclusiveLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c6f82e93b9..cf076dec5d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2140,7 +2140,7 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* The same for the toast table, if any.
*/
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
if (OidIsValid(toast_relid))
{
Relation toastrel = relation_open(toast_relid,
@@ -15161,10 +15161,10 @@ ATExecSetRelOptions(Relation rel, List *defList, AlterTableType operation,
ReleaseSysCache(tuple);
/* repeat the whole exercise for the toast table, if there's one */
- if (OidIsValid(rel->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(rel)))
{
Relation toastrel;
- Oid toastid = rel->rd_rel->reltoastrelid;
+ Oid toastid = RelationGetToastRelid(rel);
toastrel = table_open(toastid, lockmode);
@@ -15253,7 +15253,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
return;
}
- reltoastrelid = rel->rd_rel->reltoastrelid;
+ reltoastrelid = RelationGetToastRelid(rel);
/* Fetch the list of indexes on toast relation if necessary */
if (OidIsValid(reltoastrelid))
{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ac8f5d9c25..08bd489aed 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2187,7 +2187,7 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
if ((params->options & VACOPT_PROCESS_TOAST) != 0 &&
((params->options & VACOPT_FULL) == 0 ||
(params->options & VACOPT_PROCESS_MAIN) == 0))
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
else
toast_relid = InvalidOid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 22bcf171ff..c9d9590108 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4822,10 +4822,10 @@ ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
desc = RelationGetDescr(relation);
- toast_rel = RelationIdGetRelation(relation->rd_rel->reltoastrelid);
+ toast_rel = RelationIdGetRelation(RelationGetToastRelid(relation));
if (!RelationIsValid(toast_rel))
elog(ERROR, "could not open toast relation with OID %u (base relation \"%s\")",
- relation->rd_rel->reltoastrelid, RelationGetRelationName(relation));
+ RelationGetToastRelid(relation), RelationGetRelationName(relation));
toast_desc = RelationGetDescr(toast_rel);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index e63e99c141..a3597be9bc 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -436,8 +436,8 @@ calculate_table_size(Relation rel)
/*
* Size of toast relation
*/
- if (OidIsValid(rel->rd_rel->reltoastrelid))
- size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
+ if (OidIsValid(RelationGetToastRelid(rel)))
+ size += calculate_toast_table_size(RelationGetToastRelid(rel));
return size;
}
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8700204953..6631b37d97 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -504,6 +504,12 @@ typedef struct ViewOptions
*/
#define RelationGetRelid(relation) ((relation)->rd_id)
+/*
+ * RelationGetToastRelid
+ * Returns the OID of the relation's TOAST table (or InvalidOid if none)
+ */
+#define RelationGetToastRelid(relation) ((relation)->rd_rel->reltoastrelid)
+
/*
* RelationGetNumberOfAttributes
* Returns the total number of attributes in a relation.
--
2.39.5 (Apple Git-154)
v3-0003-Assert-that-we-have-a-valid-snapshot-if-we-might-.patchtext/plain; charset=us-asciiDownload
From 1484561a5a9cfe070293d3cbb8528bac1634f185 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 14 Oct 2024 14:53:36 -0500
Subject: [PATCH v3 3/3] Assert that we have a valid snapshot if we might need
TOAST access.
Suggested-by: Michael Paquier
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
src/backend/access/heap/heapam.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index da5e656a08..d51525a85c 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -210,6 +210,23 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(RelationGetToastRelid(rel)))
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -1995,6 +2012,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2272,6 +2291,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
Assert(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2694,6 +2715,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3189,6 +3212,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
--
2.39.5 (Apple Git-154)
On Mon, Oct 14, 2024 at 03:02:22PM -0500, Nathan Bossart wrote:
Here is a reorganized patch set. 0001 would be back-patched, but the
others would only be applied to v18.
Right.
- if (!ctx->rel->rd_rel->reltoastrelid)
+ if (!OidIsValid(RelationGetToastRelid(ctx->rel)))
This set of diffs in 0002 is a nice cleanup. I'd wish for relying
less on zero comparitons when assuming that InvalidOid is in use.
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(RelationGetToastRelid(rel)))
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+}
Using a separate inlined routine is indeed cleaner as you have
documented the assumptions behind the check. Wouldn't it be better to
use a USE_ASSERT_CHECKING block? These two checks for normal
processing and toastrelid are cheap lookups, but we don't need them at
all in non-assert paths, so I'd suggest to ignore them entirely for
the non-USE_ASSERT_CHECKING case.
--
Michael
On Wed, Oct 16, 2024 at 09:12:31AM +0900, Michael Paquier wrote:
- if (!ctx->rel->rd_rel->reltoastrelid) + if (!OidIsValid(RelationGetToastRelid(ctx->rel)))This set of diffs in 0002 is a nice cleanup. I'd wish for relying
less on zero comparitons when assuming that InvalidOid is in use.
I'm wondering if there's any concern about this one causing back-patching
pain. If so, I can just add the macro for use in new code.
+static inline void +AssertHasSnapshotForToast(Relation rel) +{ + /* bootstrap mode in particular breaks this rule */ + if (!IsNormalProcessingMode()) + return; + + /* if the relation doesn't have a TOAST table, we are good */ + if (!OidIsValid(RelationGetToastRelid(rel))) + return; + + Assert(HaveRegisteredOrActiveSnapshot()); +}Using a separate inlined routine is indeed cleaner as you have
documented the assumptions behind the check. Wouldn't it be better to
use a USE_ASSERT_CHECKING block? These two checks for normal
processing and toastrelid are cheap lookups, but we don't need them at
all in non-assert paths, so I'd suggest to ignore them entirely for
the non-USE_ASSERT_CHECKING case.
I assume all of this will get compiled out in non-USE_ASSERT_CHECKING
builds as-is, but I see no problem with surrounding it with an #ifdef to be
sure.
--
nathan
On Tue, Oct 15, 2024 at 08:20:17PM -0500, Nathan Bossart wrote:
On Wed, Oct 16, 2024 at 09:12:31AM +0900, Michael Paquier wrote:
- if (!ctx->rel->rd_rel->reltoastrelid) + if (!OidIsValid(RelationGetToastRelid(ctx->rel)))This set of diffs in 0002 is a nice cleanup. I'd wish for relying
less on zero comparitons when assuming that InvalidOid is in use.I'm wondering if there's any concern about this one causing back-patching
pain. If so, I can just add the macro for use in new code.
This bit does not concern me much. manipulations of reltoastrelid
from Relations are not that common in bug fixes.
I assume all of this will get compiled out in non-USE_ASSERT_CHECKING
builds as-is, but I see no problem with surrounding it with an #ifdef to be
sure.
Yeah, I'm not sure that that would always be the case when optimized.
Code generated can be dumb sometimes even if compilers got much
smarter in the last 10 years or so (not compiler guy here).
--
Michael
On Wed, Oct 16, 2024 at 10:24:32AM +0900, Michael Paquier wrote:
On Tue, Oct 15, 2024 at 08:20:17PM -0500, Nathan Bossart wrote:
I assume all of this will get compiled out in non-USE_ASSERT_CHECKING
builds as-is, but I see no problem with surrounding it with an #ifdef to be
sure.Yeah, I'm not sure that that would always be the case when optimized.
Code generated can be dumb sometimes even if compilers got much
smarter in the last 10 years or so (not compiler guy here).
Here is a new version of the patch set with this #ifdef added. I plan to
give each of the code paths adjusted by 0001 a closer look, but otherwise
I'm hoping to commit these soon.
--
nathan
Attachments:
v4-0002-Introduce-RelationGetToastRelid-macro.patchtext/plain; charset=us-asciiDownload
From eb289ef83fb9e3c83bcef614fc3a8f7e3d7e1dce Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 14 Oct 2024 14:50:59 -0500
Subject: [PATCH v4 2/3] Introduce RelationGetToastRelid macro.
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
contrib/amcheck/verify_heapam.c | 6 +++---
src/backend/access/common/toast_internals.c | 2 +-
src/backend/access/heap/heaptoast.c | 6 +++---
src/backend/catalog/heap.c | 2 +-
src/backend/catalog/index.c | 2 +-
src/backend/catalog/toasting.c | 2 +-
src/backend/commands/cluster.c | 19 ++++++++++---------
src/backend/commands/indexcmds.c | 4 ++--
src/backend/commands/tablecmds.c | 8 ++++----
src/backend/commands/vacuum.c | 2 +-
.../replication/logical/reorderbuffer.c | 4 ++--
src/backend/utils/adt/dbsize.c | 4 ++--
src/include/utils/rel.h | 6 ++++++
13 files changed, 37 insertions(+), 30 deletions(-)
diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index f2526ed63a..78316daacd 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -367,12 +367,12 @@ verify_heapam(PG_FUNCTION_ARGS)
}
/* Optionally open the toast relation, if any. */
- if (ctx.rel->rd_rel->reltoastrelid && check_toast)
+ if (RelationGetToastRelid(ctx.rel) && check_toast)
{
int offset;
/* Main relation has associated toast relation */
- ctx.toast_rel = table_open(ctx.rel->rd_rel->reltoastrelid,
+ ctx.toast_rel = table_open(RelationGetToastRelid(ctx.rel),
AccessShareLock);
offset = toast_open_indexes(ctx.toast_rel,
AccessShareLock,
@@ -1721,7 +1721,7 @@ check_tuple_attribute(HeapCheckContext *ctx)
}
/* The relation better have a toast table */
- if (!ctx->rel->rd_rel->reltoastrelid)
+ if (!OidIsValid(RelationGetToastRelid(ctx->rel)))
{
report_corruption(ctx,
psprintf("toast value %u is external but relation has no toast relation",
diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index 90d0654e62..39d93d12f2 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -151,7 +151,7 @@ toast_save_datum(Relation rel, Datum value,
* uniqueness of the OID we assign to the toasted item, even though it has
* additional columns besides OID.
*/
- toastrel = table_open(rel->rd_rel->reltoastrelid, RowExclusiveLock);
+ toastrel = table_open(RelationGetToastRelid(rel), RowExclusiveLock);
toasttupDesc = toastrel->rd_att;
/* Open all the toast indexes and look for the valid one */
diff --git a/src/backend/access/heap/heaptoast.c b/src/backend/access/heap/heaptoast.c
index a420e16530..cf6f3dff53 100644
--- a/src/backend/access/heap/heaptoast.c
+++ b/src/backend/access/heap/heaptoast.c
@@ -213,7 +213,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
* XXX maybe the threshold should be less than maxDataLen?
*/
if (toast_attr[biggest_attno].tai_size > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
toast_tuple_externalize(&ttc, biggest_attno, options);
}
@@ -224,7 +224,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
*/
while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
{
int biggest_attno;
@@ -259,7 +259,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
while (heap_compute_data_size(tupleDesc,
toast_values, toast_isnull) > maxDataLen &&
- rel->rd_rel->reltoastrelid != InvalidOid)
+ OidIsValid(RelationGetToastRelid(rel)))
{
int biggest_attno;
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 0078a12f26..1b912488d6 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -3073,7 +3073,7 @@ heap_truncate_one_rel(Relation rel)
RelationTruncateIndexes(rel);
/* If there is a toast table, truncate that too */
- toastrelid = rel->rd_rel->reltoastrelid;
+ toastrelid = RelationGetToastRelid(rel);
if (OidIsValid(toastrelid))
{
Relation toastrel = table_open(toastrelid, AccessExclusiveLock);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 12822d0b14..3eab421a16 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3930,7 +3930,7 @@ reindex_relation(const ReindexStmt *stmt, Oid relid, int flags,
get_namespace_name(RelationGetNamespace(rel)),
RelationGetRelationName(rel));
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
/*
* Get the list of index OIDs for this relation. (We trust the relcache
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index ad3082c62a..cf47e0960e 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -149,7 +149,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
/*
* Is it already toasted?
*/
- if (rel->rd_rel->reltoastrelid != InvalidOid)
+ if (OidIsValid(RelationGetToastRelid(rel)))
return false;
/*
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 78f96789b0..20da5201b6 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -780,7 +780,7 @@ make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, Oid NewAccessMethod,
* Note that NewHeapCreateToastTable ends with CommandCounterIncrement, so
* that the TOAST table will be visible for insertion.
*/
- toastid = OldHeap->rd_rel->reltoastrelid;
+ toastid = RelationGetToastRelid(OldHeap);
if (OidIsValid(toastid))
{
/* keep the existing toast table's reloptions, if any */
@@ -870,8 +870,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* We don't need to open the toast relation here, just lock it. The lock
* will be held till end of transaction.
*/
- if (OldHeap->rd_rel->reltoastrelid)
- LockRelationOid(OldHeap->rd_rel->reltoastrelid, AccessExclusiveLock);
+ if (RelationGetToastRelid(OldHeap))
+ LockRelationOid(RelationGetToastRelid(OldHeap), AccessExclusiveLock);
/*
* If both tables have TOAST tables, perform toast swap by content. It is
@@ -880,7 +880,8 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* swap by links. This is okay because swap by content is only essential
* for system catalogs, and we don't support schema changes for them.
*/
- if (OldHeap->rd_rel->reltoastrelid && NewHeap->rd_rel->reltoastrelid)
+ if (OidIsValid(RelationGetToastRelid(OldHeap)) &&
+ OidIsValid(RelationGetToastRelid(NewHeap)))
{
*pSwapToastByContent = true;
@@ -902,7 +903,7 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
* work for values copied over from the old toast table, but not for
* any values that we toast which were previously not toasted.)
*/
- NewHeap->rd_toastoid = OldHeap->rd_rel->reltoastrelid;
+ NewHeap->rd_toastoid = RelationGetToastRelid(OldHeap);
}
else
*pSwapToastByContent = false;
@@ -1581,19 +1582,19 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
Relation newrel;
newrel = table_open(OIDOldHeap, NoLock);
- if (OidIsValid(newrel->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(newrel)))
{
Oid toastidx;
char NewToastName[NAMEDATALEN];
/* Get the associated valid index to be renamed */
- toastidx = toast_get_valid_index(newrel->rd_rel->reltoastrelid,
+ toastidx = toast_get_valid_index(RelationGetToastRelid(newrel),
NoLock);
/* rename the toast table ... */
snprintf(NewToastName, NAMEDATALEN, "pg_toast_%u",
OIDOldHeap);
- RenameRelationInternal(newrel->rd_rel->reltoastrelid,
+ RenameRelationInternal(RelationGetToastRelid(newrel),
NewToastName, true, false);
/* ... and its valid index too. */
@@ -1609,7 +1610,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
* that is updated as part of RenameRelationInternal.
*/
CommandCounterIncrement();
- ResetRelRewrite(newrel->rd_rel->reltoastrelid);
+ ResetRelRewrite(RelationGetToastRelid(newrel));
}
relation_close(newrel, NoLock);
}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e33ad81529..b7ddced7f0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3680,9 +3680,9 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
}
/* Also add the toast indexes */
- if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(heapRelation)))
{
- Oid toastOid = heapRelation->rd_rel->reltoastrelid;
+ Oid toastOid = RelationGetToastRelid(heapRelation);
Relation toastRelation = table_open(toastOid,
ShareUpdateExclusiveLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c6f82e93b9..cf076dec5d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2140,7 +2140,7 @@ ExecuteTruncateGuts(List *explicit_rels,
/*
* The same for the toast table, if any.
*/
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
if (OidIsValid(toast_relid))
{
Relation toastrel = relation_open(toast_relid,
@@ -15161,10 +15161,10 @@ ATExecSetRelOptions(Relation rel, List *defList, AlterTableType operation,
ReleaseSysCache(tuple);
/* repeat the whole exercise for the toast table, if there's one */
- if (OidIsValid(rel->rd_rel->reltoastrelid))
+ if (OidIsValid(RelationGetToastRelid(rel)))
{
Relation toastrel;
- Oid toastid = rel->rd_rel->reltoastrelid;
+ Oid toastid = RelationGetToastRelid(rel);
toastrel = table_open(toastid, lockmode);
@@ -15253,7 +15253,7 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
return;
}
- reltoastrelid = rel->rd_rel->reltoastrelid;
+ reltoastrelid = RelationGetToastRelid(rel);
/* Fetch the list of indexes on toast relation if necessary */
if (OidIsValid(reltoastrelid))
{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ac8f5d9c25..08bd489aed 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2187,7 +2187,7 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
if ((params->options & VACOPT_PROCESS_TOAST) != 0 &&
((params->options & VACOPT_FULL) == 0 ||
(params->options & VACOPT_PROCESS_MAIN) == 0))
- toast_relid = rel->rd_rel->reltoastrelid;
+ toast_relid = RelationGetToastRelid(rel);
else
toast_relid = InvalidOid;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 22bcf171ff..c9d9590108 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4822,10 +4822,10 @@ ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
desc = RelationGetDescr(relation);
- toast_rel = RelationIdGetRelation(relation->rd_rel->reltoastrelid);
+ toast_rel = RelationIdGetRelation(RelationGetToastRelid(relation));
if (!RelationIsValid(toast_rel))
elog(ERROR, "could not open toast relation with OID %u (base relation \"%s\")",
- relation->rd_rel->reltoastrelid, RelationGetRelationName(relation));
+ RelationGetToastRelid(relation), RelationGetRelationName(relation));
toast_desc = RelationGetDescr(toast_rel);
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index e63e99c141..a3597be9bc 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -436,8 +436,8 @@ calculate_table_size(Relation rel)
/*
* Size of toast relation
*/
- if (OidIsValid(rel->rd_rel->reltoastrelid))
- size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
+ if (OidIsValid(RelationGetToastRelid(rel)))
+ size += calculate_toast_table_size(RelationGetToastRelid(rel));
return size;
}
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8700204953..6631b37d97 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -504,6 +504,12 @@ typedef struct ViewOptions
*/
#define RelationGetRelid(relation) ((relation)->rd_id)
+/*
+ * RelationGetToastRelid
+ * Returns the OID of the relation's TOAST table (or InvalidOid if none)
+ */
+#define RelationGetToastRelid(relation) ((relation)->rd_rel->reltoastrelid)
+
/*
* RelationGetNumberOfAttributes
* Returns the total number of attributes in a relation.
--
2.39.5 (Apple Git-154)
v4-0003-Assert-that-we-have-a-valid-snapshot-if-we-might-.patchtext/plain; charset=us-asciiDownload
From 187ef6ee2abe56f72ee9b32de6961227c4abff38 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 14 Oct 2024 14:53:36 -0500
Subject: [PATCH v4 3/3] Assert that we have a valid snapshot if we might need
TOAST access.
Suggested-by: Michael Paquier
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
src/backend/access/heap/heapam.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index da5e656a08..d819c58ff3 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -210,6 +210,27 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+#ifdef USE_ASSERT_CHECKING
+
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(RelationGetToastRelid(rel)))
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+
+#endif /* USE_ASSERT_CHECKING */
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -1995,6 +2016,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2272,6 +2295,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
Assert(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2694,6 +2719,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3189,6 +3216,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
--
2.39.5 (Apple Git-154)
v4-0001-Ensure-we-have-a-snapshot-when-updating-various-s.patchtext/plain; charset=us-asciiDownload
From b9444fd22d8dcc05e3a27817943be7f5296503fa Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 14 Oct 2024 14:43:26 -0500
Subject: [PATCH v4 1/3] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: FIXME
---
src/backend/commands/tablecmds.c | 13 +++++++++++
src/backend/replication/logical/tablesync.c | 21 ++++++++++++++++++
src/backend/replication/logical/worker.c | 24 +++++++++++++++++++++
3 files changed, 58 insertions(+)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1ccc80087c..c6f82e93b9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19283,9 +19283,22 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot. We only expect to get here without a snapshot
+ * in the concurrent path.
+ */
+ if (concurrent)
+ PushActiveSnapshot(GetTransactionSnapshot());
+ else
+ Assert(HaveRegisteredOrActiveSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ if (concurrent)
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e03e761392..d3fbb18438 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ /*
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1455,8 +1471,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 925dff9cc4..804e5dbb75 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4604,8 +4604,16 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4821,7 +4829,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4925,6 +4941,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4983,6 +5005,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
On Wed, Oct 16, 2024 at 10:54:45AM -0500, Nathan Bossart wrote:
Here is a new version of the patch set with this #ifdef added. I plan to
give each of the code paths adjusted by 0001 a closer look, but otherwise
I'm hoping to commit these soon.
0002 and 0003 look sane enough to me as shaped. I'd need to spend a
few more hours on 0001 if I were to do a second round on it, but you
don't really need a second opinion, do you? :D
--
Michael
On Fri, Oct 18, 2024 at 08:08:55AM +0900, Michael Paquier wrote:
0002 and 0003 look sane enough to me as shaped. I'd need to spend a
few more hours on 0001 if I were to do a second round on it, but you
don't really need a second opinion, do you? :D
I'll manage. 0001 was a doozy to back-patch, and this obviously isn't a
terribly pressing issue, so I plan to wait until after the November minor
release to apply this. I'm having second thoughts on 0002, so I'll
probably leave that one uncommitted, but IMHO we definitely need 0003 to
prevent this issue from reoccurring.
--
nathan
Attachments:
v5-0001-Ensure-we-have-a-snapshot-when-updating-va.patch.mastertext/plain; charset=us-asciiDownload
From 58eefc6c2f90ac7d1f23f5b5f3052042cf0e75d9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 2 +-
src/backend/commands/tablecmds.c | 8 +++++++
src/backend/postmaster/autovacuum.c | 7 ++++++
src/backend/replication/logical/tablesync.c | 21 ++++++++++++++++++
src/backend/replication/logical/worker.c | 24 +++++++++++++++++++++
5 files changed, 61 insertions(+), 1 deletion(-)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2f652463e3..fd86f28a35 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4230,7 +4230,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
false);
/*
- * Updating pg_index might involve TOAST table access, so ensure we
+ * Swapping the indexes might involve TOAST table access, so ensure we
* have a valid snapshot.
*/
PushActiveSnapshot(GetTransactionSnapshot());
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 4345b96de5..f1649419d6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19383,9 +19383,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index dc3cf87aba..e84285b644 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2209,6 +2209,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2221,6 +2227,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 118503fcb7..57e1eede41 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ /*
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1462,8 +1478,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 925dff9cc4..804e5dbb75 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4604,8 +4604,16 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4821,7 +4829,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4925,6 +4941,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4983,6 +5005,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v5-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v12text/plain; charset=us-asciiDownload
From cd8058dd2792f99a390cfb4794fc77921ececc51 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 8 ++++++++
src/backend/postmaster/autovacuum.c | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e3c61cd907..3168442ea4 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3477,12 +3477,20 @@ ReindexRelationConcurrently(Oid relationOid, int options)
get_rel_namespace(heapId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newIndexId, oldIndexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index dd5050dad8..a6695b74ff 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2296,6 +2296,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2308,6 +2314,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
--
2.39.5 (Apple Git-154)
v5-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v13text/plain; charset=us-asciiDownload
From 61c70510702c50e284ab5ed4db6971e3d5a7e836 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 8 ++++++++
src/backend/postmaster/autovacuum.c | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 13fe116f71..9accf86503 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3512,12 +3512,20 @@ ReindexRelationConcurrently(Oid relationOid, int options)
get_rel_namespace(heapId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newIndexId, oldIndexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 94828de9fe..a229741448 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2292,6 +2292,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2304,6 +2310,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
--
2.39.5 (Apple Git-154)
v5-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v14text/plain; charset=us-asciiDownload
From 51d43367aef06d43c14fab9985a55fd4dbb67bc2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 8 ++++++++
src/backend/commands/tablecmds.c | 8 ++++++++
src/backend/postmaster/autovacuum.c | 7 +++++++
src/backend/replication/logical/tablesync.c | 13 +++++++++++++
4 files changed, 36 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 99c536891a..506039ba5a 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3998,12 +3998,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params)
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6dd4e88d48..d3ddb0c873 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18200,9 +18200,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 192837bfa3..de84c67ad9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2319,6 +2319,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2331,6 +2337,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 26b71dee67..852ce4f41f 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -451,6 +451,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -470,6 +476,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1092,8 +1100,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
--
2.39.5 (Apple Git-154)
v5-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v15text/plain; charset=us-asciiDownload
From b827d58167498f0a5dfcf95e84fe297bdeb7b1ad Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 8 +++++++
src/backend/commands/tablecmds.c | 8 +++++++
src/backend/postmaster/autovacuum.c | 7 ++++++
src/backend/replication/logical/tablesync.c | 13 +++++++++++
src/backend/replication/logical/worker.c | 24 +++++++++++++++++++++
5 files changed, 60 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 157455c52b..99ba76fca3 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3986,12 +3986,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params)
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1402755c36..0b80c64390 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18769,9 +18769,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 356678b030..f93822cb5b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2298,6 +2298,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2310,6 +2316,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e6159acba0..2314155338 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -458,6 +458,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -477,6 +483,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1387,8 +1395,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index dcc3fdf6c7..155bf7c44e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3794,8 +3794,16 @@ ApplyWorkerMain(Datum main_arg)
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -3848,7 +3856,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Notify the subscription has been disabled and exit */
@@ -3939,6 +3955,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -3997,6 +4019,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v5-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v16text/plain; charset=us-asciiDownload
From 132fe7048627a6d59e7b4310b03425ce8119893a Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 8 +++++++
src/backend/commands/tablecmds.c | 8 +++++++
src/backend/postmaster/autovacuum.c | 7 ++++++
src/backend/replication/logical/tablesync.c | 21 ++++++++++++++++++
src/backend/replication/logical/worker.c | 24 +++++++++++++++++++++
5 files changed, 68 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 9afd10ec4e..f1d4825e33 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4057,12 +4057,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params)
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0e47f1a457..65103700e5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18748,9 +18748,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 7dd9345c61..669e0fb004 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2355,6 +2355,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2367,6 +2373,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7d4ee48206..faa467df7b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -376,6 +376,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ /*
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -386,6 +392,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -482,6 +490,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -499,6 +513,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1442,8 +1458,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 18f86c73bd..c37eac9e72 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4715,8 +4715,16 @@ ApplyWorkerMain(Datum main_arg)
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4769,7 +4777,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4873,6 +4889,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4931,6 +4953,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v5-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v17text/plain; charset=us-asciiDownload
From ef3a3a3aeab379540f894762e798caae3bbc29a7 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 30 Oct 2024 15:42:29 -0500
Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system
catalogs.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 12
---
src/backend/commands/indexcmds.c | 8 +++++++
src/backend/commands/tablecmds.c | 8 +++++++
src/backend/postmaster/autovacuum.c | 7 ++++++
src/backend/replication/logical/tablesync.c | 21 ++++++++++++++++++
src/backend/replication/logical/worker.c | 24 +++++++++++++++++++++
5 files changed, 68 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3d8df046da..654f9262b1 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4087,12 +4087,20 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 36717ffcb0..e94e88652e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19227,9 +19227,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8f27026d19..a27ae14b72 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2201,6 +2201,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2213,6 +2219,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index b00267f042..e1e5d80997 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
+ /*
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Drop the tablesync's origin tracking if exists.
*
@@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_replication_origin might involve TOAST table
+ * access, so ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Remove the tablesync origin tracking if exists.
*
@@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
sizeof(originname));
replorigin_drop_by_name(originname, true, false);
+ PopActiveSnapshot();
+
/*
* Update the state to READY only after the origin cleanup.
*/
@@ -1454,8 +1470,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
* Then advance to the LSN got from walrcv_create_slot. This is WAL
* logged for the purpose of recovery. Locks are to prevent the
* replication origin from vanishing while advancing.
+ *
+ * Updating pg_replication_origin might involve TOAST table access, so
+ * ensure we have a valid snapshot.
*/
+ PushActiveSnapshot(GetTransactionSnapshot());
originid = replorigin_create(originname);
+ PopActiveSnapshot();
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d091a1dd27..95b680fde4 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4529,8 +4529,16 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4746,7 +4754,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4850,6 +4866,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4908,6 +4930,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
On Wed, Oct 30, 2024 at 03:54:32PM -0500, Nathan Bossart wrote:
I'll manage. 0001 was a doozy to back-patch, and this obviously isn't a
terribly pressing issue, so I plan to wait until after the November minor
release to apply this.
Okay by me.
I'm having second thoughts on 0002, so I'll
probably leave that one uncommitted, but IMHO we definitely need 0003 to
prevent this issue from reoccurring.
I liked your idea behind 0002, FWIW. We do that for a lot of fields
already in a Relation.
--
Michael
On Thu, Oct 31, 2024 at 08:45:15AM +0900, Michael Paquier wrote:
On Wed, Oct 30, 2024 at 03:54:32PM -0500, Nathan Bossart wrote:
I'll manage. 0001 was a doozy to back-patch, and this obviously isn't a
terribly pressing issue, so I plan to wait until after the November minor
release to apply this.Okay by me.
I took another look at 0001, and I think it still needs a bit of work.
Specifically, IIUC a few of these code paths are actually fine since they
do not ever touch a varlena column. For example,
clear_subscription_skip_lsn() only updates subskiplsn, so unless I am
missing a corner case, there is no risk of trying to access a TOAST table
without a snapshot. Allowing such cases would involve adjusting the new
assertions in 0003 to check for only the varlena columns during
inserts/updates, which is more complicated but seems doable.
Or we could just enforce that you have an active snapshot whenever you
modify a catalog with a TOAST table. That's simpler, but it requires extra
work in some paths (and probably comments to point out that we're only
pushing an active snapshot to satisfy an assertion).
I'm having second thoughts on 0002, so I'll
probably leave that one uncommitted, but IMHO we definitely need 0003 to
prevent this issue from reoccurring.I liked your idea behind 0002, FWIW. We do that for a lot of fields
already in a Relation.
Alright, then I'll plan on proceeding with it.
--
nathan
On Mon, Nov 25, 2024 at 01:29:31PM -0600, Nathan Bossart wrote:
Or we could just enforce that you have an active snapshot whenever you
modify a catalog with a TOAST table. That's simpler, but it requires extra
work in some paths (and probably comments to point out that we're only
pushing an active snapshot to satisfy an assertion).
I may be wrong, but I suspect that enforcing the check without being
column-based is the right way to go and that this is going to catch
more errors in the long-term than being a maintenance burden. So I
would keep the snapshot check even if it's a bit aggressive, still
it's useful. And we are not talking about that may code paths that
need to be switched to require a snapshot, as well. Most of the ones
you have mentioned on this thread are really particular in the ways
they do transaction handling. I suspect that it may also catch
out-of-core issues with extensions doing direct catalog manipulations.
--
Michael
On Wed, Nov 27, 2024 at 03:20:19PM +0900, Michael Paquier wrote:
On Mon, Nov 25, 2024 at 01:29:31PM -0600, Nathan Bossart wrote:
Or we could just enforce that you have an active snapshot whenever you
modify a catalog with a TOAST table. That's simpler, but it requires extra
work in some paths (and probably comments to point out that we're only
pushing an active snapshot to satisfy an assertion).I may be wrong, but I suspect that enforcing the check without being
column-based is the right way to go and that this is going to catch
more errors in the long-term than being a maintenance burden. So I
would keep the snapshot check even if it's a bit aggressive, still
it's useful. And we are not talking about that may code paths that
need to be switched to require a snapshot, as well. Most of the ones
you have mentioned on this thread are really particular in the ways
they do transaction handling. I suspect that it may also catch
out-of-core issues with extensions doing direct catalog manipulations.
That is useful feedback, thanks.
One other thing that caught my eye is that replorigin_create() uses
SnapshotDirty, so I'm unsure if
PushActiveSnapshot(GetTransactionSnapshot()) is correct there. The only
other example I found is RelationFindReplTupleByIndex(), which uses
GetLatestSnapshot(). But I do see that CreateSubscription() calls
replorigin_create(), and it seems to rely on the transaction snapshot, so
maybe it's okay, at least for the purpose of TOAST table access... I'm
finding myself wishing there was a bit more commentary about the proper
usage of these snapshot functions. Maybe I can try to add some as a
follow-up exercise.
--
nathan
On Wed, 27 Nov 2024 at 21:29, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Nov 27, 2024 at 03:20:19PM +0900, Michael Paquier wrote:
On Mon, Nov 25, 2024 at 01:29:31PM -0600, Nathan Bossart wrote:
Or we could just enforce that you have an active snapshot whenever you
modify a catalog with a TOAST table. That's simpler, but it requires extra
work in some paths (and probably comments to point out that we're only
pushing an active snapshot to satisfy an assertion).I may be wrong, but I suspect that enforcing the check without being
column-based is the right way to go and that this is going to catch
more errors in the long-term than being a maintenance burden. So I
would keep the snapshot check even if it's a bit aggressive, still
it's useful. And we are not talking about that may code paths that
need to be switched to require a snapshot, as well. Most of the ones
you have mentioned on this thread are really particular in the ways
they do transaction handling. I suspect that it may also catch
out-of-core issues with extensions doing direct catalog manipulations.That is useful feedback, thanks.
Michael's feedback from [1]/messages/by-id/Z0a6IwjW36af71J7@paquier.xyz is still pending, so I'm updating the
status to "Waiting on Author." Please provide an updated patch and
change the status back to "Needs Review".
[1]: /messages/by-id/Z0a6IwjW36af71J7@paquier.xyz
Regards,
Vignesh
On Wed, Nov 27, 2024 at 9:29 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Nov 27, 2024 at 03:20:19PM +0900, Michael Paquier wrote:
On Mon, Nov 25, 2024 at 01:29:31PM -0600, Nathan Bossart wrote:
Or we could just enforce that you have an active snapshot whenever you
modify a catalog with a TOAST table. That's simpler, but it requires extra
work in some paths (and probably comments to point out that we're only
pushing an active snapshot to satisfy an assertion).I may be wrong, but I suspect that enforcing the check without being
column-based is the right way to go and that this is going to catch
more errors in the long-term than being a maintenance burden. So I
would keep the snapshot check even if it's a bit aggressive, still
it's useful. And we are not talking about that may code paths that
need to be switched to require a snapshot, as well. Most of the ones
you have mentioned on this thread are really particular in the ways
they do transaction handling. I suspect that it may also catch
out-of-core issues with extensions doing direct catalog manipulations.That is useful feedback, thanks.
One other thing that caught my eye is that replorigin_create() uses
SnapshotDirty, so I'm unsure if
PushActiveSnapshot(GetTransactionSnapshot()) is correct there.
Can we dodge adding this push call if we restrict the length of the
origin name to some reasonable limit like 256 or 512 and avoid the
need of toast altogether?
--
With Regards,
Amit Kapila.
On Fri, Apr 04, 2025 at 05:16:43PM +0530, Amit Kapila wrote:
Can we dodge adding this push call if we restrict the length of the
origin name to some reasonable limit like 256 or 512 and avoid the
need of toast altogether?
We did consider just removing pg_replication_origin's TOAST table earlier
[0]: /messages/by-id/ZvOMBhlb5zrBCG5p@paquier.xyz
reconsidering...
[0]: /messages/by-id/ZvOMBhlb5zrBCG5p@paquier.xyz
--
nathan
On Fri, Apr 4, 2025 at 7:58 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Fri, Apr 04, 2025 at 05:16:43PM +0530, Amit Kapila wrote:
Can we dodge adding this push call if we restrict the length of the
origin name to some reasonable limit like 256 or 512 and avoid the
need of toast altogether?We did consider just removing pg_replication_origin's TOAST table earlier
[0], but we decided against it at that time. Maybe it's worth
reconsidering...
I don't see any good reason in that email for not removing the TOAST
table for pg_replication_origin. I would prefer to remove it rather
than add protection related to its access.
--
With Regards,
Amit Kapila.
On Tue, Apr 08, 2025 at 04:44:02PM +0530, Amit Kapila wrote:
On Fri, Apr 4, 2025 at 7:58 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Fri, Apr 04, 2025 at 05:16:43PM +0530, Amit Kapila wrote:
Can we dodge adding this push call if we restrict the length of the
origin name to some reasonable limit like 256 or 512 and avoid the
need of toast altogether?We did consider just removing pg_replication_origin's TOAST table earlier
[0], but we decided against it at that time. Maybe it's worth
reconsidering...I don't see any good reason in that email for not removing the TOAST
table for pg_replication_origin. I would prefer to remove it rather
than add protection related to its access.
The only reason I can think of is that folks might have existing
replication origins with extremely long names that would cause upgrades to
fail. While I think it would be probably be okay to remove the TOAST table
and restrict origin names to 512 bytes (like we did for password hashes in
commit 8275325), folks routinely complain about NAMEDATALEN, so I think we
should be cautious here.
--
nathan
On Tue, Apr 8, 2025, at 5:25 PM, Nathan Bossart wrote:
On Tue, Apr 08, 2025 at 04:44:02PM +0530, Amit Kapila wrote:
On Fri, Apr 4, 2025 at 7:58 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Fri, Apr 04, 2025 at 05:16:43PM +0530, Amit Kapila wrote:
Can we dodge adding this push call if we restrict the length of the
origin name to some reasonable limit like 256 or 512 and avoid the
need of toast altogether?We did consider just removing pg_replication_origin's TOAST table earlier
[0], but we decided against it at that time. Maybe it's worth
reconsidering...I don't see any good reason in that email for not removing the TOAST
table for pg_replication_origin. I would prefer to remove it rather
than add protection related to its access.The only reason I can think of is that folks might have existing
replication origins with extremely long names that would cause upgrades to
fail. While I think it would be probably be okay to remove the TOAST table
and restrict origin names to 512 bytes (like we did for password hashes in
commit 8275325), folks routinely complain about NAMEDATALEN, so I think we
should be cautious here.
The logical replication creates origin names as pg_SUBOID_RELID or pg_SUBOID.
It means the maximum origin name is 24. This limited origin name also applies
to pglogical that limits the name to 54 IIRC. I think that covers the majority
of the logical replication setups. There might be a small number of custom
logical replication systems that possibly use long names for replication
origin. I've never seen a replication origin name longer than NAMEDATALEN.
If you consider that the maximum number of replication origin is limited to 2
bytes (65k distinct names), it is reasonable to restrict the replication
origin names to 512 due to the high number of combinations. We generally
expects that a catalog string uses "name" as type if it is an identifier; it
could be the case for roname if author decided to be strict.
This additional TOAST table has no or rare use. +1 for removing it. It is one
less file, one less table and one less index; in short, one less source of data
corruption. ;)
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Tue, Apr 08, 2025 at 11:21:31PM -0300, Euler Taveira wrote:
The logical replication creates origin names as pg_SUBOID_RELID or pg_SUBOID.
It means the maximum origin name is 24. This limited origin name also applies
to pglogical that limits the name to 54 IIRC. I think that covers the majority
of the logical replication setups. There might be a small number of custom
logical replication systems that possibly use long names for replication
origin. I've never seen a replication origin name longer than NAMEDATALEN.
pg_replication_origin_create() can be used with text as input for the
origin name, still your argument sounds sensible here as I would
suspect that most setups of logical replication are these.
If you consider that the maximum number of replication origin is limited to 2
bytes (65k distinct names), it is reasonable to restrict the replication
origin names to 512 due to the high number of combinations. We generally
expects that a catalog string uses "name" as type if it is an identifier; it
could be the case for roname if author decided to be strict.
I would be more cautious than a limit on NAMEDATALEN. The restriction
suggested by Nathan at 512 bytes should be plenty enough.
This additional TOAST table has no or rare use. +1 for removing it. It is one
less file, one less table and one less index; in short, one less source of data
corruption. ;)
I guess that's the consensus, then. No objections to the removal here.
--
Michael
On Wed, Apr 09, 2025 at 01:20:29PM +0900, Michael Paquier wrote:
I guess that's the consensus, then. No objections to the removal here.
That would look like the attached. There are still a couple of other known
TOAST snapshot issues I need to revisit, but this sidesteps a few of them.
But this'll need to wait for a couple of months until v19 development
begins.
--
nathan
Attachments:
v1-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From 932a7ff2623c3185f6f98194233d3d99807909fe Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 9 Apr 2025 14:00:31 -0500
Subject: [PATCH v1 1/1] Remove pg_replication_origin's TOAST table.
---
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 12 ++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 +++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 6 +++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 3 +++
8 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a6edf614606..a7eb60dd2d2 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -315,8 +315,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..22b4aa0b382 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -264,6 +264,18 @@ replorigin_create(const char *roname)
SysScanDesc scan;
ScanKeyData key;
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(roname) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Repilcation origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roname_d = CStringGetTextDatum(roname);
Assert(IsTransactionState());
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..095b3e6fa50 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN (512)
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..db1431a87f7 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create(repeat('0123456789abcdef', 33));
+ERROR: replication origin name is too long
+DETAIL: Repilcation origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..0cb58311329 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +64,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..7122ff6133d 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create(repeat('0123456789abcdef', 33));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..545b2b48989 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
On Wed, Apr 9, 2025, at 4:05 PM, Nathan Bossart wrote:
On Wed, Apr 09, 2025 at 01:20:29PM +0900, Michael Paquier wrote:
I guess that's the consensus, then. No objections to the removal here.
That would look like the attached. There are still a couple of other known
TOAST snapshot issues I need to revisit, but this sidesteps a few of them.
But this'll need to wait for a couple of months until v19 development
begins.
LGTM. I have a few suggestions.
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(roname) > MAX_RONAME_LEN)
+ ereport(ERROR,
I wouldn't duplicate the comment. Instead, I would keep it only in origin.h.
+ errdetail("Repilcation origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
s/Repilcation/Replication/
+#define MAX_RONAME_LEN (512)
It is just a cosmetic suggestion but I usually use parentheses when it is an
expression but don't include it for a single number.
Should we document this maximum length?
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Apr 09, 2025 at 08:54:21PM -0300, Euler Taveira wrote:
LGTM. I have a few suggestions.
Thanks for reviewing.
+ /* + * To avoid needing a TOAST table for pg_replication_origin, we restrict + * replication origin names to 512 bytes. This should be more than enough + * for all practical use. + */ + if (strlen(roname) > MAX_RONAME_LEN) + ereport(ERROR,I wouldn't duplicate the comment. Instead, I would keep it only in origin.h.
Hm. I agree that duplicating the comment isn't great, but I'm also not
wild about forcing readers to jump to the macro definition to figure out
why there is a length restriction.
+ errdetail("Repilcation origin names must be no longer than %d bytes.", + MAX_RONAME_LEN)));s/Repilcation/Replication/
Fixed.
+#define MAX_RONAME_LEN (512)
It is just a cosmetic suggestion but I usually use parentheses when it is an
expression but don't include it for a single number.
We use both styles, but the no-parentheses style does seem to be preferred.
$ grep -E "^#define\s[A-Z_]+\s\([0-9]+\)$" src/* -rI | wc -l
23
$ grep -E "^#define\s[A-Z_]+\s[0-9]+$" src/* -rI | wc -l
861
Should we document this maximum length?
I've added a note.
--
nathan
Attachments:
v2-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From 6762a3786b76379c82ccfa4c9da1812e928179b5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 9 Apr 2025 14:00:31 -0500
Subject: [PATCH v2 1/1] Remove pg_replication_origin's TOAST table.
---
doc/src/sgml/func.sgml | 1 +
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 12 ++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 +++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 6 +++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 3 +++
9 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1c5cfee25d1..a9c5c855524 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29941,6 +29941,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
</row>
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index a6edf614606..a7eb60dd2d2 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -315,8 +315,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..c17c0348c6e 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -264,6 +264,18 @@ replorigin_create(const char *roname)
SysScanDesc scan;
ScanKeyData key;
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(roname) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Replication origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roname_d = CStringGetTextDatum(roname);
Assert(IsTransactionState());
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..29724c0dcae 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN 512
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..d28b69e50ad 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create(repeat('0123456789abcdef', 33));
+ERROR: replication origin name is too long
+DETAIL: Replication origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..0cb58311329 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +64,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..7122ff6133d 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create(repeat('0123456789abcdef', 33));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..545b2b48989 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
Apparently replication origins in tests are supposed to be prefixed with
"regress_". Here is a new patch with that fixed.
--
nathan
Attachments:
v3-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From fa6a8a5a155a7beb50e54bfe42c81be26ffd428a Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 21 Apr 2025 10:53:15 -0500
Subject: [PATCH v3 1/1] Remove pg_replication_origin's TOAST table.
---
doc/src/sgml/func.sgml | 1 +
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 12 ++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 +++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 6 +++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 3 +++
9 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 574a544d9fa..593e45495be 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29941,6 +29941,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
</row>
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 60000bd0bc7..59caae8f1bc 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -346,8 +346,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..c17c0348c6e 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -264,6 +264,18 @@ replorigin_create(const char *roname)
SysScanDesc scan;
ScanKeyData key;
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(roname) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Replication origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roname_d = CStringGetTextDatum(roname);
Assert(IsTransactionState());
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..29724c0dcae 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN 512
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..3de32c3c7ef 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
+ERROR: replication origin name is too long
+DETAIL: Replication origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..0cb58311329 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +64,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..ea72a62dab7 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..545b2b48989 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
On Mon, Apr 21, 2025 at 9:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
Apparently replication origins in tests are supposed to be prefixed with
"regress_". Here is a new patch with that fixed.
Hi Nathan,
Thanks for the patch! I tested it for functionality and here are a few comments:
1) While testing, I noticed an unexpected behavior with the new 512
byte length restriction in place. Since there’s no explicit
restriction on the column roname’s type, it’s possible to insert an
origin name exceeding 512 bytes. For instance, can use the COPY
command -- similar to how pg_dump might dump and restore catalog
tables.
For example:
postgres=# COPY pg_catalog.pg_replication_origin (roident, roname) FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
44 regress_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa... [>512 bytes string]
\.
COPY 1
Once inserted, I was able to use the pg_replication_origin_xxx
functions with this origin name without any errors.
Not sure how common pg_dump/restore of catalog tables is in real
systems, but should we still consider if this behavior is acceptable?
~~~
Below are a few cosmetic suggestions you might consider.
2)
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
Would it make sense to make the phrasing a bit more formal, perhaps
something like:
“The name must not exceed 512 bytes.”?
3) For the code comments -
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(roname) > MAX_RONAME_LEN)
a) /bytes. This/bytes. This/ (there is an extra space before “This”)
b) /all practical use/all practical uses/
c) Both (a) and (b) above, also apply to the comment above the macro
“MAX_RONAME_LEN”.
4) The "Detail" part of the error message could be made a bit more
formal and precise -
Current:
DETAIL: Replication origin names must be no longer than 512 bytes.
Suggestion:
DETAIL: Replication origin names must not exceed 512 bytes.
5) As Euler pointed out, logical replication already has a built-in
restriction for internally assigned origin names, limiting them to
NAMEDATALEN. Given that, only the SQL function
pg_replication_origin_create() is capable of creating longer origin
names. Would it make sense to consider moving the check into
pg_replication_origin_create() itself, since it seems redundant for
all other callers?
That said, if there's a possibility of future callers needing this
restriction at a lower level, it may be reasonable to leave it as is.
--
Thanks,
Nisha
On Tue, Apr 22, 2025 at 05:17:17PM +0530, Nisha Moond wrote:
Thanks for the patch! I tested it for functionality and here are a few comments:
Thank you for reviewing.
1) While testing, I noticed an unexpected behavior with the new 512
byte length restriction in place. Since there�s no explicit
restriction on the column roname�s type, it�s possible to insert an
origin name exceeding 512 bytes. For instance, can use the COPY
command -- similar to how pg_dump might dump and restore catalog
tables.For example:
postgres=# COPY pg_catalog.pg_replication_origin (roident, roname) FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.44 regress_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa... [>512 bytes string]
\.COPY 1
Once inserted, I was able to use the pg_replication_origin_xxx
functions with this origin name without any errors.
Not sure how common pg_dump/restore of catalog tables is in real
systems, but should we still consider if this behavior is acceptable?
I'm personally not too worried about this. The limit is primarily there to
provide a nicer ERROR than "row is too big", which AFAICT is the worst-case
scenario if you go to the trouble of setting allow_system_table_mods and
modifying shared catalogs directly.
5) As Euler pointed out, logical replication already has a built-in
restriction for internally assigned origin names, limiting them to
NAMEDATALEN. Given that, only the SQL function
pg_replication_origin_create() is capable of creating longer origin
names. Would it make sense to consider moving the check into
pg_replication_origin_create() itself, since it seems redundant for
all other callers?
That said, if there's a possibility of future callers needing this
restriction at a lower level, it may be reasonable to leave it as is.
pg_replication_origin_create() might be a better place. After all, that's
where we're enforcing the name doesn't start with "pg_" and, for testing,
_does_ start with "regress_". Plus, it seems highly unlikely that any
other callers of replorigin_create() will ever provide names longer than
512 bytes.
--
nathan
On Tue, Apr 22, 2025 at 12:21:01PM -0500, Nathan Bossart wrote:
On Tue, Apr 22, 2025 at 05:17:17PM +0530, Nisha Moond wrote:
5) As Euler pointed out, logical replication already has a built-in
restriction for internally assigned origin names, limiting them to
NAMEDATALEN. Given that, only the SQL function
pg_replication_origin_create() is capable of creating longer origin
names. Would it make sense to consider moving the check into
pg_replication_origin_create() itself, since it seems redundant for
all other callers?
That said, if there's a possibility of future callers needing this
restriction at a lower level, it may be reasonable to leave it as is.pg_replication_origin_create() might be a better place. After all, that's
where we're enforcing the name doesn't start with "pg_" and, for testing,
_does_ start with "regress_". Plus, it seems highly unlikely that any
other callers of replorigin_create() will ever provide names longer than
512 bytes.
Here is a new version of the patch with this change.
--
nathan
Attachments:
v4-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From 1ca6f9c2a4a3c9917bb4846a77d0b7eaf936b8b8 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 23 Apr 2025 16:21:22 -0500
Subject: [PATCH v4 1/1] Remove pg_replication_origin's TOAST table.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
doc/src/sgml/func.sgml | 1 +
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 12 ++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 +++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 6 +++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 3 +++
9 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 574a544d9fa..593e45495be 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29941,6 +29941,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
</row>
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 60000bd0bc7..59caae8f1bc 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -346,8 +346,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..e9629350cb0 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -1296,6 +1296,18 @@ pg_replication_origin_create(PG_FUNCTION_ARGS)
elog(WARNING, "replication origins created by regression test cases should have names starting with \"regress_\"");
#endif
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(name) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Replication origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roident = replorigin_create(name);
pfree(name);
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..29724c0dcae 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN 512
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..3de32c3c7ef 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
+ERROR: replication origin name is too long
+DETAIL: Replication origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..0cb58311329 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +64,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..ea72a62dab7 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..545b2b48989 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,9 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since its toast table would only be used for
+-- replication origin names, which we can avoid with a reasonable length
+-- restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
Here is a new version of the patch with this change.
I don't see any comments in this patch that capture the real
reason for removing pg_replication_origin's TOAST table,
namely (IIUC) that we'd like to be able to access that catalog
without a snapshot. I think it's important to record that
knowledge, because otherwise somebody is likely to think they
can undo this change for $whatever-reason.
regards, tom lane
On Wed, Apr 23, 2025 at 05:33:41PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Here is a new version of the patch with this change.
I don't see any comments in this patch that capture the real
reason for removing pg_replication_origin's TOAST table,
namely (IIUC) that we'd like to be able to access that catalog
without a snapshot. I think it's important to record that
knowledge, because otherwise somebody is likely to think they
can undo this change for $whatever-reason.
D'oh, yes, I'd better add that.
--
nathan
On Wed, Apr 23, 2025 at 04:44:51PM -0500, Nathan Bossart wrote:
On Wed, Apr 23, 2025 at 05:33:41PM -0400, Tom Lane wrote:
I don't see any comments in this patch that capture the real
reason for removing pg_replication_origin's TOAST table,
namely (IIUC) that we'd like to be able to access that catalog
without a snapshot. I think it's important to record that
knowledge, because otherwise somebody is likely to think they
can undo this change for $whatever-reason.D'oh, yes, I'd better add that.
I updated the comment atop the check in the misc_sanity test for system
tables with varlena columns but no toast table. If you were to reintroduce
pg_replication_origin's toast table, you'd have to at least fix the
expected output for this test, so the comment theoretically has a higher
chance of being seen.
--
nathan
Attachments:
v5-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From 2d7e89e4ec49b1156b33b11219dbf403aefc5466 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 28 Apr 2025 16:27:33 -0500
Subject: [PATCH v5 1/1] Remove pg_replication_origin's TOAST table.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
doc/src/sgml/func.sgml | 1 +
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 12 ++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 +++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 7 ++++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 4 ++++
9 files changed, 37 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 574a544d9fa..593e45495be 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29941,6 +29941,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
</row>
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 60000bd0bc7..59caae8f1bc 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -346,8 +346,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..e9629350cb0 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -1296,6 +1296,18 @@ pg_replication_origin_create(PG_FUNCTION_ARGS)
elog(WARNING, "replication origins created by regression test cases should have names starting with \"regress_\"");
#endif
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(name) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Replication origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roident = replorigin_create(name);
pfree(name);
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..29724c0dcae 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN 512
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..3de32c3c7ef 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
+ERROR: replication origin name is too long
+DETAIL: Replication origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..b0b76fe3d9a 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,10 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since we want to be able to access that catalog
+-- without setting up a snapshot. As of this writing, its toast table would
+-- only be used for replication origin names, which we can avoid with a
+-- reasonable length restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +65,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..ea72a62dab7 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..86c193a0c42 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,10 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since we want to be able to access that catalog
+-- without setting up a snapshot. As of this writing, its toast table would
+-- only be used for replication origin names, which we can avoid with a
+-- reasonable length restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
On Wed, Apr 23, 2025 at 04:44:51PM -0500, Nathan Bossart wrote:
On Wed, Apr 23, 2025 at 05:33:41PM -0400, Tom Lane wrote:
I don't see any comments in this patch that capture the real
reason for removing pg_replication_origin's TOAST table,
namely (IIUC) that we'd like to be able to access that catalog
without a snapshot.
I updated the comment atop the check in the misc_sanity test for system
tables with varlena columns but no toast table. If you were to reintroduce
pg_replication_origin's toast table, you'd have to at least fix the
expected output for this test, so the comment theoretically has a higher
chance of being seen.
Possibly better idea: can we add something like
Assert(!OidIsValid(reltoastrelid)) in the code that is making this
assumption?
regards, tom lane
On Mon, Apr 28, 2025 at 05:35:08PM -0400, Tom Lane wrote:
Possibly better idea: can we add something like
Assert(!OidIsValid(reltoastrelid)) in the code that is making this
assumption?
Yeah, we could add something like that to replorigin_create() pretty
easily. The comment for that would be even harder to miss.
--
nathan
Attachments:
v6-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From 39f870e69aa0cb08c0ef653b60ff07fa020c6bba Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 28 Apr 2025 16:27:33 -0500
Subject: [PATCH v6 1/1] Remove pg_replication_origin's TOAST table.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
doc/src/sgml/func.sgml | 1 +
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 22 ++++++++++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 +++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 7 ++++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 4 ++++
9 files changed, 47 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 574a544d9fa..593e45495be 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29941,6 +29941,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
</row>
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 60000bd0bc7..59caae8f1bc 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -346,8 +346,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..bd827bc6ed0 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -287,6 +287,16 @@ replorigin_create(const char *roname)
rel = table_open(ReplicationOriginRelationId, ExclusiveLock);
+ /*
+ * We want to be able to access pg_replication_origin without setting up a
+ * snapshot. As of this writing, its TOAST table would only be used for
+ * replication origin names, which we avoid with a reasonable length
+ * restriction. If you add a TOAST table to this catalog, be sure to set
+ * up a snapshot everywhere it might be needed. For more information, see
+ * https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan.
+ */
+ Assert(!OidIsValid(rel->rd_rel->reltoastrelid));
+
for (roident = InvalidOid + 1; roident < PG_UINT16_MAX; roident++)
{
bool nulls[Natts_pg_replication_origin];
@@ -1296,6 +1306,18 @@ pg_replication_origin_create(PG_FUNCTION_ARGS)
elog(WARNING, "replication origins created by regression test cases should have names starting with \"regress_\"");
#endif
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(name) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Replication origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roident = replorigin_create(name);
pfree(name);
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..29724c0dcae 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN 512
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..3de32c3c7ef 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
+ERROR: replication origin name is too long
+DETAIL: Replication origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..b0b76fe3d9a 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,10 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since we want to be able to access that catalog
+-- without setting up a snapshot. As of this writing, its toast table would
+-- only be used for replication origin names, which we can avoid with a
+-- reasonable length restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +65,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..ea72a62dab7 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..86c193a0c42 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,10 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since we want to be able to access that catalog
+-- without setting up a snapshot. As of this writing, its toast table would
+-- only be used for replication origin names, which we can avoid with a
+-- reasonable length restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
On Mon, Apr 28, 2025 at 05:35:08PM -0400, Tom Lane wrote:
Possibly better idea: can we add something like
Assert(!OidIsValid(reltoastrelid)) in the code that is making this
assumption?
Yeah, we could add something like that to replorigin_create() pretty
easily. The comment for that would be even harder to miss.
Maybe one more sentence in the code comment so that the logic is
easier to follow:
+ /*
+ * We want to be able to access pg_replication_origin without setting up a
+ * snapshot. To make that safe, it needs to not have a TOAST table,
+ * since TOASTed data cannot be fetched without a snapshot. As of this
+ * writing, ...
LGTM other than that nit.
regards, tom lane
On Mon, Apr 28, 2025 at 06:19:19PM -0400, Tom Lane wrote:
LGTM other than that nit.
Thanks. I'll stash this away for July unless someone wants to argue that
it's fair game for v18. IMHO this isn't nearly urgent enough for that, and
the bugs will continue to exist on older major versions regardless.
--
nathan
Attachments:
v7-0001-Remove-pg_replication_origin-s-TOAST-table.patchtext/plain; charset=us-asciiDownload
From 4c015dd4ccc09f193eccb5d447d01d6f5c378397 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 29 Apr 2025 12:19:15 -0500
Subject: [PATCH v7 1/1] Remove pg_replication_origin's TOAST table.
A few places that access this catalog do not set up an active
snapshot before potentially accessing its TOAST table, which is
unsafe. However, roname (the replication origin name) is the only
varlena column, so this is only ever a problem if the name requires
out-of-line storage, which seems unlikely. This commit removes its
TOAST table to avoid needing to set up a snapshot, and it
establishes a length restriction of 512 bytes for replication
origin names. Even without this restriction, names that would
require out-of-line storage would fail with a "row is too big"
error; the additional length restriction just provides a more
specific error message.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
---
doc/src/sgml/func.sgml | 1 +
src/backend/catalog/catalog.c | 2 --
src/backend/replication/logical/origin.c | 24 ++++++++++++++++++++
src/include/catalog/pg_replication_origin.h | 2 --
src/include/replication/origin.h | 7 ++++++
src/test/regress/expected/misc_functions.out | 4 ++++
src/test/regress/expected/misc_sanity.out | 8 ++++++-
src/test/regress/sql/misc_functions.sql | 3 +++
src/test/regress/sql/misc_sanity.sql | 5 ++++
9 files changed, 51 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 574a544d9fa..593e45495be 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29941,6 +29941,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<para>
Creates a replication origin with the given external
name, and returns the internal ID assigned to it.
+ The name must be no longer than 512 bytes.
</para></entry>
</row>
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 60000bd0bc7..59caae8f1bc 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -346,8 +346,6 @@ IsSharedRelation(Oid relationId)
relationId == PgDbRoleSettingToastIndex ||
relationId == PgParameterAclToastTable ||
relationId == PgParameterAclToastIndex ||
- relationId == PgReplicationOriginToastTable ||
- relationId == PgReplicationOriginToastIndex ||
relationId == PgShdescriptionToastTable ||
relationId == PgShdescriptionToastIndex ||
relationId == PgShseclabelToastTable ||
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 6583dd497da..9e5a09e63e1 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -287,6 +287,18 @@ replorigin_create(const char *roname)
rel = table_open(ReplicationOriginRelationId, ExclusiveLock);
+ /*
+ * We want to be able to access pg_replication_origin without setting up a
+ * snapshot. To make that safe, it needs to not have a TOAST table, since
+ * TOASTed data cannot be fetched without a snapshot. As of this writing,
+ * its TOAST table would only be used for replication origin names, which
+ * we avoid with a reasonable length restriction. If you add a TOAST
+ * table to this catalog, be sure to set up a snapshot everywhere it might
+ * be needed. For more information, see
+ * https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan.
+ */
+ Assert(!OidIsValid(rel->rd_rel->reltoastrelid));
+
for (roident = InvalidOid + 1; roident < PG_UINT16_MAX; roident++)
{
bool nulls[Natts_pg_replication_origin];
@@ -1296,6 +1308,18 @@ pg_replication_origin_create(PG_FUNCTION_ARGS)
elog(WARNING, "replication origins created by regression test cases should have names starting with \"regress_\"");
#endif
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(name) > MAX_RONAME_LEN)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("replication origin name is too long"),
+ errdetail("Replication origin names must be no longer than %d bytes.",
+ MAX_RONAME_LEN)));
+
roident = replorigin_create(name);
pfree(name);
diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h
index deb43065fe9..7ade8bbda39 100644
--- a/src/include/catalog/pg_replication_origin.h
+++ b/src/include/catalog/pg_replication_origin.h
@@ -54,8 +54,6 @@ CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELAT
typedef FormData_pg_replication_origin *Form_pg_replication_origin;
-DECLARE_TOAST_WITH_MACRO(pg_replication_origin, 4181, 4182, PgReplicationOriginToastTable, PgReplicationOriginToastIndex);
-
DECLARE_UNIQUE_INDEX_PKEY(pg_replication_origin_roiident_index, 6001, ReplicationOriginIdentIndex, pg_replication_origin, btree(roident oid_ops));
DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, ReplicationOriginNameIndex, pg_replication_origin, btree(roname text_ops));
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index 9cb2248fa9f..29724c0dcae 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -33,6 +33,13 @@ typedef struct xl_replorigin_drop
#define InvalidRepOriginId 0
#define DoNotReplicateId PG_UINT16_MAX
+/*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes. This should be more than enough for
+ * all practical use.
+ */
+#define MAX_RONAME_LEN 512
+
extern PGDLLIMPORT RepOriginId replorigin_session_origin;
extern PGDLLIMPORT XLogRecPtr replorigin_session_origin_lsn;
extern PGDLLIMPORT TimestampTz replorigin_session_origin_timestamp;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d3f5d16a672..3de32c3c7ef 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -914,3 +914,7 @@ SELECT test_relpath();
(1 row)
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
+ERROR: replication origin name is too long
+DETAIL: Replication origin names must be no longer than 512 bytes.
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index b032a3f4761..21297fc6164 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -42,6 +42,11 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since we want to be able to access that catalog
+-- without setting up a snapshot. To make that safe, it needs to not have a
+-- toast table, since toasted data cannot be fetched without a snapshot. As of
+-- this writing, its toast table would only be used for replication origin
+-- names, which we can avoid with a reasonable length restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
WHERE c.oid < 16384 AND
@@ -61,7 +66,8 @@ ORDER BY 1, 2;
pg_class | relpartbound | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(10 rows)
+ pg_replication_origin | roname | text
+(11 rows)
-- system catalogs without primary keys
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index aaebb298330..ea72a62dab7 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -411,3 +411,6 @@ CREATE FUNCTION test_relpath()
AS :'regresslib'
LANGUAGE C;
SELECT test_relpath();
+
+-- pg_replication_origin.roname length restriction
+SELECT pg_replication_origin_create('regress_' || repeat('a', 505));
diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql
index 135793871b4..8ff898843e9 100644
--- a/src/test/regress/sql/misc_sanity.sql
+++ b/src/test/regress/sql/misc_sanity.sql
@@ -45,6 +45,11 @@ WHERE refclassid = 0 OR refobjid = 0 OR
-- as user data by pg_upgrade, which would cause failures.
-- 3. pg_authid, since its toast table cannot be accessed when it would be
-- needed, i.e., during authentication before we've selected a database.
+-- 4. pg_replication_origin, since we want to be able to access that catalog
+-- without setting up a snapshot. To make that safe, it needs to not have a
+-- toast table, since toasted data cannot be fetched without a snapshot. As of
+-- this writing, its toast table would only be used for replication origin
+-- names, which we can avoid with a reasonable length restriction.
SELECT relname, attname, atttypid::regtype
FROM pg_class c JOIN pg_attribute a ON c.oid = attrelid
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks. I'll stash this away for July unless someone wants to argue that
it's fair game for v18. IMHO this isn't nearly urgent enough for that, and
the bugs will continue to exist on older major versions regardless.
I'm inclined to argue that it's a bug fix and therefore still in-scope
for v18. The fact that we can't back-patch such a change is all the
more reason to not let it slide another year. But probably the RMT
should make the call.
regards, tom lane
On Tue, Apr 29, 2025 at 02:01:54PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks. I'll stash this away for July unless someone wants to argue that
it's fair game for v18. IMHO this isn't nearly urgent enough for that, and
the bugs will continue to exist on older major versions regardless.I'm inclined to argue that it's a bug fix and therefore still in-scope
for v18. The fact that we can't back-patch such a change is all the
more reason to not let it slide another year. But probably the RMT
should make the call.
Tomas, Heikki: Thoughts on removing pg_replication_origin's TOAST table
post-feature-freeze? The proposed commit message explains what's going on:
A few places that access this catalog do not set up an active
snapshot before potentially accessing its TOAST table, which is
unsafe. However, roname (the replication origin name) is the only
varlena column, so this is only ever a problem if the name requires
out-of-line storage, which seems unlikely. This commit removes its
TOAST table to avoid needing to set up a snapshot, and it
establishes a length restriction of 512 bytes for replication
origin names. Even without this restriction, names that would
require out-of-line storage would fail with a "row is too big"
error; the additional length restriction just provides a more
specific error message.
--
nathan
On Tue, Apr 29, 2025 at 02:01:54PM -0400, Tom Lane wrote:
I'm inclined to argue that it's a bug fix and therefore still in-scope
for v18. The fact that we can't back-patch such a change is all the
more reason to not let it slide another year.
Not on the RMT. I have looked at the patch, and I would agree with
doing that now in v18 rather than wait for v19 to open up.
But probably the RMT should make the call.
Yes.
This has been mentioned upthread, but I am questioning the wisdom of
putting the restriction based on MAX_RONAME_LEN only in
pg_replication_origin_create(), while replorigin_create() is the one
that actually matters. While it is true that these restrictions are
enforced for the current callers in core, it could also be called by
stuff outside of core. It seems important to me to let these
potential callers know about the restriction in place.
--
Michael
On Wed, Apr 30, 2025 at 09:57:46AM +0900, Michael Paquier wrote:
On Tue, Apr 29, 2025 at 02:01:54PM -0400, Tom Lane wrote:
I'm inclined to argue that it's a bug fix and therefore still in-scope
for v18. The fact that we can't back-patch such a change is all the
more reason to not let it slide another year.Not on the RMT. I have looked at the patch, and I would agree with
doing that now in v18 rather than wait for v19 to open up.But probably the RMT should make the call.
Yes.
I brought this up with the RMT, and everyone seemed okay with committing it
for v18.
This has been mentioned upthread, but I am questioning the wisdom of
putting the restriction based on MAX_RONAME_LEN only in
pg_replication_origin_create(), while replorigin_create() is the one
that actually matters. While it is true that these restrictions are
enforced for the current callers in core, it could also be called by
stuff outside of core. It seems important to me to let these
potential callers know about the restriction in place.
I can move it back to replorigin_create(). I don't have a strong opinion
here.
--
nathan
On Tue, May 06, 2025 at 11:41:49AM -0500, Nathan Bossart wrote:
I brought this up with the RMT, and everyone seemed okay with committing it
for v18.
Cool, thanks for the update.
I can move it back to replorigin_create(). I don't have a strong opinion
here.
I think that I would the check there, as that's safer in the long-run
to enforce the rule to all potential callers of this API. If the
votes balance in favor of keeping it in the SQL function, that's OK by
me as well, so feel free to ignore me if you feel strongly about it
overall.
--
Michael
On Wed, May 07, 2025 at 08:00:39AM +0900, Michael Paquier wrote:
On Tue, May 06, 2025 at 11:41:49AM -0500, Nathan Bossart wrote:
I brought this up with the RMT, and everyone seemed okay with committing it
for v18.Cool, thanks for the update.
I can move it back to replorigin_create(). I don't have a strong opinion
here.I think that I would the check there, as that's safer in the long-run
to enforce the rule to all potential callers of this API. If the
votes balance in favor of keeping it in the SQL function, that's OK by
me as well, so feel free to ignore me if you feel strongly about it
overall.
Committed with that change. That takes care of a good chunk of these TOAST
snapshot problems. I think there are about 4 others left, unless something
has changed since I last checked. I hope to look into those again soon.
--
nathan
On Wed, May 07, 2025 at 02:55:49PM -0500, Nathan Bossart wrote:
Committed with that change. That takes care of a good chunk of these TOAST
snapshot problems. I think there are about 4 others left, unless something
has changed since I last checked. I hope to look into those again soon.
Thanks, for both things.
--
Michael
On Thu, May 08, 2025 at 08:55:08AM +0900, Michael Paquier wrote:
On Wed, May 07, 2025 at 02:55:49PM -0500, Nathan Bossart wrote:
Committed with that change. That takes care of a good chunk of these TOAST
snapshot problems. I think there are about 4 others left, unless something
has changed since I last checked. I hope to look into those again soon.Thanks, for both things.
Here is what I have staged for commit for the others. I'm hoping to push
these in the next couple of days.
--
nathan
Attachments:
v6-0001-Ensure-we-have-a-snapshot-when-updating-va.patch.mastertext/plain; charset=us-asciiDownload
From ac77516715c7d05e94f585a08253e43c64a91382 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 27 May 2025 14:58:37 -0500
Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system
catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, it didn't seem worth the trouble of
significantly modifying the patch on the back-branches.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not backpatched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
---
src/backend/access/heap/heapam.c | 29 ++++++++++++++++++++++++
src/backend/commands/indexcmds.c | 2 +-
src/backend/commands/tablecmds.c | 8 +++++++
src/backend/postmaster/autovacuum.c | 7 ++++++
src/backend/replication/logical/worker.c | 24 ++++++++++++++++++++
5 files changed, 69 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 9ec8cda1c68..2be7f817c78 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -213,6 +213,27 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+#ifdef USE_ASSERT_CHECKING
+
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(rel->rd_rel->reltoastrelid))
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+
+#endif /* USE_ASSERT_CHECKING */
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -2066,6 +2087,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2343,6 +2366,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
Assert(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2765,6 +2790,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3260,6 +3287,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index d962fe392cd..c3ec2076a52 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4226,7 +4226,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
false);
/*
- * Updating pg_index might involve TOAST table access, so ensure we
+ * Swapping the indexes might involve TOAST table access, so ensure we
* have a valid snapshot.
*/
PushActiveSnapshot(GetTransactionSnapshot());
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 54ad38247aa..acf11e83c04 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -20964,9 +20964,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 981be42e3af..451fb90a610 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2234,6 +2234,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2246,6 +2252,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 4151a4b2a96..a23262957ac 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4626,8 +4626,16 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4843,7 +4851,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4947,6 +4963,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -5005,6 +5027,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v6-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v13text/plain; charset=us-asciiDownload
From 04177a63cffeecb0004f84575df5935f2d1ab3ee Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 27 May 2025 14:58:37 -0500
Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system
catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, it didn't seem worth the trouble of
significantly modifying the patch on the back-branches.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not backpatched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
---
src/backend/commands/indexcmds.c | 8 ++++++++
src/backend/postmaster/autovacuum.c | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e0295d7b8ae..f7c09a63242 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3519,12 +3519,20 @@ ReindexRelationConcurrently(Oid relationOid, int options)
get_rel_namespace(heapId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newIndexId, oldIndexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 59fcc8b93f0..23cca675f00 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2294,6 +2294,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2306,6 +2312,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
--
2.39.5 (Apple Git-154)
v6-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v14text/plain; charset=us-asciiDownload
From 56dd3903c2503a4ae5727107489b2c5082e62ef4 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 27 May 2025 14:58:37 -0500
Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system
catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, it didn't seem worth the trouble of
significantly modifying the patch on the back-branches.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not backpatched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
---
src/backend/commands/indexcmds.c | 8 ++++++++
src/backend/commands/tablecmds.c | 8 ++++++++
src/backend/postmaster/autovacuum.c | 7 +++++++
3 files changed, 23 insertions(+)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index c2904cdb1a7..b658642ee93 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4005,12 +4005,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params)
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b411b87b91c..ba3cc478016 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18220,9 +18220,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e70f32d0ab4..b4e529efd82 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2321,6 +2321,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2333,6 +2339,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
--
2.39.5 (Apple Git-154)
v6-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v15text/plain; charset=us-asciiDownload
From 405825223e8a2d896e0f8cda5a5fe0c978a8c711 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 27 May 2025 14:58:37 -0500
Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system
catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, it didn't seem worth the trouble of
significantly modifying the patch on the back-branches.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not backpatched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
---
src/backend/access/heap/heapam.c | 41 ++++++++++++++++++++++++
src/backend/commands/indexcmds.c | 8 +++++
src/backend/commands/tablecmds.c | 8 +++++
src/backend/postmaster/autovacuum.c | 7 ++++
src/backend/replication/logical/worker.c | 24 ++++++++++++++
5 files changed, 88 insertions(+)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 8aaddcbb30b..ba737c5b2ee 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -54,6 +54,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_database_d.h"
+#include "catalog/pg_replication_origin.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "port/atomics.h"
@@ -227,6 +228,38 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+#ifdef USE_ASSERT_CHECKING
+
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(rel->rd_rel->reltoastrelid))
+ return;
+
+ /*
+ * Commit 16bf24e fixed accesses to pg_replication_origin without a
+ * an active snapshot by removing its TOAST table. On older branches,
+ * these bugs are left in place. Its only varlena column is roname (the
+ * replication origin name), so this is only a problem if the name
+ * requires out-of-line storage, which seems unlikely. In any case,
+ * fixing it doesn't seem worth extra code churn on the back-branches.
+ */
+ if (RelationGetRelid(rel) == ReplicationOriginRelationId)
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+
+#endif /* USE_ASSERT_CHECKING */
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -2047,6 +2080,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2294,6 +2329,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
AssertArg(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2697,6 +2734,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3188,6 +3227,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 77aa99496fa..1e3af0093b4 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3993,12 +3993,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params)
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ab3ab794083..4f7be0247c2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18810,9 +18810,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c511a132263..94c93c6e4b5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2315,6 +2315,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2327,6 +2333,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index dcc3fdf6c76..155bf7c44ee 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3794,8 +3794,16 @@ ApplyWorkerMain(Datum main_arg)
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -3848,7 +3856,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Notify the subscription has been disabled and exit */
@@ -3939,6 +3955,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -3997,6 +4019,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v6-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v16text/plain; charset=us-asciiDownload
From c06dfa22c789b177cdc6a33530b08ffd671ea84b Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 27 May 2025 14:58:37 -0500
Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system
catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, it didn't seem worth the trouble of
significantly modifying the patch on the back-branches.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not backpatched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
---
src/backend/access/heap/heapam.c | 41 ++++++++++++++++++++++++
src/backend/commands/indexcmds.c | 8 +++++
src/backend/commands/tablecmds.c | 8 +++++
src/backend/postmaster/autovacuum.c | 7 ++++
src/backend/replication/logical/worker.c | 24 ++++++++++++++
5 files changed, 88 insertions(+)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 429dc2cebe3..6e7e07929d2 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -54,6 +54,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_database_d.h"
+#include "catalog/pg_replication_origin.h"
#include "commands/vacuum.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -231,6 +232,38 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+#ifdef USE_ASSERT_CHECKING
+
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(rel->rd_rel->reltoastrelid))
+ return;
+
+ /*
+ * Commit 16bf24e fixed accesses to pg_replication_origin without a
+ * an active snapshot by removing its TOAST table. On older branches,
+ * these bugs are left in place. Its only varlena column is roname (the
+ * replication origin name), so this is only a problem if the name
+ * requires out-of-line storage, which seems unlikely. In any case,
+ * fixing it doesn't seem worth extra code churn on the back-branches.
+ */
+ if (RelationGetRelid(rel) == ReplicationOriginRelationId)
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+
+#endif /* USE_ASSERT_CHECKING */
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -1858,6 +1891,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2135,6 +2170,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
Assert(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2557,6 +2594,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3052,6 +3091,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index ea3967b7c28..cf6ef3a955d 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4064,12 +4064,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params)
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 633a9c8317c..cfb64768303 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18789,9 +18789,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4a5df488860..256c25af3b1 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2372,6 +2372,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2384,6 +2390,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 363085b46a1..4df0a594dc9 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4725,8 +4725,16 @@ ApplyWorkerMain(Datum main_arg)
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4779,7 +4787,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4883,6 +4899,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4941,6 +4963,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
v6-0001-Ensure-we-have-a-snapshot-when-updating-vario.patch.v17text/plain; charset=us-asciiDownload
From f7bff0b37a00c34167898f31dd2ef0405b8af1da Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 27 May 2025 14:58:37 -0500
Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system
catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, it didn't seem worth the trouble of
significantly modifying the patch on the back-branches.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not backpatched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
---
src/backend/access/heap/heapam.c | 41 ++++++++++++++++++++++++
src/backend/commands/indexcmds.c | 8 +++++
src/backend/commands/tablecmds.c | 8 +++++
src/backend/postmaster/autovacuum.c | 7 ++++
src/backend/replication/logical/worker.c | 24 ++++++++++++++
5 files changed, 88 insertions(+)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 95e3be524a7..ccf151f548b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -53,6 +53,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_database_d.h"
+#include "catalog/pg_replication_origin.h"
#include "commands/vacuum.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -230,6 +231,38 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
#define TUPLOCK_from_mxstatus(status) \
(MultiXactStatusLock[(status)])
+/*
+ * Check that we have a valid snapshot if we might need TOAST access.
+ */
+static inline void
+AssertHasSnapshotForToast(Relation rel)
+{
+#ifdef USE_ASSERT_CHECKING
+
+ /* bootstrap mode in particular breaks this rule */
+ if (!IsNormalProcessingMode())
+ return;
+
+ /* if the relation doesn't have a TOAST table, we are good */
+ if (!OidIsValid(rel->rd_rel->reltoastrelid))
+ return;
+
+ /*
+ * Commit 16bf24e fixed accesses to pg_replication_origin without a
+ * an active snapshot by removing its TOAST table. On older branches,
+ * these bugs are left in place. Its only varlena column is roname (the
+ * replication origin name), so this is only a problem if the name
+ * requires out-of-line storage, which seems unlikely. In any case,
+ * fixing it doesn't seem worth extra code churn on the back-branches.
+ */
+ if (RelationGetRelid(rel) == ReplicationOriginRelationId)
+ return;
+
+ Assert(HaveRegisteredOrActiveSnapshot());
+
+#endif /* USE_ASSERT_CHECKING */
+}
+
/* ----------------------------------------------------------------
* heap support routines
* ----------------------------------------------------------------
@@ -2015,6 +2048,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
Assert(HeapTupleHeaderGetNatts(tup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Fill in tuple header fields and toast the tuple if necessary.
*
@@ -2292,6 +2327,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* currently not needed (thus unsupported) for heap_multi_insert() */
Assert(!(options & HEAP_INSERT_NO_LOGICAL));
+ AssertHasSnapshotForToast(relation);
+
needwal = RelationNeedsWAL(relation);
saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
HEAP_DEFAULT_FILLFACTOR);
@@ -2714,6 +2751,8 @@ heap_delete(Relation relation, ItemPointer tid,
Assert(ItemPointerIsValid(tid));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
@@ -3209,6 +3248,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(HeapTupleHeaderGetNatts(newtup->t_data) <=
RelationGetNumberOfAttributes(relation));
+ AssertHasSnapshotForToast(relation);
+
/*
* Forbid this during a parallel operation, lest it allocate a combo CID.
* Other workers might need that combo CID for visibility checks, and we
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f7fe1fae91c..08a056bc0f8 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4094,12 +4094,20 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
get_rel_namespace(oldidx->tableId),
false);
+ /*
+ * Swapping the indexes might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Swap old index with the new one. This also marks the new one as
* valid and the old one as not valid.
*/
index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName);
+ PopActiveSnapshot();
+
/*
* Invalidate the relcache for the table, so that after this commit
* all sessions will refresh any cached plans that might reference the
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e6606e5e57a..a4e0fd06781 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19268,9 +19268,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
+ /*
+ * Detaching the partition might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
+ PopActiveSnapshot();
+
ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
/* keep our lock until commit */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 0e30a7fda66..ec5699e48e8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2220,6 +2220,12 @@ do_autovacuum(void)
get_namespace_name(classForm->relnamespace),
NameStr(classForm->relname))));
+ /*
+ * Deletion might involve TOAST table access, so ensure we have a
+ * valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
object.classId = RelationRelationId;
object.objectId = relid;
object.objectSubId = 0;
@@ -2232,6 +2238,7 @@ do_autovacuum(void)
* To commit the deletion, end current transaction and start a new
* one. Note this also releases the locks we took.
*/
+ PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index db09978697f..1bff6c92dda 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4550,8 +4550,16 @@ run_apply_worker()
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so
+ * ensure we have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED);
MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+ PopActiveSnapshot();
CommitTransactionCommand();
}
else
@@ -4767,7 +4775,15 @@ DisableSubscriptionAndExit(void)
/* Disable the subscription */
StartTransactionCommand();
+
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
DisableSubscription(MySubscription->oid);
+ PopActiveSnapshot();
CommitTransactionCommand();
/* Ensure we remove no-longer-useful entry for worker's start time */
@@ -4871,6 +4887,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
started_tx = true;
}
+ /*
+ * Updating pg_subscription might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
/*
* Protect subskiplsn of pg_subscription from being concurrently updated
* while clearing it.
@@ -4929,6 +4951,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
heap_freetuple(tup);
table_close(rel, NoLock);
+ PopActiveSnapshot();
+
if (started_tx)
CommitTransactionCommand();
}
--
2.39.5 (Apple Git-154)
On Tue, May 27, 2025 at 03:02:52PM -0500, Nathan Bossart wrote:
Here is what I have staged for commit for the others. I'm hoping to push
these in the next couple of days.
Thanks for the refreshed versions. Looks sensible to me overall.
+static inline void
+AssertHasSnapshotForToast(Relation rel)
[...]
+ /*
+ * Commit 16bf24e fixed accesses to pg_replication_origin without a
+ * an active snapshot by removing its TOAST table. On older branches,
+ * these bugs are left in place. Its only varlena column is roname (the
+ * replication origin name), so this is only a problem if the name
+ * requires out-of-line storage, which seems unlikely. In any case,
+ * fixing it doesn't seem worth extra code churn on the back-branches.
+ */
+ if (RelationGetRelid(rel) == ReplicationOriginRelationId)
+ return;
As of the back-branches but not HEAD, this shortcut makes sense.
--
Michael