v13: CLUSTER segv with wal_level=minimal and parallel index creation
Following a bulk load, a CLUSTER command run by a maintenance script crashed.
This is currently reproducible on that instance, so let me know if I can
provide more info.
< 2020-09-06 15:44:16.369 MDT >LOG: background worker "parallel worker" (PID 2576) was terminated by signal 6: Aborted
< 2020-09-06 15:44:16.369 MDT >DETAIL: Failed process was running: CLUSTER pg_attribute USING pg_attribute_relid_attnam_index
The crash happens during:
ts=# REINDEX INDEX pg_attribute_relid_attnum_index;
..but not:
ts=# REINDEX INDEX pg_attribute_relid_attnam_index ;
pg_catalog | pg_attribute_relid_attnam_index | index | postgres | pg_attribute | permanent | 31 MB |
pg_catalog | pg_attribute_relid_attnum_index | index | postgres | pg_attribute | permanent | 35 MB |
I suspect
|commit c6b92041d Skip WAL for new relfilenodes, under wal_level=minimal.
In fact, I set wal_level=minimal for the bulk load. Note also:
override | data_checksums | on
configuration file | checkpoint_timeout | 60
configuration file | maintenance_work_mem | 1048576
configuration file | max_wal_senders | 0
configuration file | wal_compression | on
configuration file | wal_level | minimal
configuration file | fsync | off
configuration file | full_page_writes | off
default | server_version | 13beta3
(gdb) bt
#0 0x00007ff9999ad387 in raise () from /lib64/libc.so.6
#1 0x00007ff9999aea78 in abort () from /lib64/libc.so.6
#2 0x0000000000921da5 in ExceptionalCondition (conditionName=conditionName@entry=0xad4078 "relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node)", errorType=errorType@entry=0x977f49 "FailedAssertion",
fileName=fileName@entry=0xad3068 "relcache.c", lineNumber=lineNumber@entry=2976) at assert.c:67
#3 0x000000000091a08b in AssertPendingSyncConsistency (relation=0x7ff99c2a70b8) at relcache.c:2976
#4 AssertPendingSyncs_RelationCache () at relcache.c:3036
#5 0x000000000058e591 in smgrDoPendingSyncs (isCommit=isCommit@entry=true, isParallelWorker=isParallelWorker@entry=true) at storage.c:685
#6 0x000000000053b1a4 in CommitTransaction () at xact.c:2118
#7 0x000000000053b826 in EndParallelWorkerTransaction () at xact.c:5300
#8 0x000000000052fcf7 in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1479
#9 0x000000000076047a in StartBackgroundWorker () at bgworker.c:813
#10 0x000000000076d88d in do_start_bgworker (rw=0x23ac110) at postmaster.c:5865
#11 maybe_start_bgworkers () at postmaster.c:6091
#12 0x000000000076e43e in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5260
#13 <signal handler called>
#14 0x00007ff999a6c983 in __select_nocancel () from /lib64/libc.so.6
#15 0x00000000004887bc in ServerLoop () at postmaster.c:1691
#16 0x000000000076fb45 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x237d280) at postmaster.c:1400
#17 0x000000000048a83d in main (argc=3, argv=0x237d280) at main.c:210
(gdb) bt f
...
#4 AssertPendingSyncs_RelationCache () at relcache.c:3036
status = {hashp = 0x23cba50, curBucket = 449, curEntry = 0x0}
locallock = <optimized out>
rels = 0x23ff018
maxrels = <optimized out>
nrels = 0
idhentry = <optimized out>
i = <optimized out>
#5 0x000000000058e591 in smgrDoPendingSyncs (isCommit=isCommit@entry=true, isParallelWorker=isParallelWorker@entry=true) at storage.c:685
pending = <optimized out>
nrels = 0
maxrels = 0
srels = 0x0
scan = {hashp = 0x23edf60, curBucket = 9633000, curEntry = 0xe01600 <TopTransactionStateData>}
pendingsync = <optimized out>
#6 0x000000000053b1a4 in CommitTransaction () at xact.c:2118
s = 0xe01600 <TopTransactionStateData>
latestXid = <optimized out>
is_parallel_worker = true
__func__ = "CommitTransaction"
This is easily reproduced, at least on pg_attribute
[pryzbyj@localhost ~]$ /usr/pgsql-13/bin/initdb -D pgsql13.dat
[pryzbyj@localhost ~]$ /usr/pgsql-13/bin/postgres -D pgsql13.dat -c logging_collector=off -c port=5678 -c unix_socket_directories=/tmp -c wal_level=minimal -c max_wal_senders=0&
[pryzbyj@localhost ~]$ psql -h /tmp -p 5678 postgres -c "SET min_parallel_table_scan_size=0" -c "SET client_min_messages=debug" -c "REINDEX INDEX pg_attribute_relid_attnum_index"
SET
SET
DEBUG: building index "pg_attribute_relid_attnum_index" on table "pg_attribute" with request for 1 parallel worker
TRAP: FailedAssertion("relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node)", File: "relcache.c", Line: 2976)
postgres: parallel worker for PID 9637 (ExceptionalCondition+0x66)[0x921d86]
postgres: parallel worker for PID 9637 (AssertPendingSyncs_RelationCache+0x1db)[0x91a08b]
postgres: parallel worker for PID 9637 (smgrDoPendingSyncs+0x71)[0x58e591]
postgres: parallel worker for PID 9637 [0x53b1a4]
postgres: parallel worker for PID 9637 (EndParallelWorkerTransaction+0x16)[0x53b826]
postgres: parallel worker for PID 9637 (ParallelWorkerMain+0x437)[0x52fcf7]
postgres: parallel worker for PID 9637 (StartBackgroundWorker+0x25a)[0x76047a]
postgres: parallel worker for PID 9637 [0x76d88d]
postgres: parallel worker for PID 9637 [0x76e43e]
/lib64/libpthread.so.0(+0xf630)[0x7f46d4b47630]
/lib64/libc.so.6(__select+0x13)[0x7f46d26ba983]
postgres: parallel worker for PID 9637 [0x4887bc]
postgres: parallel worker for PID 9637 (PostmasterMain+0x1165)[0x76fb45]
postgres: parallel worker for PID 9637 (main+0x70d)[0x48a83d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f46d25e7555]
--
Justin
Justin Pryzby <pryzby@telsasoft.com> writes:
This is easily reproduced, at least on pg_attribute
Yeah, duplicated here. The assertion (not SIGSEGV) is in code
added by c6b92041d, so it seems clearly an open issue for v13.
I did not check to see what happens in a non-assert build.
regards, tom lane
At Mon, 07 Sep 2020 00:33:03 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
Justin Pryzby <pryzby@telsasoft.com> writes:
This is easily reproduced, at least on pg_attribute
Thanks for the repro.
Yeah, duplicated here. The assertion (not SIGSEGV) is in code
added by c6b92041d, so it seems clearly an open issue for v13. I did not
check to see what happens in a non-assert build.
The cause is that the worker received the pending-sync entry correctly,
but never created a relcache entry for the relation using
RelationBuildDesc. So rd_firstRelfilenodeSubid is not set correctly.
I'm investigating it.
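For context, the code in question is roughly the following (as added to
RelationBuildDesc() by c6b92041d); a parallel worker never reaches it for
nailed relations restored from the relcache init file:

    /* restore rd_firstRelfilenodeSubid so RelationNeedsWAL() answers
     * correctly in parallel workers */
    if (IsParallelWorker() && RelFileNodeSkippingWAL(relation->rd_node))
        relation->rd_firstRelfilenodeSubid = TopSubTransactionId;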
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Mon, 07 Sep 2020 13:45:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
The cause is that the worker received the pending-sync entry correctly,
but never created a relcache entry for the relation using
RelationBuildDesc. So rd_firstRelfilenodeSubid is not set correctly.
I'm investigating it.
Relcache entries are loaded from a file with old content at parallel
worker startup. The relcache entry is corrected by invalidation when
taking a lock, but pending syncs are not considered.
Since parallel workers don't access the files, we could just ignore the
assertion safely, but I want to set the rd_firstRelfilenodeSubid flag at
invalidation, as in the attached PoC patch.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
fix_para_reindex_on_mapped_rels_opc.patch (text/x-patch; charset=us-ascii)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 96ecad02dd..d66e914345 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1322,10 +1322,27 @@ RelationInitPhysicalAddr(Relation relation)
}
else
{
+ Oid oldnode = relation->rd_node.relNode;
+
/* Consult the relation mapper */
relation->rd_node.relNode =
RelationMapOidToFilenode(relation->rd_id,
relation->rd_rel->relisshared);
+
+ /*
+ * The leader process might have replaced storage before staring this
+ * worker. We need to set rd_firstRelfilenodeSubid according to pending
+ * sync hash in that case. Currently the information is not actually
+ * used in worker processes, but make things tidy.
+ */
+ if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ {
+ if (RelFileNodeSkippingWAL(relation->rd_node))
+ relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ else
+ relation->rd_firstRelfilenodeSubid = InvalidTransactionId;
+ }
+
if (!OidIsValid(relation->rd_node.relNode))
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
On Mon, Sep 07, 2020 at 05:40:36PM +0900, Kyotaro Horiguchi wrote:
At Mon, 07 Sep 2020 13:45:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
The cause is that the worker received the pending-sync entry correctly,
but never created a relcache entry for the relation using
RelationBuildDesc. So rd_firstRelfilenodeSubid is not set correctly.
I'm investigating it.
Relcache entries are loaded from a file with old content at parallel
worker startup. The relcache entry is corrected by invalidation when
taking a lock, but pending syncs are not considered. Since parallel
workers don't access the files, we could just ignore the assertion
safely, but I want to set the rd_firstRelfilenodeSubid flag at
invalidation, as in the attached PoC patch.
[patch: When RelationInitPhysicalAddr() handles a mapped relation, re-fill
rd_firstRelfilenodeSubid from RelFileNodeSkippingWAL(), like
RelationBuildDesc() would do.]
As a PoC, this looks promising. Thanks. Would you add a test case such that
the following demonstrates the bug in the absence of your PoC?
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check TEMP_CONFIG=/tmp/minimal.conf
Please have the test try both a nailed-and-mapped relation and a "nailed, but
not mapped" relation. I am fairly confident that your PoC fixes the former
case, but the latter may need additional code.
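For reference, the two cases take different paths in
RelationInitPhysicalAddr(); a simplified sketch (eliding the
historic-snapshot handling) is below. Mapped relations store
relfilenode = 0 in pg_class and get their filenode from the relation
mapper instead:

    if (relation->rd_rel->relfilenode)
    {
        /* non-mapped: pg_class carries the relfilenode directly */
        relation->rd_node.relNode = relation->rd_rel->relfilenode;
    }
    else
    {
        /* mapped: consult the relation mapper */
        relation->rd_node.relNode =
            RelationMapOidToFilenode(relation->rd_id,
                                     relation->rd_rel->relisshared);
    }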
At Mon, 7 Sep 2020 02:32:55 -0700, Noah Misch <noah@leadboat.com> wrote in
On Mon, Sep 07, 2020 at 05:40:36PM +0900, Kyotaro Horiguchi wrote:
At Mon, 07 Sep 2020 13:45:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
The cause is that the worker received the pending-sync entry correctly,
but never created a relcache entry for the relation using
RelationBuildDesc. So rd_firstRelfilenodeSubid is not set correctly.
I'm investigating it.
Relcache entries are loaded from a file with old content at parallel
worker startup. The relcache entry is corrected by invalidation when
taking a lock, but pending syncs are not considered. Since parallel
workers don't access the files, we could just ignore the assertion
safely, but I want to set the rd_firstRelfilenodeSubid flag at
invalidation, as in the attached PoC patch.
[patch: When RelationInitPhysicalAddr() handles a mapped relation, re-fill
rd_firstRelfilenodeSubid from RelFileNodeSkippingWAL(), like
RelationBuildDesc() would do.]
As a PoC, this looks promising. Thanks. Would you add a test case such that
the following demonstrates the bug in the absence of your PoC?
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check TEMP_CONFIG=/tmp/minimal.conf
Mmm. I was close to adding some tests to 018_wal_optimize.pl, but your
suggestion seems better. I added several lines to create_index.sql.
Please have the test try both a nailed-and-mapped relation and a "nailed, but
not mapped" relation. I am fairly confident that your PoC fixes the former
case, but the latter may need additional code.
Mmm. You're right. I chose pg_amproc_fam_proc_index as the
nailed-but-not-mapped index.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Fix-assertion-failure-during-reindex-while-wal_level.patch (text/x-patch; charset=us-ascii)
From c4d53acd64011f1ed5652177923d010f4c50fa89 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Mon, 7 Sep 2020 20:30:30 +0900
Subject: [PATCH] Fix assertion failure during reindex while wal_level=minimal.
When wal_level=minimal, parallel workers fail to set
rd_firstRelfilenodeSubid correctly for nailed relations. Fix it.
---
src/backend/utils/cache/relcache.c | 16 ++++++++++++++++
src/test/regress/expected/create_index.out | 5 +++++
src/test/regress/sql/create_index.sql | 6 ++++++
3 files changed, 27 insertions(+)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 96ecad02dd..cd6c8bbc70 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1273,6 +1273,8 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
static void
RelationInitPhysicalAddr(Relation relation)
{
+ Oid oldnode = relation->rd_node.relNode;
+
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
@@ -1330,6 +1332,20 @@ RelationInitPhysicalAddr(Relation relation)
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
+
+ /*
+ * Our leader process might have replaced storage before staring this
+ * worker. We need to set rd_firstRelfilenodeSubid according to pending
+ * sync hash in that case. Currently the information is not actually used
+ * in worker processes, but make things tidy.
+ */
+ if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ {
+ if (RelFileNodeSkippingWAL(relation->rd_node))
+ relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ else
+ relation->rd_firstRelfilenodeSubid = InvalidTransactionId;
+ }
}
/*
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 814416d936..6468418639 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2598,6 +2598,11 @@ REINDEX TABLE pg_toast.pg_toast_1260;
ERROR: must be owner of table pg_toast_1260
REINDEX INDEX pg_toast.pg_toast_1260_index;
ERROR: must be owner of index pg_toast_1260_index
+-- parallel reindex
+RESET ROLE;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_attribute_relid_attnum_index; -- nailed and mapped
+REINDEX INDEX pg_amproc_fam_proc_index; -- nailed but not mapped
-- Clean up
RESET ROLE;
REVOKE USAGE ON SCHEMA pg_toast FROM regress_reindexuser;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index f3667bacdc..d9cb3f14f6 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1115,6 +1115,12 @@ SET SESSION ROLE regress_reindexuser;
REINDEX TABLE pg_toast.pg_toast_1260;
REINDEX INDEX pg_toast.pg_toast_1260_index;
+-- parallel reindex
+RESET ROLE;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_attribute_relid_attnum_index; -- nailed and mapped
+REINDEX INDEX pg_amproc_fam_proc_index; -- nailed but not mapped
+
-- Clean up
RESET ROLE;
REVOKE USAGE ON SCHEMA pg_toast FROM regress_reindexuser;
--
2.18.4
At Tue, 08 Sep 2020 09:13:53 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Mon, 7 Sep 2020 02:32:55 -0700, Noah Misch <noah@leadboat.com> wrote in
As a PoC, this looks promising. Thanks. Would you add a test case such that
the following demonstrates the bug in the absence of your PoC?
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check TEMP_CONFIG=/tmp/minimal.conf
Mmm. I was close to adding some tests to 018_wal_optimize.pl, but your
suggestion seems better. I added several lines to create_index.sql.
Please have the test try both a nailed-and-mapped relation and a "nailed, but
not mapped" relation. I am fairly confident that your PoC fixes the former
case, but the latter may need additional code.
Mmm. You're right. I chose pg_amproc_fam_proc_index as the
nailed-but-not-mapped index.
I fixed a typo (s/staring/starting/).
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v2-0001-Fix-assertion-failure-during-reindex-while-wal_le.patch (text/x-patch; charset=us-ascii)
From 44512f8672395dc40a001eb92b7ffca060eb411d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Mon, 7 Sep 2020 20:30:30 +0900
Subject: [PATCH v2] Fix assertion failure during reindex while
wal_level=minimal.
When wal_level=minimal, parallel workers fail to set
rd_firstRelfilenodeSubid correctly for nailed relations. Fix it.
---
src/backend/utils/cache/relcache.c | 16 ++++++++++++++++
src/test/regress/expected/create_index.out | 5 +++++
src/test/regress/sql/create_index.sql | 6 ++++++
3 files changed, 27 insertions(+)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 96ecad02dd..8a61fd674e 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1273,6 +1273,8 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
static void
RelationInitPhysicalAddr(Relation relation)
{
+ Oid oldnode = relation->rd_node.relNode;
+
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
@@ -1330,6 +1332,20 @@ RelationInitPhysicalAddr(Relation relation)
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
+
+ /*
+ * Our leader process might have replaced storage before starting this
+ * worker. We need to set rd_firstRelfilenodeSubid according to pending
+ * sync hash in that case. Currently the information is not actually used
+ * in worker processes, but make things tidy.
+ */
+ if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ {
+ if (RelFileNodeSkippingWAL(relation->rd_node))
+ relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ else
+ relation->rd_firstRelfilenodeSubid = InvalidTransactionId;
+ }
}
/*
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 64c0c66859..66ec1465e1 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2732,6 +2732,11 @@ REINDEX TABLE pg_toast.pg_toast_1260;
ERROR: must be owner of table pg_toast_1260
REINDEX INDEX pg_toast.pg_toast_1260_index;
ERROR: must be owner of index pg_toast_1260_index
+-- parallel reindex
+RESET ROLE;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_attribute_relid_attnum_index; -- nailed and mapped
+REINDEX INDEX pg_amproc_fam_proc_index; -- nailed but not mapped
-- Clean up
RESET ROLE;
REVOKE USAGE ON SCHEMA pg_toast FROM regress_reindexuser;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 37f7259da9..6b104a2389 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1191,6 +1191,12 @@ SET SESSION ROLE regress_reindexuser;
REINDEX TABLE pg_toast.pg_toast_1260;
REINDEX INDEX pg_toast.pg_toast_1260_index;
+-- parallel reindex
+RESET ROLE;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_attribute_relid_attnum_index; -- nailed and mapped
+REINDEX INDEX pg_amproc_fam_proc_index; -- nailed but not mapped
+
-- Clean up
RESET ROLE;
REVOKE USAGE ON SCHEMA pg_toast FROM regress_reindexuser;
--
2.18.4
On Tue, Sep 08, 2020 at 10:43:32AM +0900, Kyotaro Horiguchi wrote:
At Tue, 08 Sep 2020 09:13:53 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Mon, 7 Sep 2020 02:32:55 -0700, Noah Misch <noah@leadboat.com> wrote in
As a PoC, this looks promising. Thanks. Would you add a test case such that
the following demonstrates the bug in the absence of your PoC?
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check TEMP_CONFIG=/tmp/minimal.conf
Mmm. I was close to adding some tests to 018_wal_optimize.pl, but your
suggestion seems better. I added several lines to create_index.sql.
Please have the test try both a nailed-and-mapped relation and a "nailed, but
not mapped" relation. I am fairly confident that your PoC fixes the former
case, but the latter may need additional code.
Mmm. You're right. I chose pg_amproc_fam_proc_index as the
nailed-but-not-mapped index.
I fixed a typo (s/staring/starting/).
At a glance, this looks reasonable. If a closer look doesn't reveal problems,
I'll push this.
On Mon, Sep 07, 2020 at 07:47:09PM -0700, Noah Misch wrote:
On Tue, Sep 08, 2020 at 10:43:32AM +0900, Kyotaro Horiguchi wrote:
At Tue, 08 Sep 2020 09:13:53 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Mon, 7 Sep 2020 02:32:55 -0700, Noah Misch <noah@leadboat.com> wrote in
As a PoC, this looks promising. Thanks. Would you add a test case such that
the following demonstrates the bug in the absence of your PoC?
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check TEMP_CONFIG=/tmp/minimal.conf
Mmm. I was close to adding some tests to 018_wal_optimize.pl, but your
suggestion seems better. I added several lines to create_index.sql.
After looking closer, I've moved the test to reindex_catalog.sql; see that
file's header comment for the reasons. One now needs this command:
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check-tests TESTS=reindex_catalog TEMP_CONFIG=/tmp/minimal.conf
Please have the test try both a nailed-and-mapped relation and a "nailed, but
not mapped" relation. I am fairly confident that your PoC fixes the former
case, but the latter may need additional code.
Mmm. You're right. I chose pg_amproc_fam_proc_index as the
nailed-but-not-mapped index.
I fixed a typo (s/staring/starting/).
At a glance, this looks reasonable. If a closer look doesn't reveal problems,
I'll push this.
RelationBuildDesc() calls RelationInitPhysicalAddr(), so RelationBuildDesc()
can stop calling RelFileNodeSkippingWAL(). The attached version makes it so,
and I plan to push it.
Attachments:
skip-wal-nailed-v3.patch (text/plain; charset=us-ascii)
Author: Noah Misch <noah@leadboat.com>
Commit: Noah Misch <noah@leadboat.com>
Fix rd_firstRelfilenodeSubid for nailed relations, in parallel workers.
Move applicable code out of RelationBuildDesc(), which nailed relations
bypass. Non-assert builds experienced no known problems. Back-patch to
v13, where commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 introduced
rd_firstRelfilenodeSubid.
Kyotaro Horiguchi. Reported by Justin Pryzby.
Discussion: https://postgr.es/m/20200907023737.GA7158@telsasoft.com
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 96ecad0..9061af8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1243,14 +1243,6 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
if (insertIt)
RelationCacheInsert(relation, true);
- /*
- * For RelationNeedsWAL() to answer correctly on parallel workers, restore
- * rd_firstRelfilenodeSubid. No subtransactions start or end while in
- * parallel mode, so the specific SubTransactionId does not matter.
- */
- if (IsParallelWorker() && RelFileNodeSkippingWAL(relation->rd_node))
- relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
-
/* It's fully valid */
relation->rd_isvalid = true;
@@ -1273,6 +1265,8 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
static void
RelationInitPhysicalAddr(Relation relation)
{
+ Oid oldnode = relation->rd_node.relNode;
+
/* these relations kinds never have storage */
if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
return;
@@ -1330,6 +1324,19 @@ RelationInitPhysicalAddr(Relation relation)
elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
RelationGetRelationName(relation), relation->rd_id);
}
+
+ /*
+ * For RelationNeedsWAL() to answer correctly on parallel workers, restore
+ * rd_firstRelfilenodeSubid. No subtransactions start or end while in
+ * parallel mode, so the specific SubTransactionId does not matter.
+ */
+ if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+ {
+ if (RelFileNodeSkippingWAL(relation->rd_node))
+ relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+ else
+ relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+ }
}
/*
diff --git a/src/test/regress/expected/reindex_catalog.out b/src/test/regress/expected/reindex_catalog.out
index 4b5fba4..204f056 100644
--- a/src/test/regress/expected/reindex_catalog.out
+++ b/src/test/regress/expected/reindex_catalog.out
@@ -36,3 +36,13 @@ REINDEX INDEX pg_index_indexrelid_index; -- non-mapped, non-shared, critical
REINDEX INDEX pg_index_indrelid_index; -- non-mapped, non-shared, non-critical
REINDEX INDEX pg_database_oid_index; -- mapped, shared, critical
REINDEX INDEX pg_shdescription_o_c_index; -- mapped, shared, non-critical
+-- Check the same REINDEX INDEX statements under parallelism.
+BEGIN;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_class_oid_index; -- mapped, non-shared, critical
+REINDEX INDEX pg_class_relname_nsp_index; -- mapped, non-shared, non-critical
+REINDEX INDEX pg_index_indexrelid_index; -- non-mapped, non-shared, critical
+REINDEX INDEX pg_index_indrelid_index; -- non-mapped, non-shared, non-critical
+REINDEX INDEX pg_database_oid_index; -- mapped, shared, critical
+REINDEX INDEX pg_shdescription_o_c_index; -- mapped, shared, non-critical
+ROLLBACK;
diff --git a/src/test/regress/sql/reindex_catalog.sql b/src/test/regress/sql/reindex_catalog.sql
index 87ecf52..8203641 100644
--- a/src/test/regress/sql/reindex_catalog.sql
+++ b/src/test/regress/sql/reindex_catalog.sql
@@ -39,3 +39,14 @@ REINDEX INDEX pg_index_indexrelid_index; -- non-mapped, non-shared, critical
REINDEX INDEX pg_index_indrelid_index; -- non-mapped, non-shared, non-critical
REINDEX INDEX pg_database_oid_index; -- mapped, shared, critical
REINDEX INDEX pg_shdescription_o_c_index; -- mapped, shared, non-critical
+
+-- Check the same REINDEX INDEX statements under parallelism.
+BEGIN;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_class_oid_index; -- mapped, non-shared, critical
+REINDEX INDEX pg_class_relname_nsp_index; -- mapped, non-shared, non-critical
+REINDEX INDEX pg_index_indexrelid_index; -- non-mapped, non-shared, critical
+REINDEX INDEX pg_index_indrelid_index; -- non-mapped, non-shared, non-critical
+REINDEX INDEX pg_database_oid_index; -- mapped, shared, critical
+REINDEX INDEX pg_shdescription_o_c_index; -- mapped, shared, non-critical
+ROLLBACK;
Thank you for checking.
At Tue, 8 Sep 2020 21:05:43 -0700, Noah Misch <noah@leadboat.com> wrote in
On Mon, Sep 07, 2020 at 07:47:09PM -0700, Noah Misch wrote:
After looking closer, I've moved the test to reindex_catalog.sql; see that
file's header comment for the reasons. One now needs this command:
printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
make check-tests TESTS=reindex_catalog TEMP_CONFIG=/tmp/minimal.conf
I had missed that file. It is surely the best place.
RelationBuildDesc() calls RelationInitPhysicalAddr(), so RelationBuildDesc()
can stop calling RelFileNodeSkippingWAL(). The attached version makes it so,
and I plan to push it.
Sure. relNode is filled with zeros so RelationInitPhysicalAddr() works
correctly for RelationBuildDesc().
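(That is, the descriptor is zero-allocated before
RelationInitPhysicalAddr() runs for the first time; from memory of
relcache.c, roughly:

    /* in AllocateRelationDesc(), used by RelationBuildDesc() */
    relation = (Relation) palloc0(sizeof(RelationData));

so rd_node.relNode starts out as InvalidOid and the new
"oldnode != relation->rd_node.relNode" test also fires on first build.)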
Both changes look good to me.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Wed, 09 Sep 2020 14:40:37 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Sure. relNode is filled with zeros so RelationInitPhysicalAddr() works
correctly for RelationBuildDesc(). Both changes look good to me.
Thank you for committing this!
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center