A assert failure when initdb with track_commit_timestamp=on
Hi,
When working with the commit_ts module, I find the following issue:
After configure with --enable-cassert option, then initdb with:
initdb -D x2 -c track_commit_timestamp=on
Then we can get the following core dump:
0 in raise of /lib/x86_64-linux-gnu/libc.so.6
1 in abort of /lib/x86_64-linux-gnu/libc.so.6
2 in ExceptionalCondition of assert.c:66
3 in TransactionIdSetCommitTs of commit_ts.c:257
4 in SetXidCommitTsInPage of commit_ts.c:236
5 in TransactionTreeSetCommitTsData of commit_ts.c:192
6 in RecordTransactionCommit of xact.c:1468
7 in CommitTransaction of xact.c:2365
8 in CommitTransactionCommandInternal of xact.c:3202
9 in CommitTransactionCommand of xact.c:3163
10 in BootstrapModeMain of bootstrap.c:390
11 in main of main.c:210
The reason are TransactionIdSetCommitTs think the given xid must be
normal
static void
TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
RepOriginId nodeid, int slotno)
{
...
Assert(TransactionIdIsNormal(xid));
}
However this is not true in BootstrapMode, this failure is masked by
default because TransactionTreeSetCommitTsData returns fast when
track_commit_timestamp is off.
void
TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
TransactionId *subxids, TimestampTz timestamp,
RepOriginId nodeid)
{
/*
* No-op if the module is not active.
*
*/
if (!commitTsShared->commitTsActive)
return;
..
}
I can't think out a meaningful reason to record the commit timestamp for a
BootstrapTransactionId or FrozenTransactionId, so I think bypass it in
TransactionTreeSetCommitTsData could be a solution. Another solution is
just removing the Assert in TransactionIdSetCommitTs, it works during
initdb test at least.
I include both fixes in the attachment, I think just one of them could
be adopted however.
--
Best Regards
Andy Fan
Attachments:
v1-0001-Fix-the-Assert-failure-when-initdb-with-track_com.patchtext/x-diffDownload
From b5e921ef42763dbb1126d60313b25ae40f8ec140 Mon Sep 17 00:00:00 2001
From: Andy Fan <zhihuifan1213@163.com>
Date: Tue, 1 Jul 2025 23:50:37 +0000
Subject: [PATCH v1 1/1] Fix the Assert failure when initdb with
track_commit_timestamp=on
the real commit message depends on which solution is adopted.
---
src/backend/access/transam/commit_ts.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 113fae1437a..da8bfb7167c 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -157,6 +157,13 @@ TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
if (!commitTsShared->commitTsActive)
return;
+ /*
+ * There is no point to record a commit_ts for BootStrapCommitTs or
+ * FrozenTransactionId.
+ */
+ if (unlikely(xid == BootstrapTransactionId || xid == FrozenTransactionId))
+ return;
+
/*
* Figure out the latest Xid in this batch: either the last subxid if
* there's any, otherwise the parent xid.
@@ -252,8 +259,6 @@ TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
int entryno = TransactionIdToCTsEntry(xid);
CommitTimestampEntry entry;
- Assert(TransactionIdIsNormal(xid));
-
entry.time = ts;
entry.nodeid = nodeid;
--
2.45.1
On Wed, Jul 02, 2025 at 12:38:01AM +0000, Andy Fan wrote:
However this is not true in BootstrapMode, this failure is masked by
default because TransactionTreeSetCommitTsData returns fast when
track_commit_timestamp is off.
Agreed that there is no point in registering a commit timestamp in
the cases of a frozen and bootstrap XIDs. I would recommend to keep
the assertion in TransactionIdSetCommitTs(), though, that still looks
useful to me for the many callers of this routine, at least as a
sanity check.
I did not check, but usually we apply filters based on
IsBootstrapProcessingMode() for code paths that we do not want to
reach while in bootstrap mode. Could the same be done here?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
Hi,
On Wed, Jul 02, 2025 at 12:38:01AM +0000, Andy Fan wrote:
However this is not true in BootstrapMode, this failure is masked by
default because TransactionTreeSetCommitTsData returns fast when
track_commit_timestamp is off.Agreed that there is no point in registering a commit timestamp in
the cases of a frozen and bootstrap XIDs. I would recommend to keep
the assertion in TransactionIdSetCommitTs(), though, that still looks
useful to me for the many callers of this routine, at least as a
sanity check.
Yes, The assert also guard the InvalidTransactionId. So I removed this
solution in v2. Another reason for this is: if we allowed
BooststrapTransactionId in the commit_ts, it introduces something new to
this module when initdb with track_commit_timestamp=on. This risk might
be very low, but it can be avoided easily with the another solution.
I did not check, but usually we apply filters based on
IsBootstrapProcessingMode() for code paths that we do not want to
reach while in bootstrap mode. Could the same be done here?
I think you are right. so I used IsBootstrapProcessingMode in v2.
--
Best Regards
Andy Fan
Attachments:
v2-0001-Don-t-record-commit_ts-in-bootstarp-mode.patchtext/x-diffDownload
From def54f2c6a73aec74aebde9b687612f65a8bcdd5 Mon Sep 17 00:00:00 2001
From: Andy Fan <zhihuifan1213@163.com>
Date: Wed, 2 Jul 2025 01:23:34 +0000
Subject: [PATCH v2 1/1] Don't record commit_ts in bootstarp mode.
This case would raise Assert failure in TransactionIdSetCommitTs when
initdb with track_commit_timestamp=on.
Another option might be allowing BooststrapTransactionId in
TransactionIdSetCommitTs, but this adds a new change in
commit_ts module, which risk could be avoided easily with current
solution.
---
src/backend/access/transam/commit_ts.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 113fae1437a..1aa2dc450f5 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -157,6 +157,12 @@ TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
if (!commitTsShared->commitTsActive)
return;
+ /*
+ * Don't bother to record commit_ts for Booststrap mode.
+ */
+ if (IsBootstrapProcessingMode())
+ return;
+
/*
* Figure out the latest Xid in this batch: either the last subxid if
* there's any, otherwise the parent xid.
--
2.45.1
Hi,
On Wed, Jul 02, 2025 at 01:38:18AM +0000, Andy Fan wrote:
Michael Paquier <michael@paquier.xyz> writes:
Hi,
On Wed, Jul 02, 2025 at 12:38:01AM +0000, Andy Fan wrote:
However this is not true in BootstrapMode, this failure is masked by
default because TransactionTreeSetCommitTsData returns fast when
track_commit_timestamp is off.Agreed that there is no point in registering a commit timestamp in
the cases of a frozen and bootstrap XIDs. I would recommend to keep
the assertion in TransactionIdSetCommitTs(), though, that still looks
useful to me for the many callers of this routine, at least as a
sanity check.Yes, The assert also guard the InvalidTransactionId. So I removed this
solution in v2. Another reason for this is: if we allowed
BooststrapTransactionId in the commit_ts, it introduces something new to
this module when initdb with track_commit_timestamp=on. This risk might
be very low, but it can be avoided easily with the another solution.I did not check, but usually we apply filters based on
IsBootstrapProcessingMode() for code paths that we do not want to
reach while in bootstrap mode. Could the same be done here?I think you are right. so I used IsBootstrapProcessingMode in v2.
Thanks for the report and the patch.
Yeah, I also think that making use of IsBootstrapProcessingMode() is the
right thing to do here.
=== 1
+ * Don't bother to record commit_ts for Booststrap mode.
typo: s/Booststrap/Bootstrap/
Also, grep on "Bootstrap mode" and "bootstrap mode" gives much more occurrences
for the later, so maybe use "bootstrap mode" instead?
=== 2
The FrozenTransactionId case is missing in v2, is that on purpose?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Jul 02, 2025 at 09:03:35AM +0000, Bertrand Drouvot wrote:
+ * Don't bother to record commit_ts for Booststrap mode.
typo: s/Booststrap/Bootstrap/
Also, grep on "Bootstrap mode" and "bootstrap mode" gives much more occurrences
for the later, so maybe use "bootstrap mode" instead?=== 2
The FrozenTransactionId case is missing in v2, is that on purpose?
There may be an argument about bypassing the frozen case as well, but
I cannot get excited about that because it can also be very useful to
detect problems as calling this routine with a frozen XID would be
really wrong. The assert in TransactionIdSetCommitTs() is enough for
this purpose, no need to add something else. As of now
TransactionTreeSetCommitTsData() is only called with an active XID at
commit, but if somebody has too much imagination in a module..
As a whole I think that the addition of IsBootstrapProcessingMode() in
TransactionTreeSetCommitTsData() is enough. There may be an argument
about backpatching this change. However, nobody has complained about
the idea of enabling the commit_ts module while bootstrapping, so I'd
keep it as a HEAD-only thing and be conservative.
A second thing I would suggest is some test coverage, with the
addition of extra => [ '-c', "track_commit_timestamp=on=on" ] in an
init() method called in one of the script tests of
src/test/modules/commit_ts/t. Just changing 001_base.pl should be
enough to trigger your original failure, making sure that this never
breaks again.
--
Michael
On 2025/07/02 9:38, Andy Fan wrote:
Hi,
When working with the commit_ts module, I find the following issue:
After configure with --enable-cassert option, then initdb with:
initdb -D x2 -c track_commit_timestamp=on
Then we can get the following core dump:
Is this the same issue that was discussed in [1]/messages/by-id/OSCPR01MB14966FF9E4C4145F37B937E52F5102@OSCPR01MB14966.jpnprd01.prod.outlook.com?
Regards,
[1]: /messages/by-id/OSCPR01MB14966FF9E4C4145F37B937E52F5102@OSCPR01MB14966.jpnprd01.prod.outlook.com
--
Fujii Masao
NTT DATA Japan Corporation
On Thu, Jul 03, 2025 at 09:48:40AM +0900, Fujii Masao wrote:
Is this the same issue that was discussed in [1]?
[1] /messages/by-id/OSCPR01MB14966FF9E4C4145F37B937E52F5102@OSCPR01MB14966.jpnprd01.prod.outlook.com
Ah, indeed, so it was reported a couple of months ago. I am not sure
that the argument about all the other GUCs potentially impacted holds
much value; we are talking about a specific code path.
--
Michael
Dear Michael, Fujii-san,
Ah, indeed, so it was reported a couple of months ago. I am not sure
that the argument about all the other GUCs potentially impacted holds
much value; we are talking about a specific code path.
Yeah, I did report but sadly it was missed by others :-(. To clarify,
The current patch looks good to me.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
"Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> writes:
Dear Michael, Fujii-san,
Ah, indeed, so it was reported a couple of months ago. I am not sure
that the argument about all the other GUCs potentially impacted holds
much value; we are talking about a specific code path.Yeah, I did report but sadly it was missed by others :-(. To clarify,
The current patch looks good to me.
Then I'd thank Michael to watch the maillist closely this time.
I checked the fix suggested by Hayato, I think his patch is better than
me because his patch checks at the startup time while my patch checks at
each time of RecordTransactionCommit. So v3 takes his patch. v3 also
added the testcase suggested by Michael for test coverage, it clearly
proves the bug is fixed now.
--
Best Regards
Andy Fan
Attachments:
v3-0001-Avoid-activating-commit_ts-while-bootstrap.patchtext/x-diffDownload
From dab43830af019896ef1811aeeeb5d1e4e2bcccea Mon Sep 17 00:00:00 2001
From: Andy Fan <zhihuifan1213@163.com>
Date: Thu, 3 Jul 2025 13:04:02 +0000
Subject: [PATCH v3 1/1] Avoid activating commit_ts while bootstrap.
TransactionIdSetCommitTs had an Assert failure when initdb with
track_commit_timestamp=on before this commit because it thought all the
provided xids are normal xid. So avoiding the activating commit_ts in
bootstrap mode could fix this issue. Test case is also enhanced to
discover this issue.
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Andy Fan <zhihuifan1213@163.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/OSCPR01MB14966FF9E4C4145F37B937E52F5102@OSCPR01MB14966.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/87plejmnpy.fsf%40163.com
---
src/backend/access/transam/commit_ts.c | 7 +++++++
src/test/modules/commit_ts/t/001_base.pl | 2 +-
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 113fae1437a..15078698cf9 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -707,6 +707,13 @@ ActivateCommitTs(void)
TransactionId xid;
int64 pageno;
+ /*
+ * commit_ts assumes that we are not in the bootstrap mode: skip the
+ * activation.
+ */
+ if (IsBootstrapProcessingMode())
+ return;
+
/* If we've done this already, there's nothing to do */
LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
if (commitTsShared->commitTsActive)
diff --git a/src/test/modules/commit_ts/t/001_base.pl b/src/test/modules/commit_ts/t/001_base.pl
index 1953b18f6b3..91983fc5179 100644
--- a/src/test/modules/commit_ts/t/001_base.pl
+++ b/src/test/modules/commit_ts/t/001_base.pl
@@ -11,7 +11,7 @@ use Test::More;
use PostgreSQL::Test::Cluster;
my $node = PostgreSQL::Test::Cluster->new('foxtrot');
-$node->init;
+$node->init(extra => ['-c', "track_commit_timestamp=on"]);
$node->append_conf('postgresql.conf', 'track_commit_timestamp = on');
$node->start;
--
2.45.1
On 2025/07/03 22:31, Andy Fan wrote:
"Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> writes:
Dear Michael, Fujii-san,
Ah, indeed, so it was reported a couple of months ago. I am not sure
that the argument about all the other GUCs potentially impacted holds
much value; we are talking about a specific code path.Yeah, I did report but sadly it was missed by others :-(. To clarify,
The current patch looks good to me.Then I'd thank Michael to watch the maillist closely this time.
I checked the fix suggested by Hayato, I think his patch is better than
me because his patch checks at the startup time while my patch checks at
each time of RecordTransactionCommit. So v3 takes his patch. v3 also
added the testcase suggested by Michael for test coverage, it clearly
proves the bug is fixed now.
The patch looks good to me.
By the way, although it's a separate issue, I noticed that running
initdb -c transaction_timeout=1 causes an assertion failure:
running bootstrap script ... TRAP: failed Assert("all_timeouts_initialized"), File: "timeout.c", Line: 164, PID: 22057
0 postgres 0x00000001105d9d02 ExceptionalCondition + 178
1 postgres 0x0000000110612af7 enable_timeout + 55
2 postgres 0x0000000110612aa9 enable_timeout_after + 73
3 postgres 0x000000010fead8e0 StartTransaction + 816
4 postgres 0x000000010fead4a1 StartTransactionCommand + 65
5 postgres 0x000000010fef01de BootstrapModeMain + 1518
6 postgres 0x0000000110167ef4 main + 676
7 dyld 0x00007ff805092530 start + 3056
child process was terminated by signal 6: Abort trap: 6
This happens because enable_timeout() tries to activate the transaction
timeout before InitializeTimeouts() has been called — in other words,
the timeout system hasn't been initialized yet. To fix this, we might
need to forcibly set transaction_timeout to 0 during bootstrap.
Regards,
--
Fujii Masao
NTT DATA Japan Corporation
On Fri, Jul 04, 2025 at 01:26:19AM +0900, Fujii Masao wrote:
On 2025/07/03 22:31, Andy Fan wrote:
I checked the fix suggested by Hayato, I think his patch is better than
me because his patch checks at the startup time while my patch checks at
each time of RecordTransactionCommit. So v3 takes his patch. v3 also
added the testcase suggested by Michael for test coverage, it clearly
proves the bug is fixed now.The patch looks good to me.
I was wondering if we should backpatch that, and decided towards a
yes here as it can be annoying. 3e51b278db6a has introduced the -c
switch in initdb in v16, but that could be reached as well if one
touches at some initialization path, perhaps in a fork. Done, down
to v13.
Let's tackle the rest separately.
--
Michael
Dear Fujii-san,
By the way, although it's a separate issue, I noticed that running
initdb -c transaction_timeout=1 causes an assertion failure:
I feel it may be able to discuss in other places but let me say one comment.
running bootstrap script ... TRAP: failed Assert("all_timeouts_initialized"), File:
"timeout.c", Line: 164, PID: 22057
0 postgres 0x00000001105d9d02
ExceptionalCondition + 178
1 postgres 0x0000000110612af7
enable_timeout + 55
2 postgres 0x0000000110612aa9
enable_timeout_after + 73
3 postgres 0x000000010fead8e0
StartTransaction + 816
4 postgres 0x000000010fead4a1
StartTransactionCommand + 65
5 postgres 0x000000010fef01de
BootstrapModeMain + 1518
6 postgres 0x0000000110167ef4 main + 676
7 dyld 0x00007ff805092530 start + 3056
child process was terminated by signal 6: Abort trap: 6This happens because enable_timeout() tries to activate the transaction
timeout before InitializeTimeouts() has been called ? in other words,
the timeout system hasn't been initialized yet. To fix this, we might
need to forcibly set transaction_timeout to 0 during bootstrap.
If more GUCs were found which cannot be set during the bootstrap mode, how about
introducing a new flag like GUC_DEFAULT_WHILE_BOOTSTRAPPING for GUC variables?
If the flag is set all setting can be ignored when
IsBootstrapProcessingMode() = true.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On 2025/07/04 16:29, Hayato Kuroda (Fujitsu) wrote:
Dear Fujii-san,
By the way, although it's a separate issue, I noticed that running
initdb -c transaction_timeout=1 causes an assertion failure:I feel it may be able to discuss in other places
OK, I've started a new thread for this issue at [1]/messages/by-id/a68fae7d-f45a-4c70-8d90-2a2cd3bdcfca@oss.nttdata.com.
If more GUCs were found which cannot be set during the bootstrap mode, how about
introducing a new flag like GUC_DEFAULT_WHILE_BOOTSTRAPPING for GUC variables?
If the flag is set all setting can be ignored when
IsBootstrapProcessingMode() = true.
If there are many GUCs that behave incorrectly during bootstrap,
a general mechanism like that might be worth considering. But if
only a few GUCs are affected, as I believe is the case,
then such a mechanism may be overkill.
In that case, IMO it should be sufficient to disable the problematic
GUCs individually, for example by calling SetConfigOption(..., PGC_S_OVERRIDE).
Regards,
[1]: /messages/by-id/a68fae7d-f45a-4c70-8d90-2a2cd3bdcfca@oss.nttdata.com
--
Fujii Masao
NTT DATA Japan Corporation
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
On 2025/07/04 16:29, Hayato Kuroda (Fujitsu) wrote:
If more GUCs were found which cannot be set during the bootstrap mode, how about
introducing a new flag like GUC_DEFAULT_WHILE_BOOTSTRAPPING for GUC variables?
If the flag is set all setting can be ignored when
IsBootstrapProcessingMode() = true.
If there are many GUCs that behave incorrectly during bootstrap,
a general mechanism like that might be worth considering. But if
only a few GUCs are affected, as I believe is the case,
then such a mechanism may be overkill.
As I remarked in the other thread, I don't like inventing a different
solution for each GUC. So if there are even two that need something
done, I think Hayato-san's idea has merit.
The core of the patch could be as little as
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 667df448732..43f289924e6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3464,6 +3464,15 @@ set_config_with_handle(const char *name, config_handle *handle,
return 0;
}
+ /*
+ * Certain GUCs aren't safe to enable during bootstrap mode. Silently
+ * ignore attempts to set them to non-default values.
+ */
+ if (unlikely(IsBootstrapProcessingMode()) &&
+ (record->flags & GUC_IGNORE_IN_BOOTSTRAP) &&
+ source != PGC_S_DEFAULT)
+ changeVal = false;
+
/*
* Check if the option can be set at this time. See guc.h for the precise
* rules.
If we went this way, we'd presumably revert 5a6c39b6d in favor
of marking track_commit_timestamp with this flag.
regards, tom lane
On 2025/07/05 0:30, Tom Lane wrote:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
On 2025/07/04 16:29, Hayato Kuroda (Fujitsu) wrote:
If more GUCs were found which cannot be set during the bootstrap mode, how about
introducing a new flag like GUC_DEFAULT_WHILE_BOOTSTRAPPING for GUC variables?
If the flag is set all setting can be ignored when
IsBootstrapProcessingMode() = true.If there are many GUCs that behave incorrectly during bootstrap,
a general mechanism like that might be worth considering. But if
only a few GUCs are affected, as I believe is the case,
then such a mechanism may be overkill.As I remarked in the other thread, I don't like inventing a different
solution for each GUC. So if there are even two that need something
done, I think Hayato-san's idea has merit.The core of the patch could be as little as
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 667df448732..43f289924e6 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3464,6 +3464,15 @@ set_config_with_handle(const char *name, config_handle *handle, return 0; }+ /* + * Certain GUCs aren't safe to enable during bootstrap mode. Silently + * ignore attempts to set them to non-default values. + */ + if (unlikely(IsBootstrapProcessingMode()) && + (record->flags & GUC_IGNORE_IN_BOOTSTRAP) && + source != PGC_S_DEFAULT) + changeVal = false; + /* * Check if the option can be set at this time. See guc.h for the precise * rules.
This code seems to assume that the processing mode is switched to bootstrap before
GUC parameters are processed. But is that actually the case?
Regards,
--
Fujii Masao
NTT DATA Japan Corporation
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
On 2025/07/05 0:30, Tom Lane wrote:
As I remarked in the other thread, I don't like inventing a different
solution for each GUC. So if there are even two that need something
done, I think Hayato-san's idea has merit.
This code seems to assume that the processing mode is switched to bootstrap before
GUC parameters are processed. But is that actually the case?
Oh, good point. But there doesn't seem to be any ill effect from
making BootstrapModeMain set BootstrapProcessing a bit earlier.
Attached is a proof-of-concept that I've actually tested.
However, what I find with this POC is that
initdb -c transaction_timeout=10s
goes through fine, but (at least on my machine)
initdb -c transaction_timeout=1
yields
...
running bootstrap script ... ok
performing post-bootstrap initialization ... 2025-07-04 13:08:04.225 EDT [261836] FATAL: terminating connection due to transaction timeout
child process exited with exit code 1
because 1ms is not enough time to complete the post-bootstrap run.
I would argue that that's pilot error and we did exactly what the
user demanded, but is there anyone who wants to say that we should
suppress such GUCs during post-bootstrap too?
regards, tom lane
Attachments:
poc-survive-transaction_timeout-in-initdb.patchtext/x-diff; charset=us-ascii; name=poc-survive-transaction_timeout-in-initdb.patchDownload
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index fc8638c1b61..facad43c74c 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -220,6 +220,9 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
argv++;
argc--;
+ SetProcessingMode(BootstrapProcessing);
+ IgnoreSystemIndexes = true;
+
while ((flag = getopt(argc, argv, "B:c:d:D:Fkr:X:-:")) != -1)
{
switch (flag)
@@ -321,9 +324,6 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
CreateDataDirLockFile(false);
- SetProcessingMode(BootstrapProcessing);
- IgnoreSystemIndexes = true;
-
InitializeMaxBackends();
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 667df448732..9555b363c34 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3464,6 +3464,15 @@ set_config_with_handle(const char *name, config_handle *handle,
return 0;
}
+ /*
+ * Certain GUCs aren't safe to enable during bootstrap mode. Silently
+ * ignore attempts to set them to non-default values.
+ */
+ if (unlikely(IsBootstrapProcessingMode()) &&
+ strcmp(record->name, "transaction_timeout") == 0 &&
+ source != PGC_S_DEFAULT)
+ changeVal = false;
+
/*
* Check if the option can be set at this time. See guc.h for the precise
* rules.
On Fri, Jul 04, 2025 at 11:30:17AM -0400, Tom Lane wrote:
As I remarked in the other thread, I don't like inventing a different
solution for each GUC. So if there are even two that need something
done, I think Hayato-san's idea has merit.+ /* + * Certain GUCs aren't safe to enable during bootstrap mode. Silently + * ignore attempts to set them to non-default values. + */ + if (unlikely(IsBootstrapProcessingMode()) && + (record->flags & GUC_IGNORE_IN_BOOTSTRAP) && + source != PGC_S_DEFAULT) + changeVal = false; + /* * Check if the option can be set at this time. See guc.h for the precise * rules.
This is assuming that the default value assigned to a GUC will always
take the right decision in the bootstrap case, which is perhaps OK
anyway in most cases, or we would know about that during initdb. I'd
be OK with something among these lines, yes. I have doubts if we can
really safely place the initial GUC loading to happen after the
postmaster is switched to bootstrap mode, and if it's a viable
strategy in the long-term..
If we went this way, we'd presumably revert 5a6c39b6d in favor
of marking track_commit_timestamp with this flag.
Makes sense, on HEAD.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
This is assuming that the default value assigned to a GUC will always
take the right decision in the bootstrap case, which is perhaps OK
anyway in most cases, or we would know about that during initdb.
Yeah, I've been wondering about whether the code ought to accept
source == PGC_S_DYNAMIC_DEFAULT. It doesn't matter until/unless
we need to set this flag on a GUC that has code to compute a
dynamic default, so any decision we made right now would be made
in a vacuum ... but perhaps the right guess is to allow it.
If we went this way, we'd presumably revert 5a6c39b6d in favor
of marking track_commit_timestamp with this flag.
Makes sense, on HEAD.
Well, you back-patched 5a6c39b6d, so it's not clear to me why
we wouldn't want to back-patch something to fix any other GUC
suffering from a comparable problem. I don't see that adding
another possible GUC flag is an ABI break, especially when it's
a flag that no extension could have a need to set.
regards, tom lane
On 2025/07/05 2:17, Tom Lane wrote:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
On 2025/07/05 0:30, Tom Lane wrote:
As I remarked in the other thread, I don't like inventing a different
solution for each GUC. So if there are even two that need something
done, I think Hayato-san's idea has merit.This code seems to assume that the processing mode is switched to bootstrap before
GUC parameters are processed. But is that actually the case?Oh, good point. But there doesn't seem to be any ill effect from
making BootstrapModeMain set BootstrapProcessing a bit earlier.
Maybe. But I noticed that your patch also moves the line "IgnoreSystemIndexes = true;"
earlier. Why did you make this change?
This could cause initdb to fail with a PANIC error when run with ignore_system_indexes=off,
like this:
$ initdb -D data -c ignore_system_indexes=off
...
FATAL: could not open relation with OID 2703
PANIC: cannot abort transaction 1, it was already committed
So perhaps "IgnoreSystemIndexes = true;" should be placed after GUCs are processed?
Or GUC ignore_system_indexes also should be treated in the same way
as transaction_timeout?
Regards,
--
Fujii Masao
NTT DATA Japan Corporation
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
On 2025/07/05 2:17, Tom Lane wrote:
Oh, good point. But there doesn't seem to be any ill effect from
making BootstrapModeMain set BootstrapProcessing a bit earlier.
Maybe. But I noticed that your patch also moves the line "IgnoreSystemIndexes = true;"
earlier. Why did you make this change?
It just seemed to go with the bootstrap-mode setting. But your
example shows differently:
This could cause initdb to fail with a PANIC error when run with ignore_system_indexes=off,
like this:
$ initdb -D data -c ignore_system_indexes=off
...
FATAL: could not open relation with OID 2703
PANIC: cannot abort transaction 1, it was already committed
So perhaps "IgnoreSystemIndexes = true;" should be placed after GUCs are processed?
Yeah, we should do it like that (and probably also have a comment...)
Or GUC ignore_system_indexes also should be treated in the same way
as transaction_timeout?
Yes, I'd say we ought to mark that GUC as don't-accept-in-bootstrap
too. I've not done any research about what other GUCs can break
initdb, but now I'm starting to suspect there are several.
BTW, I now realize that this is only an issue starting from v16.
Before that initdb didn't have a -c switch, so there was not a
way for people to shove random settings into it.
regards, tom lane
I wrote:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
Or GUC ignore_system_indexes also should be treated in the same way
as transaction_timeout?
Yes, I'd say we ought to mark that GUC as don't-accept-in-bootstrap
too. I've not done any research about what other GUCs can break
initdb, but now I'm starting to suspect there are several.
Here's a fleshed-out implementation of Hayato-san's idea. I've
not done anything about reverting 5a6c39b6d, nor have I done any
checks to see if there are other GUCs we ought to mark similarly.
(But at this point I'd be prepared to bet that there are.)
regards, tom lane
Attachments:
v1-disallow-setting-some-GUCs-in-bootstrap.patchtext/x-diff; charset=us-ascii; name=v1-disallow-setting-some-GUCs-in-bootstrap.patchDownload
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index fc8638c1b61..f6ca9e8632c 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -213,6 +213,12 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
/* Set defaults, to be overridden by explicit options below */
InitializeGUCOptions();
+ /* Override ignore_system_indexes: we have no indexes during bootstrap */
+ IgnoreSystemIndexes = true;
+
+ /* Set bootstrap mode; note that this locks down values of some GUCs */
+ SetProcessingMode(BootstrapProcessing);
+
/* an initial --boot or --check should be present */
Assert(argc > 1
&& (strcmp(argv[1], "--boot") == 0
@@ -321,9 +327,6 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
CreateDataDirLockFile(false);
- SetProcessingMode(BootstrapProcessing);
- IgnoreSystemIndexes = true;
-
InitializeMaxBackends();
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 667df448732..215a20a1f06 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3464,6 +3464,15 @@ set_config_with_handle(const char *name, config_handle *handle,
return 0;
}
+ /*
+ * Certain GUCs aren't safe to enable during bootstrap mode. Silently
+ * ignore attempts to set them to non-default values.
+ */
+ if (unlikely(IsBootstrapProcessingMode()) &&
+ (record->flags & GUC_NOT_IN_BOOTSTRAP) &&
+ source > PGC_S_DYNAMIC_DEFAULT)
+ changeVal = false;
+
/*
* Check if the option can be set at this time. See guc.h for the precise
* rules.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 511dc32d519..064c6ba09e2 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1089,7 +1089,8 @@ struct config_bool ConfigureNamesBool[] =
{
{"track_commit_timestamp", PGC_POSTMASTER, REPLICATION_SENDING,
gettext_noop("Collects transaction commit time."),
- NULL
+ NULL,
+ GUC_NOT_IN_BOOTSTRAP
},
&track_commit_timestamp,
false,
@@ -1929,7 +1930,7 @@ struct config_bool ConfigureNamesBool[] =
gettext_noop("Disables reading from system indexes."),
gettext_noop("It does not prevent updating the indexes, so it is safe "
"to use. The worst consequence is slowness."),
- GUC_NOT_IN_SAMPLE
+ GUC_NOT_IN_SAMPLE | GUC_NOT_IN_BOOTSTRAP
},
&IgnoreSystemIndexes,
false,
@@ -2763,7 +2764,7 @@ struct config_int ConfigureNamesInt[] =
{"transaction_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the maximum allowed duration of any transaction within a session (not a prepared transaction)."),
gettext_noop("0 disables the timeout."),
- GUC_UNIT_MS
+ GUC_UNIT_MS | GUC_NOT_IN_BOOTSTRAP
},
&TransactionTimeout,
0, 0, INT_MAX,
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index f619100467d..0e7e97dabf0 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -228,6 +228,7 @@ typedef enum
0x002000 /* can't set in PG_AUTOCONF_FILENAME */
#define GUC_RUNTIME_COMPUTED 0x004000 /* delay processing in 'postgres -C' */
#define GUC_ALLOW_IN_PARALLEL 0x008000 /* allow setting in parallel mode */
+#define GUC_NOT_IN_BOOTSTRAP 0x010000 /* can't set in bootstrap mode */
#define GUC_UNIT_KB 0x01000000 /* value is in kilobytes */
#define GUC_UNIT_BLOCKS 0x02000000 /* value is in blocks */
I wrote:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
Or GUC ignore_system_indexes also should be treated in the same way
as transaction_timeout?Yes, I'd say we ought to mark that GUC as don't-accept-in-bootstrap
too. I've not done any research about what other GUCs can break
initdb, but now I'm starting to suspect there are several.Here's a fleshed-out implementation of Hayato-san's idea. I've
not done anything about reverting 5a6c39b6d, nor have I done any
checks to see if there are other GUCs we ought to mark similarly.
(But at this point I'd be prepared to bet that there are.)
I pay my attention to two cases, both of them are good.
(1). Revert the old commit 5a6c39b6d first, and apply your patch. verify
initdb with -c transaction_timeout and track_commit_timestamp, both of
them works well. transaction_timeout with a smaller value raise
transaction_timeout error. and a biggger value works well.
(2). after (1), check values of transaction_timeout and
track_commit_timestamp in postgres.conf, both of them are good (with the
default value as the user provided in the initdb commandline).
So the patch looks good to me, thanks for paying attention to this
issue!
--
Best Regards
Andy Fan
On 2025/07/06 3:00, Tom Lane wrote:
I wrote:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
Or GUC ignore_system_indexes also should be treated in the same way
as transaction_timeout?Yes, I'd say we ought to mark that GUC as don't-accept-in-bootstrap
too. I've not done any research about what other GUCs can break
initdb, but now I'm starting to suspect there are several.Here's a fleshed-out implementation of Hayato-san's idea. I've
not done anything about reverting 5a6c39b6d, nor have I done any
checks to see if there are other GUCs we ought to mark similarly.
(But at this point I'd be prepared to bet that there are.)
Thanks for the patch! It looks good to me.
Shouldn't we also add a TAP test to verify that initdb works correctly
with GUCs marked as GUC_NOT_IN_BOOTSTRAP?
Regards,
--
Fujii Masao
NTT DATA Japan Corporation
On Sat, Jul 05, 2025 at 02:00:07PM -0400, Tom Lane wrote:
Here's a fleshed-out implementation of Hayato-san's idea. I've
not done anything about reverting 5a6c39b6d, nor have I done any
checks to see if there are other GUCs we ought to mark similarly.
(But at this point I'd be prepared to bet that there are.)
I've been reading through the patch (not tested), and no objections
with the concept here. I would have kept that a HEAD-only change due
to the proposed location for SetProcessingMode(), but, well, if I'm
in minority that's fine by me.
--
Michael
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
Shouldn't we also add a TAP test to verify that initdb works correctly
with GUCs marked as GUC_NOT_IN_BOOTSTRAP?
After revert 5a6c39b6d, the test case could be as simply as below: I
also tested this change. Just FYI.
modified src/test/modules/commit_ts/t/001_base.pl
@@ -11,8 +11,7 @@ use Test::More;
use PostgreSQL::Test::Cluster;
my $node = PostgreSQL::Test::Cluster->new('foxtrot');
-$node->init;
-$node->append_conf('postgresql.conf', 'track_commit_timestamp = on');
+$node->init(extra => ['-c', 'track_commit_timestamp=on', "-c", "transaction_timeout=10s" ]);
$node->start;
# Create a table, compare "now()" to the commit TS of its xmin
--
Best Regards
Andy Fan
Dear Fujii-san, Andy,
Shouldn't we also add a TAP test to verify that initdb works correctly
with GUCs marked as GUC_NOT_IN_BOOTSTRAP?
Another place we can put the test is 001_initdb.pl, like:
```
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -331,4 +331,15 @@ command_fails(
[ 'pg_checksums', '--pgdata' => $datadir_nochecksums ],
"pg_checksums fails with data checksum disabled");
+# Some GUCs like track_commit_timestamp cannot be set to non-default value in
+# bootstrap mode becasue they may cause crash. Ensure settings can be surely
+# ignored.
+command_ok(
+ [
+ 'initdb', "$tempdir/dataXX",
+ '-c' => 'track_commit_timestamp=on',
+ '-c' => 'transaction_timeout=200s'
+ ],
+ 'successful creation with ignored settings');
+
```
But both Andy's patch and mine assume that post-bootstrap transactions can be
finished within the specified time. Extremely long value is set above but I
cannot say all machine won't spend 200s here...
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Wed, Jul 09, 2025 at 02:39:22AM +0000, Hayato Kuroda (Fujitsu) wrote:
+# Some GUCs like track_commit_timestamp cannot be set to non-default value in +# bootstrap mode becasue they may cause crash. Ensure settings can be surely +# ignored. +command_ok( + [ + 'initdb', "$tempdir/dataXX", + '-c' => 'track_commit_timestamp=on', + '-c' => 'transaction_timeout=200s' + ], + 'successful creation with ignored settings'); + ```
I'd suggest to keep them separate across multiple scripts, where they
hold meaning, as one failure may get hidden by the other.
track_commit_timestamp makes sense in the test module commit_ts, we
should select a second location for transaction_timeout if we keep it
at the end.
But both Andy's patch and mine assume that post-bootstrap transactions can be
finished within the specified time. Extremely long value is set above but I
cannot say all machine won't spend 200s here...
A fresh initdb can be longer than this threshold under
CLOBBER_CACHE_ALWAYS, if my memory serves me well. There are some
machines with a valgrind setup, additionally, that can take some time,
but I am not sure about their timings when it comes to a bootstrap
setup.
--
Michael
Dear Michael,
I'd suggest to keep them separate across multiple scripts, where they
hold meaning, as one failure may get hidden by the other.
track_commit_timestamp makes sense in the test module commit_ts, we
should select a second location for transaction_timeout if we keep it
at the end.
OK, so track_commit_timestamp can be tested like what initially did:
```
--- a/src/test/modules/commit_ts/t/001_base.pl
+++ b/src/test/modules/commit_ts/t/001_base.pl
@@ -11,8 +11,7 @@ use Test::More;
use PostgreSQL::Test::Cluster;
my $node = PostgreSQL::Test::Cluster->new('foxtrot');
-$node->init;
-$node->append_conf('postgresql.conf', 'track_commit_timestamp = on');
+$node->init(extra => [ '-c', 'track_commit_timestamp=on' ]);
$node->start;
```
A fresh initdb can be longer than this threshold under
CLOBBER_CACHE_ALWAYS, if my memory serves me well. There are some
machines with a valgrind setup, additionally, that can take some time,
but I am not sure about their timings when it comes to a bootstrap
setup.
Hmm. So I felt that we should not add tests for transaction_timeout for such a slow
environment. Thought?
Best regards,
Hayato Kuroda
FUJITSU LIMITED