Do XID sequences need to be contiguous?
Hackers,
While working on the problem of XID wraparound within the LISTEN/NOTIFY
system, I tried to increment XIDs by more than one per transaction.
This leads to a number of test failures, many which look like:
+ERROR: could not access status of transaction 7485
+DETAIL: Could not read from file "pg_subtrans/0000" at offset 24576:
read too few bytes.
I might not have read the right documentation, but....
I do not see anything in src/backend/access/transam/README nor elsewhere
documenting a design decision or assumption that transaction IDs must
be assigned contiguously. I suppose this is such a fundamental
assumption that it is completely implicit and nobody thought to document
it, but I'd like to check for two reasons:
First, I'd like a good method of burning through transaction ids in
tests designed to check for problems in XID wrap-around.
Second, I'd like to add Asserts where appropriate regarding this
assumption. It seems strange to me that I should have gotten as far
as a failing read() without having tripped an Assert somewhere along the
way.
To duplicate the errors I hit, you can either apply this simple change:
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 33fd052156..360b7335bb 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -83,7 +83,7 @@ FullTransactionIdFromEpochAndXid(uint32 epoch,
TransactionId xid)
static inline void
FullTransactionIdAdvance(FullTransactionId *dest)
{
- dest->value++;
+ dest->value += 2;
while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
dest->value++;
}
or apply the much larger WIP patch, attached, and then be sure to
provide the --enable-xidcheck flag to configure before building.
--
Mark Dilger
Attachments:
xidcheck.WIP.patch.1text/plain; charset=UTF-8; name=xidcheck.WIP.patch.1Download
diff --git a/configure b/configure
index 1d88983b34..909907127a 100755
--- a/configure
+++ b/configure
@@ -841,6 +841,7 @@ with_CC
with_llvm
enable_depend
enable_cassert
+enable_xidcheck
enable_thread_safety
with_icu
with_tcl
@@ -1521,6 +1522,7 @@ Optional Features:
--enable-tap-tests enable TAP tests (requires Perl and IPC::Run)
--enable-depend turn on automatic dependency tracking
--enable-cassert enable assertion checks (for debugging)
+ --enable-xidcheck enable transactionid checks (for debugging)
--disable-thread-safety disable thread-safety in client libraries
--disable-largefile omit support for large files
@@ -7197,6 +7199,36 @@ fi
+#
+# Enable transactionid checks
+#
+
+
+# Check whether --enable-xidcheck was given.
+if test "${enable_xidcheck+set}" = set; then :
+ enableval=$enable_xidcheck;
+ case $enableval in
+ yes)
+
+$as_echo "#define USE_XID_CHECKING 1" >>confdefs.h
+
+ ;;
+ no)
+ :
+ ;;
+ *)
+ as_fn_error $? "no argument expected for --enable-xidcheck option" "$LINENO" 5
+ ;;
+ esac
+
+else
+ enable_xidcheck=no
+
+fi
+
+
+
+
#
# Include directories
#
diff --git a/configure.in b/configure.in
index a2cb20b5e3..5aac6c24f1 100644
--- a/configure.in
+++ b/configure.in
@@ -680,6 +680,14 @@ PGAC_ARG_BOOL(enable, cassert, no, [enable assertion checks (for debugging)],
[Define to 1 to build with assertion checks. (--enable-cassert)])])
+#
+# Enable transactionid checks
+#
+PGAC_ARG_BOOL(enable, xidcheck, no, [enable transactionid checks (for debugging)],
+ [AC_DEFINE([USE_XID_CHECKING], 1,
+ [Define to 1 to build with transactionid checks. (--enable-xidcheck)])])
+
+
#
# Include directories
#
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d4d1fe45cc..a3dbe27640 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9246,6 +9246,26 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-debug-xidcheck" xreflabel="debug_xidcheck">
+ <term><varname>debug_xidcheck</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>debug_xidcheck</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports whether <productname>PostgreSQL</productname> has been built
+ with transaction id checking enabled. That is the case if the
+ macro <symbol>USE_XID_CHECKING</symbol> is defined
+ when <productname>PostgreSQL</productname> is built (accomplished
+ e.g. by the <command>configure</command> option
+ <option>--enable-xidcheck</option>). By
+ default <productname>PostgreSQL</productname> is built without
+ transaction id checking.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-integer-datetimes" xreflabel="integer_datetimes">
<term><varname>integer_datetimes</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 9c10a897f1..eedfd021a0 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1524,6 +1524,19 @@ build-postgresql:
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--enable-xidcheck</option></term>
+ <listitem>
+ <para>
+ Enables <firstterm>transaction ID</firstterm> checks in the server, which test
+ the 64-bit transaction ID management. This may prove valuable for
+ code development purposes, but enabling this feature will consume a small
+ amount of shared memory and might slow down the server. Enabling this feature
+ will not improve the stability of your server.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>--enable-tap-tests</option></term>
<listitem>
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index b18eee42d4..d26ad71e8a 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -50,6 +50,9 @@ GetNewTransactionId(bool isSubXact)
{
FullTransactionId full_xid;
TransactionId xid;
+#if USE_XID_CHECKING
+ int64 xidstep;
+#endif
/*
* Workers synchronize transaction state at the beginning of each parallel
@@ -182,7 +185,18 @@ GetNewTransactionId(bool isSubXact)
* we want the next incoming transaction to try it again. We cannot
* assign more XIDs until there is CLOG space for them.
*/
+#if USE_XID_CHECKING
+ xidstep = ShmemVariableCache->xidAdvanceBy;
+ if (ShmemVariableCache->xidAdvanceRandom)
+ {
+ xidstep *= pg_erand48(0);
+ if (!xidstep)
+ xidstep = 1;
+ }
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid, xidstep);
+#else
FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+#endif
/*
* We must store the new XID into the shared ProcArray before releasing
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6b104695c0..08287bebec 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -536,3 +536,68 @@ pg_nextoid(PG_FUNCTION_ARGS)
return newoid;
}
+
+/*
+ * SQL callable interface.
+ *
+ * Function is intentionally not documented in the user facing docs.
+ */
+Datum
+getxidstep(PG_FUNCTION_ARGS)
+{
+#if USE_XID_CHECKING
+ int64 result;
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to call getxidstep()")));
+
+ /*
+ * This lock probably does not need to be exclusive, and might not
+ * be necessary at all. But for sanity, and since this is just for
+ * regression testing anyway, play it safe and use a strong lock.
+ */
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ result = ShmemVariableCache->xidAdvanceBy;
+ LWLockRelease(XidGenLock);
+
+ PG_RETURN_INT64(result);
+#else
+ PG_RETURN_INT64(1);
+#endif
+}
+
+/*
+ * SQL callable interface.
+ *
+ * Function is intentionally not documented in the user facing docs.
+ */
+Datum
+setxidstep(PG_FUNCTION_ARGS)
+{
+#if USE_XID_CHECKING
+ int64 oldstep;
+ int64 newstep = PG_GETARG_INT64(0);
+ bool newrand = PG_GETARG_BOOL(1);
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to call setxidstep()")));
+ if (newstep < 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("xid step value must be positive")));
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ oldstep = ShmemVariableCache->xidAdvanceBy;
+ ShmemVariableCache->xidAdvanceBy = newstep;
+ ShmemVariableCache->xidAdvanceRandom = newrand;
+ LWLockRelease(XidGenLock);
+
+ PG_RETURN_INT64(oldstep);
+#else
+ PG_RETURN_INT64(1);
+#endif
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index a1567d3559..b24770a2ca 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -144,6 +144,10 @@ InitShmemAllocation(void)
ShmemVariableCache = (VariableCache)
ShmemAlloc(sizeof(*ShmemVariableCache));
memset(ShmemVariableCache, 0, sizeof(*ShmemVariableCache));
+#if USE_XID_CHECKING
+ ShmemVariableCache->xidAdvanceBy = 1;
+ ShmemVariableCache->xidAdvanceRandom = false;
+#endif
}
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ba4edde71a..aed2d7b9a4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -582,6 +582,7 @@ static int wal_block_size;
static bool data_checksums;
static bool integer_datetimes;
static bool assert_enabled;
+static bool xidcheck_enabled;
static char *recovery_target_timeline_string;
static char *recovery_target_string;
static char *recovery_target_xid_string;
@@ -1271,6 +1272,20 @@ static struct config_bool ConfigureNamesBool[] =
#endif
NULL, NULL, NULL
},
+ {
+ {"debug_xidcheck", PGC_INTERNAL, PRESET_OPTIONS,
+ gettext_noop("Shows whether the running server has transaction id checks enabled."),
+ NULL,
+ GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &xidcheck_enabled,
+#ifdef USE_XID_CHECKING
+ true,
+#else
+ false,
+#endif
+ NULL, NULL, NULL
+ },
{
{"exit_on_error", PGC_USERSET, ERROR_HANDLING_OPTIONS,
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 33fd052156..93e1b8f6c0 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -81,9 +81,15 @@ FullTransactionIdFromEpochAndXid(uint32 epoch, TransactionId xid)
/* advance a FullTransactionId variable, stepping over special XIDs */
static inline void
+#if USE_XID_CHECKING
+FullTransactionIdAdvance(FullTransactionId *dest, int64 step)
+{
+ dest->value += step;
+#else
FullTransactionIdAdvance(FullTransactionId *dest)
{
dest->value++;
+#endif
while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
dest->value++;
}
@@ -163,6 +169,10 @@ typedef struct VariableCacheData
*/
FullTransactionId nextFullXid; /* next full XID to assign */
+#if USE_XID_CHECKING
+ int64 xidAdvanceBy; /* step value for advancing; typically just 1 */
+ bool xidAdvanceRandom; /* whether advancement should be randomized */
+#endif
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ac8f64b219..c93376418a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3210,6 +3210,16 @@
prorettype => 'oid', proargtypes => 'regclass name regclass',
prosrc => 'pg_nextoid' },
+{ oid => '9816', descr => 'return the step value for incrementing transactionid values',
+ proname => 'getxidstep', provolatile => 'v', proparallel => 'u',
+ prorettype => 'int8', proargtypes => '',
+ prosrc => 'getxidstep' },
+
+{ oid => '9817', descr => 'set the new step value and return the old step value for incrementing transactionid values',
+ proname => 'setxidstep', provolatile => 'v', proparallel => 'u',
+ prorettype => 'int8', proargtypes => 'int8 bool',
+ prosrc => 'setxidstep' },
+
{ oid => '1579', descr => 'I/O',
proname => 'varbit_in', prorettype => 'varbit',
proargtypes => 'cstring oid int4', prosrc => 'varbit_in' },
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c208dcdfc7..cd3a7cb1c7 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -885,6 +885,9 @@
/* Define to 1 to build with assertion checks. (--enable-cassert) */
#undef USE_ASSERT_CHECKING
+/* Define to 1 to build with transactionid checks. (--enable-xidcheck) */
+#undef USE_XID_CHECKING
+
/* Define to 1 to build with Bonjour support. (--with-bonjour) */
#undef USE_BONJOUR
diff --git a/src/test/regress/expected/xidtestsetup.out b/src/test/regress/expected/xidtestsetup.out
new file mode 100644
index 0000000000..f546beeb8e
--- /dev/null
+++ b/src/test/regress/expected/xidtestsetup.out
@@ -0,0 +1,9 @@
+-- If configured with xidcheck, henceforth, advance the next TransactionId by a
+-- random fraction of 1 billion each time the transaction id counter is advanced,
+-- otherwise do nothing special.
+SELECT setxidstep(1000000000, true);
+ setxidstep
+------------
+ 1
+(1 row)
+
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d33a4e143d..ceceb333fa 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -10,6 +10,12 @@
# interferes with crash-recovery testing.
test: tablespace
+# ----------
+# Set up transactionid checking, if --enable-xidtest was given, otherwise
+# do not much of anything
+# ----------
+test: xidtestsetup
+
# ----------
# The first group of parallel tests
# ----------
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index f86f5c5682..af709c6390 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -1,6 +1,7 @@
# src/test/regress/serial_schedule
# This should probably be in an order similar to parallel_schedule.
test: tablespace
+test: xidtestsetup
test: boolean
test: char
test: name
diff --git a/src/test/regress/sql/xidtestsetup.sql b/src/test/regress/sql/xidtestsetup.sql
new file mode 100644
index 0000000000..2b15db3509
--- /dev/null
+++ b/src/test/regress/sql/xidtestsetup.sql
@@ -0,0 +1,4 @@
+-- If configured with xidcheck, henceforth, advance the next TransactionId by a
+-- random fraction of 1 billion each time the transaction id counter is advanced,
+-- otherwise do nothing special.
+SELECT setxidstep(1000000000, true);
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 5f72530c72..631dc50f32 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -842,6 +842,7 @@ sub GetFakeConfigure
my $cfg = '--enable-thread-safety';
$cfg .= ' --enable-cassert' if ($self->{options}->{asserts});
$cfg .= ' --enable-nls' if ($self->{options}->{nls});
+ $cfg .= ' --enable-xidcheck' if ($self->{options}->{xidcheck});
$cfg .= ' --enable-tap-tests' if ($self->{options}->{tap_tests});
$cfg .= ' --with-ldap' if ($self->{options}->{ldap});
$cfg .= ' --without-zlib' unless ($self->{options}->{zlib});
diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index 2ef2cfc4e9..cb62e09982 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -12,6 +12,7 @@ our $config = {
gss => undef, # --with-gssapi=<path>
icu => undef, # --with-icu=<path>
nls => undef, # --enable-nls=<path>
+ xidchecks => undef, # --enable-xidcheck
tap_tests => undef, # --enable-tap-tests
tcl => undef, # --with-tcl=<path>
perl => undef, # --with-perl=<path>
Mark Dilger <hornschnorter@gmail.com> writes:
While working on the problem of XID wraparound within the LISTEN/NOTIFY
system, I tried to increment XIDs by more than one per transaction.
This leads to a number of test failures, many which look like:
IIRC, the XID-creation logic is designed to initialize the next clog
page whenever it allocates an exact-multiple-of-BLCKSZ*4 transaction
number. Skipping over such numbers would create trouble.
First, I'd like a good method of burning through transaction ids in
tests designed to check for problems in XID wrap-around.
Don't "burn through them". Stop the cluster and use pg_resetwal to
set the XID counter wherever you want it. (You might need to set it
just before a page or segment boundary; I'm not sure if pg_resetwal
has any logic of its own to initialize a new CLOG page/file when you
move the counter this way. Perhaps it's worth improving that.)
Second, I'd like to add Asserts where appropriate regarding this
assumption.
I'm not excited about that, and it's *certainly* not a problem that
justifies additional configure infrastructure.
regards, tom lane
On Fri, Nov 29, 2019 at 12:22 AM Mark Dilger <hornschnorter@gmail.com> wrote:
Hackers,
While working on the problem of XID wraparound within the LISTEN/NOTIFY
system, I tried to increment XIDs by more than one per transaction.
This leads to a number of test failures, many which look like:+ERROR: could not access status of transaction 7485 +DETAIL: Could not read from file "pg_subtrans/0000" at offset 24576: read too few bytes.I might not have read the right documentation, but....
I do not see anything in src/backend/access/transam/README nor elsewhere
documenting a design decision or assumption that transaction IDs must
be assigned contiguously. I suppose this is such a fundamental
assumption that it is completely implicit and nobody thought to document
it, but I'd like to check for two reasons:First, I'd like a good method of burning through transaction ids in
tests designed to check for problems in XID wrap-around.
As Tom pointed out and as mentioned in the comments "If we are
allocating the first XID of a new page of the commit log, zero out
that commit-log page before returning.", we need to take care of
extending the CLOG while advancing TransactionIds. I have some old
script for burning transactionid's which I am attaching here. It
might help you. I think this is provided long back by Jeff Janes.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachments:
transactionid_burner_v1.patchapplication/octet-stream; name=transactionid_burner_v1.patchDownload
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index cf3e964..d3d1c6d 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -33,6 +33,8 @@
/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
+int JJ_xid = 0;
+
/*
* Allocate the next XID for a new transaction or subtransaction.
@@ -168,17 +170,24 @@ GetNewTransactionId(bool isSubXact)
*
* Extend pg_subtrans and pg_commit_ts too.
*/
- ExtendCLOG(xid);
- ExtendCommitTs(xid);
- ExtendSUBTRANS(xid);
-
- /*
- * Now advance the nextXid counter. This must not happen until after we
- * have successfully completed ExtendCLOG() --- if that routine fails, we
- * want the next incoming transaction to try it again. We cannot assign
- * more XIDs until there is CLOG space for them.
- */
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ {
+ int incr;
+ for (incr=0; incr <=JJ_xid; incr++)
+ {
+ xid = ShmemVariableCache->nextXid;
+ ExtendCLOG(xid);
+ ExtendCommitTs(xid);
+ ExtendSUBTRANS(xid);
+
+ /*
+ * Now advance the nextXid counter. This must not happen until after we
+ * have successfully completed ExtendCLOG() --- if that routine fails, we
+ * want the next incoming transaction to try it again. We cannot assign
+ * more XIDs until there is CLOG space for them.
+ */
+ TransactionIdAdvance(ShmemVariableCache->nextXid);
+ }
+ }
/*
* We must store the new XID into the shared ProcArray before releasing
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a185749..5b59e8c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -111,6 +111,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool synchronize_seqscans;
+extern int JJ_xid;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -2276,6 +2277,15 @@ static struct config_int ConfigureNamesInt[] =
},
{
+ {"JJ_xid", PGC_USERSET, WAL_SETTINGS,
+ gettext_noop("Skip this many xid every time we acquire one"),
+ NULL
+ },
+ &JJ_xid,
+ 0, 0, 1000000, NULL, NULL
+ },
+
+ {
{"commit_siblings", PGC_USERSET, WAL_SETTINGS,
gettext_noop("Sets the minimum concurrent open transactions before performing "
"commit_delay."),