Usage of epoch in txid_current
Hi,
Currently, txid_current and friends export a 64-bit format of
transaction id that is extended with an “epoch” counter so that it
will not wrap around during the life of an installation. The epoch
value it uses is based on the epoch that is maintained by checkpoint
(i.e., only a checkpoint increments it).
Now if epoch changes multiple times between two checkpoints
(practically the chances of this are bleak, but there is a theoretical
possibility), then won't the computation of xids go wrong?
Basically, it can give the same value of txid after wraparound if the
checkpoint doesn't occur between the two calls to txid_current.
Am I missing something which ensures that epoch gets incremented at or
after wraparound?
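
A minimal sketch of the failure mode being asked about, under a simplified model that is assumed here rather than quoted from the source: the last checkpoint remembers an (epoch, xid) pair, and a reader adds one to the epoch only when the current 32-bit xid is numerically below the checkpointed xid. The names CheckpointState, infer_txid64 and xid_after are hypothetical, and the reserved xids 0..2 are ignored for brevity.

```c
#include <stdint.h>

/* Simplified model: what the last checkpoint recorded. */
typedef struct
{
    uint32_t ckpt_epoch;
    uint32_t ckpt_xid;
} CheckpointState;

/* Build a 64-bit txid, inferring at most one wraparound since the checkpoint. */
uint64_t
infer_txid64(const CheckpointState *ckpt, uint32_t next_xid)
{
    uint32_t epoch = ckpt->ckpt_epoch;

    if (next_xid < ckpt->ckpt_xid)
        epoch++;                /* assumes exactly one wraparound happened */
    return ((uint64_t) epoch << 32) | next_xid;
}

/* The 32-bit counter value after `consumed` xids were used past the checkpoint. */
uint32_t
xid_after(const CheckpointState *ckpt, uint64_t consumed)
{
    return (uint32_t) (ckpt->ckpt_xid + consumed);
}
```

In this model, a call made 5 xids after the checkpoint and a call made 2^32 + 5 xids after it produce the same 64-bit value, which is the duplicate described above.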
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hi!
On Tue, Dec 5, 2017 at 6:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Currently, txid_current and friends export a 64-bit format of
transaction id that is extended with an “epoch” counter so that it
will not wrap around during the life of an installation. The epoch
value it uses is based on the epoch that is maintained by checkpoint
(aka only checkpoint increments it).
Now if epoch changes multiple times between two checkpoints
(practically the chances of this are bleak, but there is a theoretical
possibility), then won't the computation of xids will go wrong?
Basically, it can give the same value of txid after wraparound if the
checkpoint doesn't occur between the two calls to txid_current.
AFAICS, yes, if the epoch changes multiple times between two checkpoints, then
the computation will go wrong. And it doesn't look like a purely theoretical
possibility to me, because I think I know of a couple of instances on the edge
of this...
Am I missing something which ensures that epoch gets incremented at or
after wraparound?
I've checked the code, and it doesn't look to me like there is anything
like this.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Tue, Dec 5, 2017 at 2:49 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Tue, Dec 5, 2017 at 6:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Currently, txid_current and friends export a 64-bit format of
transaction id that is extended with an “epoch” counter so that it
will not wrap around during the life of an installation. The epoch
value it uses is based on the epoch that is maintained by checkpoint
(aka only checkpoint increments it).
Now if epoch changes multiple times between two checkpoints
(practically the chances of this are bleak, but there is a theoretical
possibility), then won't the computation of xids will go wrong?
Basically, it can give the same value of txid after wraparound if the
checkpoint doesn't occur between the two calls to txid_current.
AFAICS, yes, if epoch changes multiple times between two checkpoints, then
computation will go wrong. And it doesn't look like purely theoretical
possibility for me, because I think I know couple of instances of the edge
of this...
Okay, it is quite strange that we haven't discovered this problem till
now. I think we should do something to fix it. One idea is that we
track epoch change in shared memory (probably in the same data
structure (VariableCacheData) where we track nextXid). We need to
increment it when the xid wraps around during xid allocation (in
GetNewTransactionId). Also, we need to make it persistent, which
means we need to log it in the checkpoint xlog record and we need to write
a separate xlog record for the epoch change.
Thoughts?
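
The idea of bumping an epoch at allocation time can be sketched as follows. This is only an illustration of the proposal, not code from PostgreSQL: XidCounter, its epoch field and alloc_xid are hypothetical names, locking and WAL logging are left out, and the only fact borrowed from the real system is that xids 0..2 are reserved, so the counter restarts at 3 after a wraparound.

```c
#include <stdint.h>

#define FIRST_NORMAL_XID 3u     /* xids 0..2 are reserved */

typedef struct
{
    uint32_t next_xid;
    uint32_t epoch;             /* hypothetical field for this sketch */
} XidCounter;

/* Hand out one xid and bump the epoch when the 32-bit counter wraps. */
uint32_t
alloc_xid(XidCounter *c)
{
    uint32_t xid = c->next_xid;

    c->next_xid++;
    if (c->next_xid < FIRST_NORMAL_XID)
    {
        c->next_xid = FIRST_NORMAL_XID; /* skip the reserved xids */
        c->epoch++;                     /* wraparound observed right here */
    }
    return xid;
}
```

Because the increment happens at the moment of wraparound, the epoch no longer depends on how many checkpoints occurred in between.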
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2017-12-05 16:21:27 +0530, Amit Kapila wrote:
On Tue, Dec 5, 2017 at 2:49 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Tue, Dec 5, 2017 at 6:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Currently, txid_current and friends export a 64-bit format of
transaction id that is extended with an “epoch” counter so that it
will not wrap around during the life of an installation. The epoch
value it uses is based on the epoch that is maintained by checkpoint
(aka only checkpoint increments it).
Now if epoch changes multiple times between two checkpoints
(practically the chances of this are bleak, but there is a theoretical
possibility), then won't the computation of xids will go wrong?
Basically, it can give the same value of txid after wraparound if the
checkpoint doesn't occur between the two calls to txid_current.
AFAICS, yes, if epoch changes multiple times between two checkpoints, then
computation will go wrong. And it doesn't look like purely theoretical
possibility for me, because I think I know couple of instances of the edge
of this...
I think it's not terribly likely in principle, due to the required WAL
size. You need at least a commit record for each of 4 billion
transactions. Each commit record is at least 24 bytes long, and in a
non-artificial scenario you additionally would have a few hundred bytes
of actual content of WAL. So we're talking about a distance of at least
0.5-2TB within a single checkpoint here. Not impossible, but not likely
either.
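
The estimate above can be reproduced with a few lines of arithmetic. The function name is hypothetical, and the per-transaction figures of 128 and 512 bytes are assumptions chosen here because they reproduce the quoted 0.5-2TB range; the 24-byte commit record alone gives a floor of roughly 103 GB.

```c
#include <stdint.h>

/* WAL bytes generated by 2^32 transactions at a given WAL cost per transaction. */
uint64_t
wal_bytes_per_wraparound(uint64_t bytes_per_xact)
{
    return (UINT64_C(1) << 32) * bytes_per_xact;
}
```

With 24 bytes per transaction this returns 103,079,215,104 bytes; with 128 bytes it returns 2^39 (about 0.55 TB) and with 512 bytes 2^41 (about 2.2 TB).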
Okay, it is quite strange that we haven't discovered this problem till
now. I think we should do something to fix it. One idea is that we
track epoch change in shared memory (probably in the same data
structure (VariableCacheData) where we track nextXid). We need to
increment it when the xid wraparound during xid allocation (in
GetNewTransactionId). Also, we need to make it persistent as which
means we need to log it in checkpoint xlog record and we need to write
a separate xlog record for the epoch change.
I think it makes a fair bit of sense to not do the current crufty
tracking of xid epochs. I don't really know how we got there, but it doesn't
make terribly much sense. Don't think we need additional WAL logging
though - we should be able to piggyback this onto the already existing
clog logging.
I kinda wonder if we shouldn't just track nextXid as a 64bit integer
internally, instead of bothering with tracking the epoch
separately. Then we can "just" truncate it in the cases where it's
stored in space constrained places etc.
Greetings,
Andres Freund
Andres,
* Andres Freund (andres@anarazel.de) wrote:
On 2017-12-05 16:21:27 +0530, Amit Kapila wrote:
On Tue, Dec 5, 2017 at 2:49 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Tue, Dec 5, 2017 at 6:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Currently, txid_current and friends export a 64-bit format of
transaction id that is extended with an “epoch” counter so that it
will not wrap around during the life of an installation. The epoch
value it uses is based on the epoch that is maintained by checkpoint
(aka only checkpoint increments it).
Now if epoch changes multiple times between two checkpoints
(practically the chances of this are bleak, but there is a theoretical
possibility), then won't the computation of xids will go wrong?
Basically, it can give the same value of txid after wraparound if the
checkpoint doesn't occur between the two calls to txid_current.
AFAICS, yes, if epoch changes multiple times between two checkpoints, then
computation will go wrong. And it doesn't look like purely theoretical
possibility for me, because I think I know couple of instances of the edge
of this...
I think it's not terribly likely principle, due to the required WAL
size. You need at least a commit record for each of 4 billion
transactions. Each commit record is at least 24bytes long, and in a
non-artificial scenario you additionally would have a few hundred bytes
of actual content of WAL. So we're talking about a distance of at least
0.5-2TB within a single checkpoint here. Not impossible, but not likely
either.
At the bottom end, with a 30-minute checkpoint, that's about 300MB/s.
Certainly quite a bit and we might have trouble getting there for other
reasons, but definitely something that can be accomplished with even a
single SSD these days.
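
The 300MB/s figure follows from dividing the low end of the WAL estimate by the checkpoint interval. A small sketch, with a hypothetical function name and decimal units (1 TB = 10^12 bytes) assumed:

```c
/* Sustained WAL rate in bytes per second needed to emit wal_bytes
 * within a single checkpoint interval. */
double
required_wal_rate(double wal_bytes, double checkpoint_seconds)
{
    return wal_bytes / checkpoint_seconds;
}
```

For 0.5 TB and a 30-minute interval, required_wal_rate(0.5e12, 1800.0) comes to roughly 278 million bytes per second, which rounds to the figure quoted above.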
Okay, it is quite strange that we haven't discovered this problem till
now. I think we should do something to fix it. One idea is that we
track epoch change in shared memory (probably in the same data
structure (VariableCacheData) where we track nextXid). We need to
increment it when the xid wraparound during xid allocation (in
GetNewTransactionId). Also, we need to make it persistent as which
means we need to log it in checkpoint xlog record and we need to write
a separate xlog record for the epoch change.
I think it makes a fair bit of sense to not do the current crufty
tracking of xid epochs. I don't really how we got there, but it doesn't
make terribly much sense. Don't think we need additional WAL logging
though - we should be able to piggyback this onto the already existing
clog logging.
Don't you mean xact logging? ;)
I kinda wonder if we shouldn't just track nextXid as a 64bit integer
internally, instead of bothering with tracking the epoch
separately. Then we can "just" truncate it in the cases where it's
stored in space constrained places etc.
This sounds reasonable to me, at least, but I've not been in these
depths much.
Thanks!
Stephen
On December 5, 2017 10:01:43 AM PST, Stephen Frost <sfrost@snowman.net> wrote:
Andres,
* Andres Freund (andres@anarazel.de) wrote:
I think it makes a fair bit of sense to not do the current crufty
tracking of xid epochs. I don't really how we got there, but it doesn't
make terribly much sense. Don't think we need additional WAL logging
though - we should be able to piggyback this onto the already existing
clog logging.
Don't you mean xact logging? ;)
No. We log a WAL record at clog boundaries. Wraparounds have to be at one. We could just include the 64 bit xid there and would have reliable tracking.
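
Why a wraparound has to coincide with a clog page boundary can be checked with the clog layout constants. The values below are given as recalled from clog.c with the default 8 kB block size and should be treated as assumptions; the two helper functions are hypothetical.

```c
#include <stdint.h>

#define BLCKSZ              8192    /* default block size, assumed here */
#define CLOG_XACTS_PER_BYTE 4       /* two status bits per transaction */
#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)

/* Clog page that holds the status bits of a given xid. */
uint32_t
xid_to_clog_page(uint32_t xid)
{
    return xid / CLOG_XACTS_PER_PAGE;
}

/* Nonzero if the 2^32 wrap point falls exactly on a page boundary. */
int
wrap_is_on_clog_page_boundary(void)
{
    return ((UINT64_C(1) << 32) % CLOG_XACTS_PER_PAGE) == 0;
}
```

Since 2^32 is a multiple of 32768, the last xid before the wrap sits on the final page (131071) and the first normal xid after it sits on page 0, so crossing the wrap always means starting a new clog page.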
Andres
Stephen Frost <sfrost@snowman.net> writes:
* Andres Freund (andres@anarazel.de) wrote:
I kinda wonder if we shouldn't just track nextXid as a 64bit integer
internally, instead of bothering with tracking the epoch
separately. Then we can "just" truncate it in the cases where it's
stored in space constrained places etc.
This sounds reasonable to me, at least, but I've not been in these
depths much.
+1 ... I think the reason it's like that is simply that nobody's revisited
the XID generator since we decided to require 64-bit integer support.
We'd need this for support of true 64-bit XIDs, too, though I'm unsure
whether that project is going anywhere anytime soon. In any case it
seems like a separable subset of that work.
regards, tom lane
On Tue, Dec 5, 2017 at 11:15 PM, Andres Freund <andres@anarazel.de> wrote:
On 2017-12-05 16:21:27 +0530, Amit Kapila wrote:
Okay, it is quite strange that we haven't discovered this problem till
now. I think we should do something to fix it. One idea is that we
track epoch change in shared memory (probably in the same data
structure (VariableCacheData) where we track nextXid). We need to
increment it when the xid wraparound during xid allocation (in
GetNewTransactionId). Also, we need to make it persistent as which
means we need to log it in checkpoint xlog record and we need to write
a separate xlog record for the epoch change.
I think it makes a fair bit of sense to not do the current crufty
tracking of xid epochs. I don't really how we got there, but it doesn't
make terribly much sense. Don't think we need additional WAL logging
though - we should be able to piggyback this onto the already existing
clog logging.
I kinda wonder if we shouldn't just track nextXid as a 64bit integer
internally, instead of bothering with tracking the epoch
separately. Then we can "just" truncate it in the cases where it's
stored in space constrained places etc.
We are using ShmemVariableCache->nextXid at many places so always
converting/truncating it to 32-bit number before using seems slightly
awkward, so we can think of using a separate nextBigXid 64bit number
as well. Either way, it is not clear to me how we will keep it
updated after recovery. Right now, the mechanism is quite simple, at
the beginning of a recovery we take the value of nextXid from
checkpoint record and then if any xlog record indicates xid that
follows nextXid, we advance it. Here, the point to note is that we
take the xid from the WAL record (which means that it assumes xids are
non-contiguous or some xids are consumed without being logged) and
increment it. Unless we plan to change something in that logic (like
storing 64-bit xids in WAL records), it is not clear to me how to make
it work. OTOH, recovering value of epoch which increments only at
wraparound seems fairly straightforward as described in my previous
email.
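
The recovery mechanism described above can be sketched in isolation. This is a simplified stand-in, not the PostgreSQL code: the function names are hypothetical, locking is omitted, and the comparison is the usual modulo-2^32 one, which is only meaningful for xids less than 2^31 apart.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIRST_NORMAL_XID 3u     /* xids 0..2 are reserved */

/* Modulo-2^32 comparison: true if a is logically at or after b. */
bool
xid_follows_or_equals(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) >= 0;
}

/* Recovery-style update: move next_xid past any xid seen in a WAL record. */
void
advance_next_xid_past(uint32_t *next_xid, uint32_t seen_xid)
{
    if (xid_follows_or_equals(seen_xid, *next_xid))
    {
        *next_xid = seen_xid + 1;
        if (*next_xid < FIRST_NORMAL_XID)
            *next_xid = FIRST_NORMAL_XID;   /* skip the reserved xids */
    }
}
```

Note that when next_xid jumps from a value near 2^32 to a small one, nothing in this 32-bit scheme records that a wraparound took place, which is the gap a 64-bit counter would have to fill.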
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hi,
On 2017-12-06 17:39:09 +0530, Amit Kapila wrote:
We are using ShmemVariableCache->nextXid at many places so always
converting/truncating it to 32-bit number before using seems slightly
awkward, so we can think of using a separate nextBigXid 64bit number
as well.
-many
It's not actually that many places. And a lot of them should be
updated anyway, to be epoch aware. Let's not add this kind of crumminess
to avoid a few truncating casts here and there.
Either way, it is not clear to me how we will keep it
updated after recovery. Right now, the mechanism is quite simple, at
the beginning of a recovery we take the value of nextXid from
checkpoint record and then if any xlog record indicates xid that
follows nextXid, we advance it. Here, the point to note is that we
take the xid from the WAL record (which means that it assumes xids are
non-contiguous or some xids are consumed without being logged) and
increment it. Unless we plan to change something in that logic (like
storing 64-bit xids in WAL records), it is not clear to me how to make
it work. OTOH, recovering value of epoch which increments only at
wraparound seems fairly straightforward as described in my previous
email.
I think it should be fairly simple if we simply add the 64bit xid to the
existing clog extension WAL records.
Greetings,
Andres Freund
On Wed, Dec 6, 2017 at 11:26 PM, Andres Freund <andres@anarazel.de> wrote:
Either way, it is not clear to me how we will keep it
updated after recovery. Right now, the mechanism is quite simple, at
the beginning of a recovery we take the value of nextXid from
checkpoint record and then if any xlog record indicates xid that
follows nextXid, we advance it. Here, the point to note is that we
take the xid from the WAL record (which means that it assumes xids are
non-contiguous or some xids are consumed without being logged) and
increment it. Unless we plan to change something in that logic (like
storing 64-bit xids in WAL records), it is not clear to me how to make
it work. OTOH, recovering value of epoch which increments only at
wraparound seems fairly straightforward as described in my previous
email.
I think it should be fairly simple if simply add the 64bit xid to the
existing clog extension WAL records.
IIUC, you mean to say that we should log the 64bit xid value in
CLOG_ZEROPAGE record while extending clog and that too we can do only
at wraparound. Now, maybe doing it every time also doesn't hurt, but
I think doing it at wraparound should be sufficient.
Just to be clear, I am not planning to pursue writing a patch for this
at the moment. So, if anybody else is interested or if Andres wants
to write it, I can help in the review.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Dec 8, 2017 at 3:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 6, 2017 at 11:26 PM, Andres Freund <andres@anarazel.de> wrote:
Either way, it is not clear to me how we will keep it
updated after recovery. Right now, the mechanism is quite simple, at
the beginning of a recovery we take the value of nextXid from
checkpoint record and then if any xlog record indicates xid that
follows nextXid, we advance it. Here, the point to note is that we
take the xid from the WAL record (which means that it assumes xids are
non-contiguous or some xids are consumed without being logged) and
increment it. Unless we plan to change something in that logic (like
storing 64-bit xids in WAL records), it is not clear to me how to make
it work. OTOH, recovering value of epoch which increments only at
wraparound seems fairly straightforward as described in my previous
email.
I think it should be fairly simple if simply add the 64bit xid to the
existing clog extension WAL records.
IIUC, you mean to say that we should log the 64bit xid value in
CLOG_ZEROPAGE record while extending clog and that too we can do only
at wraparound. Now, maybe doing it every time also doesn't hurt, but
I think doing it at wraparound should be sufficient.
Can you please elaborate on why clog redo in particular would need to
use 64 bit xids? Is the 64 bit xid not derivable from the 32 bit xid
in the WAL + the current value of a new 64 bit next xid?
Just to be clear, I am not planning to pursue writing a patch for this
at the moment. So, if anybody else is interested or if Andres wants
to write it, I can help in the review.
I played around with this idea yesterday. Experiment-grade patch
attached. Approach:
1. Introduce a new type BigTransactionId (better names welcome).
2. Change ShmemVariableCache->nextXid to ShmemVariableCache->nextBigXid.
3. Change checkpoints to use nextBigXid too.
4. Change ReadNewTransactionId() to ReadNextBigTransactionId().
5. Remove GetNextXidAndEpoch() as it's now redundant.
6. Everywhere that was reading ShmemVariableCache->nextXid or calling
ReadNewTransactionId() but actually needs an xid now uses an explicit
conversion macro XidFromBigTransactionId().
7. Everywhere that was writing ShmemVariableCache->nextXid but only
has an xid now goes through a new function
AdvanceNextBigTransactionIdPast(xid), to handle recovery (since WAL
records have xids in them as mentioned by Amit).
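
For readers following along without the transam.h hunk, the conversion helpers named in the list could look roughly like this. The macro names come from the patch, but the bodies shown here are a guess at a plausible definition, and big_xid_advance is a hypothetical stand-in for BigTransactionIdAdvance.

```c
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t BigTransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

/* Assumed layout: epoch in the high 32 bits, xid in the low 32 bits. */
#define XidFromBigTransactionId(x)    ((TransactionId) (x))
#define EpochFromBigTransactionId(x)  ((uint32_t) ((x) >> 32))
#define MakeBigTransactionId(epoch, xid) \
    (((BigTransactionId) (epoch) << 32) | (TransactionId) (xid))

/* Advance by one, skipping the reserved xids 0..2 right after a wraparound. */
void
big_xid_advance(BigTransactionId *x)
{
    (*x)++;
    while (XidFromBigTransactionId(*x) < FirstNormalTransactionId)
        (*x)++;
}
```

With this layout the epoch increments as a side effect of the 64-bit addition carrying into the high word, so no separate bookkeeping is needed.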
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the inclusion of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent). Otherwise there is a class of bug that is hidden for
the first 2^32 transactions.
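
One C idiom sometimes used to get compile-time separation between two integer-like types is to wrap the wider value in a single-member struct. This is not what the patch does; it is shown only as a possible alternative, and every name below is hypothetical.

```c
#include <stdint.h>

/* A struct cannot be assigned to or from a plain integer, so every
 * conversion has to go through one of the functions below. */
typedef struct
{
    uint64_t value;
} WrappedBigXid;

WrappedBigXid
wrapped_big_xid_make(uint32_t epoch, uint32_t xid)
{
    WrappedBigXid r = { ((uint64_t) epoch << 32) | xid };

    return r;
}

uint32_t
wrapped_big_xid_to_xid(WrappedBigXid b)
{
    return (uint32_t) b.value;
}

uint32_t
wrapped_big_xid_epoch(WrappedBigXid b)
{
    return (uint32_t) (b.value >> 32);
}

/* uint32_t bad = wrapped_big_xid_make(1, 5);   would not compile */
```

The trade-off is that comparisons and arithmetic also need accessor functions or direct use of the member, which adds some noise at call sites.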
The logic in CreateCheckPoint() that assumed that there could be only
one wraparound per checkpoint is gone (the problem reported in this
thread). Note that AdvanceNextBigTransactionIdPast() contains logic
that infers an epoch, which is in some ways similar to what the
removed code was doing, but in this case I think all callers must have
an xid from the same or next epoch.
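
The epoch inference mentioned here can be tried out in a standalone form. The sketch below mirrors the shape of AdvanceNextBigTransactionIdPast() from the attached patch but is a simplification under stated assumptions: no locking, plain uint64_t instead of the patch's type, a hypothetical function name, and an explicit skip over the reserved xids 0..2.

```c
#include <stdint.h>

/* Advance a 64-bit next-xid past a 32-bit xid taken from a WAL record.
 * The xid is assumed to belong to the current epoch or the next one. */
void
advance_next_big_xid_past(uint64_t *next_big_xid, uint32_t xid)
{
    uint32_t current_xid = (uint32_t) *next_big_xid;
    uint32_t epoch = (uint32_t) (*next_big_xid >> 32);

    if ((int32_t) (xid - current_xid) < 0)
        return;                 /* xid is logically behind: nothing to do */
    if (xid < current_xid)
        epoch++;                /* numerically smaller but later: wrapped */

    *next_big_xid = (((uint64_t) epoch << 32) | xid) + 1;
    while ((uint32_t) *next_big_xid < 3)
        (*next_big_xid)++;      /* skip the reserved xids */
}
```

Starting from epoch 1 with xid 0xFFFFFFF0 and replaying a record with xid 5 yields epoch 2 with next xid 6; a later record with xid 4 leaves the counter unchanged.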
I'm probably missing a few details... this is only a first swing at
the problem. It does pass check-world and various xid wraparound
tests I came up with. Clearly big xids could spread to more places
than I show here. Do you see problems with this approach or have
better ideas?
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
0001-Track-the-next-xid-using-64-bits.patch
From 1f9891ba666e50849570e5de0908200747054909 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 9 Jul 2018 21:54:03 +1200
Subject: [PATCH] Track the next xid using 64 bits.
Instead of tracking the epoch independently, start using a 64 bit transaction
ID in several places. This fixes an unlikely bug where an epoch increment could
be missed if you managed to consume more than 2^32 transactions between
checkpoints.
Work in progress!
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/nbtree/nbtpage.c | 4 +-
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 4 +-
src/backend/access/transam/commit_ts.c | 5 +-
src/backend/access/transam/multixact.c | 9 +--
src/backend/access/transam/subtrans.c | 2 +-
src/backend/access/transam/twophase.c | 17 +---
src/backend/access/transam/varsup.c | 53 +++++++++---
src/backend/access/transam/xact.c | 21 +----
src/backend/access/transam/xlog.c | 103 ++++++------------------
src/backend/commands/vacuum.c | 10 +--
src/backend/postmaster/autovacuum.c | 4 +-
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 26 +-----
src/backend/storage/ipc/procarray.c | 21 ++---
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 19 +++--
src/include/access/transam.h | 20 ++++-
src/include/access/xlog.h | 1 -
src/include/c.h | 2 +
src/include/catalog/pg_control.h | 3 +-
src/include/storage/standby.h | 2 +-
25 files changed, 154 insertions(+), 206 deletions(-)
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 2e959da5f85..89a85b04430 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -1946,7 +1946,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
* Mark the page itself deleted. It can be recycled when all current
* transactions are gone. Storing GetTopTransactionId() would work, but
* we're in VACUUM and would not otherwise have an XID. Having already
- * updated links to the target, ReadNewTransactionId() suffices as an
+ * updated links to the target, ReadNextBigTransactionId() suffices as an
* upper bound. Any scan having retained a now-stale link is advertising
* in its PGXACT an xmin less than or equal to the value we read here. It
* will continue to do so, holding back RecentGlobalXmin, for the duration
@@ -1956,7 +1956,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
opaque->btpo_flags &= ~BTP_HALF_DEAD;
opaque->btpo_flags |= BTP_DELETED;
- opaque->btpo.xact = ReadNewTransactionId();
+ opaque->btpo.xact = XidFromBigTransactionId(ReadNextBigTransactionId());
/* And update the metapage, if needed */
if (BufferIsValid(metabuf))
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09e..2c3ea10af94 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromBigTransactionId(checkpoint->nextBigXid),
+ XidFromBigTransactionId(checkpoint->nextBigXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 8b7ff5b0c24..fa2dfb50b7a 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -754,7 +754,7 @@ ZeroCLOGPage(int pageno, bool writeXlog)
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 73fac1ba81d..5637ee4f162 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -644,7 +644,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
pageno = TransactionIdToCTsPage(xid);
/*
@@ -671,7 +671,8 @@ ActivateCommitTs(void)
if (ShmemVariableCache->oldestCommitTsXid == InvalidTransactionId)
{
ShmemVariableCache->oldestCommitTsXid =
- ShmemVariableCache->newestCommitTsXid = ReadNewTransactionId();
+ ShmemVariableCache->newestCommitTsXid =
+ XidFromBigTransactionId(ReadNextBigTransactionId());
}
LWLockRelease(CommitTsLock);
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index a9a51055e96..f008fdf11d4 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3283,14 +3283,7 @@ multixact_redo(XLogReaderState *record)
* process doesn't need to hold a lock while checking this. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextBigTransactionIdPast(max_xid, true);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..4f225d56c86 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -261,7 +261,7 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ endPage = TransactionIdToPage(XidFromBigTransactionId(ShmemVariableCache->nextBigXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e8d4e37fe30..3cdc5dfddeb 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1879,7 +1879,7 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ TransactionId origNextXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2104,7 +2104,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ TransactionId origNextXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2210,23 +2210,14 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
/* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
+ if (setNextXid)
{
/*
* We don't expect anyone else to modify nextXid, hence we don't
* need to hold a lock while examining it. We still acquire the
* lock to modify it, though, so we recheck.
*/
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
+ AdvanceNextBigTransactionIdPast(subxid, true);
}
if (setParent)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..c84edf1ccde 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
+ * Now advance the nextBigXid counter. This must not happen until after we
* have successfully completed ExtendCLOG() --- if that routine fails, we
* want the next incoming transaction to try it again. We cannot assign
* more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ BigTransactionIdAdvance(ShmemVariableCache->nextBigXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -244,18 +244,47 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextBigXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+BigTransactionId
+ReadNextBigTransactionId(void)
{
- TransactionId xid;
+ BigTransactionId bigXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ bigXid = ShmemVariableCache->nextBigXid;
LWLockRelease(XidGenLock);
- return xid;
+ return bigXid;
+}
+
+/*
+ * Advance nextBigXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextBigXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextBigTransactionIdPast(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromBigTransactionId(ShmemVariableCache->nextBigXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromBigTransactionId(ShmemVariableCache->nextBigXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextBigXid = MakeBigTransactionId(epoch, xid);
+ BigTransactionIdAdvance(ShmemVariableCache->nextBigXid);
+ }
+ LWLockRelease(XidGenLock);
}
/*
@@ -359,7 +388,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -435,7 +464,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1da1f13ef33..dcc3b47c445 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -468,7 +468,7 @@ GetStableLatestTransactionId(void)
lxid = MyProc->lxid;
stablexid = GetTopTransactionIdIfAny();
if (!TransactionIdIsValid(stablexid))
- stablexid = ReadNewTransactionId();
+ stablexid = XidFromBigTransactionId(ReadNextBigTransactionId());
}
Assert(TransactionIdIsValid(stablexid));
@@ -5529,14 +5529,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
* hold a lock while checking this. We still acquire the lock to modify
* it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextBigTransactionIdPast(max_xid, true);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5688,15 +5681,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextBigTransactionIdPast(max_xid, true);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20b23cb3609..0237d877414 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -579,8 +579,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ BigTransactionId ckptBigXid; /* nextXID & epoch of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5102,8 +5101,7 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextBigXid = MakeBigTransactionId(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5116,7 +5114,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextBigXid = checkPoint.nextBigXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6700,7 +6698,8 @@ StartupXLOG(void)
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
(errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ EpochFromBigTransactionId(checkPoint.nextBigXid),
+ XidFromBigTransactionId(checkPoint.nextBigXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6715,12 +6714,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromBigTransactionId(checkPoint.nextBigXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextBigXid = checkPoint.nextBigXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6729,8 +6728,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptBigXid = checkPoint.nextBigXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -7000,7 +6998,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromBigTransactionId(ShmemVariableCache->nextBigXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -7030,9 +7028,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromBigTransactionId(checkPoint.nextBigXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromBigTransactionId(checkPoint.nextBigXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7208,14 +7206,7 @@ StartupXLOG(void)
* don't need to hold a lock while examining it. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextBigTransactionIdPast(record->xl_xid, true);
/*
* Before replaying this record, check if this record causes
@@ -7782,7 +7773,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8375,41 +8366,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8819,7 +8775,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextBigXid = ShmemVariableCache->nextBigXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8829,11 +8785,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8977,8 +8928,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptBigXid = checkPoint.nextBigXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9733,7 +9683,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextBigXid = checkPoint.nextBigXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9787,9 +9737,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromBigTransactionId(checkPoint.nextBigXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromBigTransactionId(checkPoint.nextBigXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9801,13 +9751,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextBigXid = checkPoint.nextBigXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptBigXid = checkPoint.nextBigXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9828,9 +9776,8 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (ShmemVariableCache->nextBigXid < checkPoint.nextBigXid)
+ ShmemVariableCache->nextBigXid = checkPoint.nextBigXid;
LWLockRelease(XidGenLock);
/*
@@ -9860,13 +9807,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextBigXid = checkPoint.nextBigXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptBigXid = checkPoint.nextBigXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d90cb9a9022..fd3015af272 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -647,7 +647,7 @@ vacuum_set_xid_limits(Relation rel,
* autovacuum_freeze_max_age / 2 XIDs old), complain and force a minimum
* freeze age of zero.
*/
- safeLimit = ReadNewTransactionId() - autovacuum_freeze_max_age;
+ safeLimit = XidFromBigTransactionId(ReadNextBigTransactionId()) - autovacuum_freeze_max_age;
if (!TransactionIdIsNormal(safeLimit))
safeLimit = FirstNormalTransactionId;
@@ -725,7 +725,7 @@ vacuum_set_xid_limits(Relation rel,
* Compute XID limit causing a full-table vacuum, being careful not to
* generate a "permanent" XID.
*/
- limit = ReadNewTransactionId() - freezetable;
+ limit = XidFromBigTransactionId(ReadNextBigTransactionId()) - freezetable;
if (!TransactionIdIsNormal(limit))
limit = FirstNormalTransactionId;
@@ -944,7 +944,7 @@ vac_update_relstats(Relation relation,
if (TransactionIdIsNormal(frozenxid) &&
pgcform->relfrozenxid != frozenxid &&
(TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid) ||
- TransactionIdPrecedes(ReadNewTransactionId(),
+ TransactionIdPrecedes(XidFromBigTransactionId(ReadNextBigTransactionId()),
pgcform->relfrozenxid)))
{
pgcform->relfrozenxid = frozenxid;
@@ -1021,7 +1021,7 @@ vac_update_datfrozenxid(void)
* validly see during the scan. These are conservative values, but it's
* not really worth trying to be more exact.
*/
- lastSaneFrozenXid = ReadNewTransactionId();
+ lastSaneFrozenXid = XidFromBigTransactionId(ReadNextBigTransactionId());
lastSaneMinMulti = ReadNextMultiXactId();
/*
@@ -1157,7 +1157,7 @@ vac_truncate_clog(TransactionId frozenXID,
TransactionId lastSaneFrozenXid,
MultiXactId lastSaneMinMulti)
{
- TransactionId nextXID = ReadNewTransactionId();
+ TransactionId nextXID = XidFromBigTransactionId(ReadNextBigTransactionId());
Relation relation;
HeapScanDesc scan;
HeapTuple tuple;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 02e6d8131e0..881e4327013 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1172,7 +1172,7 @@ do_start_worker(void)
* pass without forcing a vacuum. (This limit can be tightened for
* particular tables, but not loosened.)
*/
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromBigTransactionId(ReadNextBigTransactionId());
xidForceLimit = recentXid - autovacuum_freeze_max_age;
/* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */
/* this can cause the limit to go backwards by 3, but that's OK */
@@ -1703,7 +1703,7 @@ AutoVacWorkerMain(int argc, char *argv[])
pg_usleep(PostAuthDelay * 1000000L);
/* And do an appropriate amount of work */
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromBigTransactionId(ReadNextBigTransactionId());
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 987bb84683c..09a869a3086 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1194,6 +1194,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ BigTransactionId nextBigXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1272,7 +1273,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextBigXid = ReadNextBigTransactionId();
+ nextXid = XidFromBigTransactionId(nextBigXid);
+ xmin_epoch = EpochFromBigTransactionId(nextBigXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e47ddca6bca..edf9d1226bf 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1888,35 +1888,17 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
* Check that the provided xmin/epoch are sane, that is, not in the future
* and not so far back as to be already wrapped around.
*
- * Epoch of nextXid should be same as standby, or if the counter has
- * wrapped, then one greater than standby.
- *
* This check doesn't care about whether clog exists for these xids
* at all.
*/
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
- TransactionId nextXid;
- uint32 nextEpoch;
-
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
-
- if (xid <= nextXid)
- {
- if (epoch != nextEpoch)
- return false;
- }
- else
- {
- if (epoch + 1 != nextEpoch)
- return false;
- }
-
- if (!TransactionIdPrecedesOrEquals(xid, nextXid))
- return false; /* epoch OK, but it's wrapped around */
+ BigTransactionId nextBigXid = ReadNextBigTransactionId();
+ BigTransactionId bigXid = MakeBigTransactionId(epoch, xid);
- return true;
+ return bigXid <= nextBigXid &&
+ nextBigXid - bigXid < (BigTransactionId) 1 << 32;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497d81a..b97f15f2c28 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -883,15 +883,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
* it, though.
*/
nextXid = latestObservedXid;
+ AdvanceNextBigTransactionIdPast(nextXid, true);
TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromBigTransactionId(ShmemVariableCache->nextBigXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -1979,7 +1974,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
/*
* Spin over procArray collecting all xids
@@ -2068,7 +2063,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2113,7 +2108,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
LWLockRelease(XidGenLock);
/*
@@ -2181,7 +2176,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3237,10 +3232,8 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
/* ShmemVariableCache->nextXid must be beyond any observed xid */
next_expected_xid = latestObservedXid;
+ AdvanceNextBigTransactionIdPast(next_expected_xid, false);
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e8390311d03..caf4cb1b915 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3270,7 +3270,7 @@ ReleasePredicateLocks(bool isCommit)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromBigTransactionId(ShmemVariableCache->nextBigXid);
/*
* If it's not a commit it's a rollback, and we can clear our locks
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 7974c0bd3d8..ac355684587 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ BigTransactionId bigXid = ReadNextBigTransactionId();
+
+ state->last_xid = XidFromBigTransactionId(bigXid);
+ state->epoch = EpochFromBigTransactionId(bigXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ BigTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextBigTransactionId();
+ now_epoch_last_xid = XidFromBigTransactionId(now_xid);
+ now_epoch = EpochFromBigTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > now_xid)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 3fc8b6a8a84..461344f9dcc 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromBigTransactionId(ControlFile->checkPointCopy.nextBigXid),
+ XidFromBigTransactionId(ControlFile->checkPointCopy.nextBigXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..f6ecfb850c9 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromBigTransactionId(ControlFile->checkPointCopy.nextBigXid),
+ XidFromBigTransactionId(ControlFile->checkPointCopy.nextBigXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 8cff5356925..485e04920ce 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextBigXid =
+ MakeBigTransactionId(set_xid_epoch,
+ XidFromBigTransactionId(ControlFile.checkPointCopy.nextBigXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextBigXid =
+ MakeBigTransactionId(EpochFromBigTransactionId(ControlFile.checkPointCopy.nextBigXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextBigXid = MakeBigTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +789,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromBigTransactionId(ControlFile.checkPointCopy.nextBigXid),
+ XidFromBigTransactionId(ControlFile.checkPointCopy.nextBigXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +882,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromBigTransactionId(ControlFile.checkPointCopy.nextBigXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +892,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromBigTransactionId(ControlFile.checkPointCopy.nextBigXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..af40efd079d 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,10 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromBigTransactionId(x) ((uint32) ((x) >> 32))
+#define XidFromBigTransactionId(x) ((uint32) (x))
+#define MakeBigTransactionId(epoch, xid)((((uint64)(epoch)) << 32 ) | (xid))
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +56,15 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a BigTransactionId lvalue, handling wraparound correctly */
+#define BigTransactionIdAdvance(dest) \
+ do { \
+ (dest)++; \
+ if (XidFromBigTransactionId(dest) < FirstNormalTransactionId) \
+ (dest) = MakeBigTransactionId(EpochFromBigTransactionId(dest), \
+ FirstNormalTransactionId); \
+ } while (0)
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -114,12 +127,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ BigTransactionId nextBigXid; /* next XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextBigXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -176,7 +189,8 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextBigTransactionIdPast(TransactionId xid, bool lock_free_check);
+extern BigTransactionId ReadNextBigTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d7755..3c9d3401df5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -276,7 +276,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/c.h b/src/include/c.h
index 1e50103095b..7f643799601 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -471,6 +471,8 @@ typedef double float8;
typedef Oid regproc;
typedef regproc RegProcedure;
+typedef uint64 BigTransactionId; /* epoch and xid as one value */
+
typedef uint32 TransactionId;
typedef uint32 LocalTransactionId;
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 773d9e6ebae..0d8e52c091e 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -39,8 +39,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ BigTransactionId nextBigXid; /* next free XID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..1c62202f4e9 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextBigXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
--
2.17.0
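The helper macros and the rewritten walsender check in the patch above come down to plain 64-bit arithmetic. The following is a minimal standalone sketch of that arithmetic, not part of the patch: it assumes <stdint.h> types in place of PostgreSQL's uint32/uint64, and the functions advance_big_xid and big_xid_in_recent_past are hypothetical stand-ins that take the "next" value as a parameter instead of reading ShmemVariableCache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs (assumption: <stdint.h> types). */
typedef uint32_t TransactionId;
typedef uint64_t BigTransactionId;	/* epoch in high 32 bits, xid in low 32 */

static_assert(sizeof(BigTransactionId) == 8, "epoch and xid must fit in 64 bits");

#define FirstNormalTransactionId ((TransactionId) 3)

#define EpochFromBigTransactionId(x) ((uint32_t) ((x) >> 32))
#define XidFromBigTransactionId(x)   ((uint32_t) (x))
#define MakeBigTransactionId(epoch, xid) \
	((((uint64_t) (epoch)) << 32) | (xid))

/*
 * Hypothetical function form of BigTransactionIdAdvance: increment, and
 * skip the special xids (0..2) that follow a 32-bit wraparound.  The
 * epoch is carried by the ordinary 64-bit increment.
 */
static inline BigTransactionId
advance_big_xid(BigTransactionId v)
{
	v++;
	if (XidFromBigTransactionId(v) < FirstNormalTransactionId)
		v = MakeBigTransactionId(EpochFromBigTransactionId(v),
								 FirstNormalTransactionId);
	return v;
}

/*
 * Hypothetical version of the walsender check: the xid/epoch pair must
 * not be in the future and must be less than 2^32 behind "next".
 */
static inline bool
big_xid_in_recent_past(BigTransactionId next, uint32_t epoch, TransactionId xid)
{
	BigTransactionId big = MakeBigTransactionId(epoch, xid);

	return big <= next && next - big < ((BigTransactionId) 1 << 32);
}
```

Because the epoch lives in the same integer as the xid, it advances exactly when the low 32 bits wrap, regardless of when checkpoints happen.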
On 2018-07-10 11:35:59 +1200, Thomas Munro wrote:
I played around with this idea yesterday. Experiment-grade patch
attached.
Cool!
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the inclusion of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent). Otherwise there is a class of bug that is hidden for
the first 2^32 transactions.
You could have BigTransactionId (maybe renamed to FullTransactionId?) be
a struct type. That'd prevent such issues. Most compilers these days
should be more than good enough to optimize passing around an 8-byte
struct by value...
Greetings,
Andres Freund
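The struct idea from the message above can be sketched as follows. This is only an illustration under assumptions: FullTransactionId is the name floated in the message, while the field name and the accessor functions (make_full_xid, epoch_from_full_xid, xid_from_full_xid, full_xid_precedes) are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;	/* stand-in for the PostgreSQL typedef */

/* Wrapping the 64-bit value in a struct removes implicit conversions. */
typedef struct FullTransactionId
{
	uint64_t	value;			/* epoch in high 32 bits, xid in low 32 */
} FullTransactionId;

static_assert(sizeof(FullTransactionId) == sizeof(uint64_t),
			  "the wrapper should add no padding");

static inline FullTransactionId
make_full_xid(uint32_t epoch, TransactionId xid)
{
	FullTransactionId result;

	result.value = ((uint64_t) epoch << 32) | xid;
	return result;
}

static inline uint32_t
epoch_from_full_xid(FullTransactionId f)
{
	return (uint32_t) (f.value >> 32);
}

static inline TransactionId
xid_from_full_xid(FullTransactionId f)
{
	return (TransactionId) f.value;
}

/* 64-bit values never wrap in practice, so plain comparison is enough. */
static inline bool
full_xid_precedes(FullTransactionId a, FullTransactionId b)
{
	return a.value < b.value;
}

/*
 * With the struct, mixing the two widths no longer compiles silently:
 *
 *     TransactionId t = make_full_xid(1, 42);     rejected by the compiler
 *     TransactionId t = xid_from_full_xid(...);   explicit, accepted
 */
```

With a bare uint64 typedef, the first assignment in the closing comment would compile and silently drop the epoch, which is the class of bug that stays hidden for the first 2^32 transactions.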
Andres Freund <andres@anarazel.de> writes:
On 2018-07-10 11:35:59 +1200, Thomas Munro wrote:
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the inclusion of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent). Otherwise there is a class of bug that is hidden for
the first 2^32 transactions.
You could have BigTransactionId (maybe renamed to FullTransactionId?) be
a struct type. That'd prevent such issues. Most compilers these days
should be more than good enough to optimize passing around an 8-byte
struct by value...
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
regards, tom lane
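The assert-build variant suggested above might look roughly like the sketch below. It is an assumption-laden illustration: USE_ASSERT_CHECKING is the symbol PostgreSQL defines for assert-enabled builds, but every macro name here (FullXidGetU64, U64GetFullXid, MakeFullXid, FullXidEpoch, FullXidXid) is hypothetical, and an unsigned 64-bit integer is used to match the patch rather than the int64 mentioned in the message.

```c
#include <assert.h>
#include <stdint.h>

#ifdef USE_ASSERT_CHECKING
/* Assert builds: a struct, so accidental mixing with uint32 xids fails to compile. */
typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

#define FullXidGetU64(f)	((f).value)
#define U64GetFullXid(v)	((FullTransactionId) { (v) })
#else
/* Production builds: a bare integer. */
typedef uint64_t FullTransactionId;

#define FullXidGetU64(f)	(f)
#define U64GetFullXid(v)	((FullTransactionId) (v))
#endif

static_assert(sizeof(FullTransactionId) == sizeof(uint64_t),
			  "both representations must have the same size");

/* Everything else is written once, in terms of the two macros above. */
#define FullXidEpoch(f)	((uint32_t) (FullXidGetU64(f) >> 32))
#define FullXidXid(f)	((uint32_t) FullXidGetU64(f))
#define MakeFullXid(epoch, xid) \
	U64GetFullXid((((uint64_t) (epoch)) << 32) | (uint32_t) (xid))
```

Callers only ever use the macros, so the same source compiles in both configurations; only the assert build catches an implicit narrowing.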
Hi,
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-10 11:35:59 +1200, Thomas Munro wrote:
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the inclusion of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent). Otherwise there is a class of bug that is hidden for
the first 2^32 transactions.
You could have BigTransactionId (maybe renamed to FullTransactionId?) be
a struct type. That'd prevent such issues. Most compilers these days
should be more than good enough to optimize passing around an 8-byte
struct by value...
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
That should be doable. But I'd like to check if it's necessary
first. Optimizing passing an 8 byte struct shouldn't be hard for
compilers these days - especially when most things dealing with them are
inline functions. If we find that it's not a problem on contemporary
compilers, it might be worthwhile to use a bit more type safety in other
places too.
Here's what gcc does at -O1:
#include <stdint.h>

typedef struct foo
{
    uint64_t id;
} foo;

extern foo take_foo_struct(foo f, int i);
extern uint64_t take_foo_int(uint64_t id, int i);

foo take_foo_struct(foo f, int i)
{
    f.id += i;
    return f;
}

uint64_t take_foo_int(uint64_t id, int i)
{
    id += i;
    return id;
}
results in:
.file "test.c"
.text
.globl take_foo_struct
.type take_foo_struct, @function
take_foo_struct:
.LFB0:
.cfi_startproc
movslq %esi, %rax
addq %rdi, %rax
ret
.cfi_endproc
.LFE0:
.size take_foo_struct, .-take_foo_struct
.globl take_foo_int
.type take_foo_int, @function
take_foo_int:
.LFB1:
.cfi_startproc
movslq %esi, %rax
addq %rdi, %rax
ret
.cfi_endproc
.LFE1:
.size take_foo_int, .-take_foo_int
.ident "GCC: (Debian 7.3.0-24) 7.3.0"
.section .note.GNU-stack,"",@progbits
IOW, exactly the same code generated. Note that the compiler does *not*
see the callsites in this case, i.e. this is platform ABI conformant.
Greetings,
Andres Freund
On 2018-07-09 17:08:34 -0700, Andres Freund wrote:
Hi,
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-10 11:35:59 +1200, Thomas Munro wrote:
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the including of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent). Otherwise there is a class of bug that is hidden for
the first 2^32 transactions.
You could have BigTransactionId (maybe renamed to FullTransactionId?) be
a struct type. That'd prevent such issues. Most compilers these days
should be more than good enough to optimize passing around an 8byte
struct by value...
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
That should be doable. But I'd like to check if it's necessary
first. Optimizing passing an 8 byte struct shouldn't be hard for
compilers these days - especially when most things dealing with them are
inline functions. If we find that it's not a problem on contemporary
compilers, it might be worthwhile to use a bit more type safety in other
places too.
Here's what gcc does on O1:
#include <stdint.h>

typedef struct foo
{
    uint64_t id;
} foo;

extern foo take_foo_struct(foo f, int i);
extern uint64_t take_foo_int(uint64_t id, int i);

foo take_foo_struct(foo f, int i)
{
    f.id += i;
    return f;
}

uint64_t take_foo_int(uint64_t id, int i)
{
    id += i;
    return id;
}

results in:
.file "test.c"
.text
.globl take_foo_struct
.type take_foo_struct, @function
take_foo_struct:
.LFB0:
.cfi_startproc
movslq %esi, %rax
addq %rdi, %rax
ret
.cfi_endproc
.LFE0:
.size take_foo_struct, .-take_foo_struct
.globl take_foo_int
.type take_foo_int, @function
take_foo_int:
.LFB1:
.cfi_startproc
movslq %esi, %rax
addq %rdi, %rax
ret
.cfi_endproc
.LFE1:
.size take_foo_int, .-take_foo_int
.ident "GCC: (Debian 7.3.0-24) 7.3.0"
.section .note.GNU-stack,"",@progbits
IOW, exactly the same code generated. Note that the compiler does *not*
see the callsites in this case, i.e. this is platform ABI conformant.
FWIW, this is required by the x86-64 SYSV ABI. See
https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
3.2.3 Parameter Passing. Aggregates with scalar types up to "two
eightbytes" are passed via registers.
It's also the case for MSVC / windows
https://docs.microsoft.com/en-us/cpp/cpp/argument-passing-and-naming-conventions
https://docs.microsoft.com/en-us/cpp/build/parameter-passing
Small aggregates (8, 16, 32, or 64 bits) are passed in registers.
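As a small sanity check on top of that, the wrapper struct can be verified at compile time to have the same size and alignment as the bare integer. This is only a sketch: it assumes a C11 compiler for _Static_assert and _Alignof, foo_add is a made-up helper, and matching size and alignment is a precondition for register passing rather than proof of it (the ABI documents above are what actually decide that):

```c
#include <stdint.h>

typedef struct foo
{
	uint64_t	id;
} foo;

/* C11 compile-time checks: the wrapper adds no padding and no stricter alignment */
_Static_assert(sizeof(foo) == sizeof(uint64_t),
			   "foo must be the same size as uint64_t");
_Static_assert(_Alignof(foo) == _Alignof(uint64_t),
			   "foo must be aligned like uint64_t");

/* Hypothetical helper: takes and returns the struct by value */
static inline foo
foo_add(foo f, int i)
{
	f.id += i;
	return f;
}
```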
Greetings,
Andres Freund
On Tue, Jul 10, 2018 at 12:08 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-10 11:35:59 +1200, Thomas Munro wrote:
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the including of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent). Otherwise there is a class of bug that is hidden for
the first 2^32 transactions.
You could have BigTransactionId (maybe renamed to FullTransactionId?) be
a struct type. That'd prevent such issues. Most compilers these days
should be more than good enough to optimize passing around an 8byte
struct by value...
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
That should be doable. But I'd like to check if it's necessary
first. Optimizing passing an 8 byte struct shouldn't be hard for
compilers these days - especially when most things dealing with them are
inline functions. If we find that it's not a problem on contemporary
compilers, it might be worthwhile to use a bit more type safety in other
places too.
...
IOW, exactly the same code generated. Note that the compiler does *not*
see the callsites in this case, i.e. this is platform ABI conformant.
I like it. Here's a version that uses a struct named
FullTransactionId (yeah, that's a better name, thanks), defined in
transam.h because c.h didn't feel right.
Client code lost the ability to use operator < directly. I needed to
use a static inline function as a constructor. I lost the
interchangeability with the wide xids in txid.c, so I provided
U64FromFullTransactionId() (I think that'll be useful for
serialisation over the wire too). I don't know what to think about
the encoding or meaning of non-normal xids in this thing.
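For readers who don't want to open the attachment, the shape of the type is roughly as follows. The function names are the ones used in the patch, but the bodies below are a simplified guess for illustration (epoch assumed to live in the high 32 bits), written with stdint types instead of uint32/uint64, and not copied from transam.h:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wrapping the value in a struct blocks implicit conversion to or from integers */
typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

/* Constructor: epoch in the high 32 bits, xid in the low 32 bits (assumed layout) */
static inline FullTransactionId
MakeFullTransactionId(uint32_t epoch, TransactionId xid)
{
	FullTransactionId result;

	result.value = ((uint64_t) epoch << 32) | xid;
	return result;
}

/* Explicit narrowing to the 32-bit xid */
static inline TransactionId
XidFromFullTransactionId(FullTransactionId fxid)
{
	return (TransactionId) fxid.value;
}

static inline uint32_t
EpochFromFullTransactionId(FullTransactionId fxid)
{
	return (uint32_t) (fxid.value >> 32);
}

/* Escape hatch for txid.c-style wide xids and for serialisation */
static inline uint64_t
U64FromFullTransactionId(FullTransactionId fxid)
{
	return fxid.value;
}

/* Stands in for operator <, which can't be applied to the struct */
static inline bool
FullTransactionIdPrecedes(FullTransactionId a, FullTransactionId b)
{
	return a.value < b.value;
}
```

With a = MakeFullTransactionId(0, 4294967295) and b = MakeFullTransactionId(1, 3), FullTransactionIdPrecedes(a, b) is true even though the 32-bit xid of b is numerically smaller than that of a.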
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
0001-Track-the-next-xid-using-64-bits-v2.patch (application/octet-stream)
From 9bcaa2b73e7ab30813e3bc1f90528d60f7886d8a Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 9 Jul 2018 21:54:03 +1200
Subject: [PATCH] Track the next xid using 64 bits.
Instead of tracking the epoch independently, start using a 64 bit transaction
ID in several places. This fixes an unlikely bug where an epoch increment could
be missed if you managed to consume more than 2^32 transactions between
checkpoints.
Work in progress!
Author: Thomas Munro
Reviewed-by: Andres Freund
Diagnosis-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/nbtree/nbtpage.c | 4 +-
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 4 +-
src/backend/access/transam/commit_ts.c | 5 +-
src/backend/access/transam/multixact.c | 9 +-
src/backend/access/transam/subtrans.c | 2 +-
src/backend/access/transam/twophase.c | 17 +---
src/backend/access/transam/varsup.c | 53 +++++++++---
src/backend/access/transam/xact.c | 21 +----
src/backend/access/transam/xlog.c | 104 ++++++------------------
src/backend/commands/vacuum.c | 10 +--
src/backend/postmaster/autovacuum.c | 4 +-
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 31 ++-----
src/backend/storage/ipc/procarray.c | 21 ++---
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 19 +++--
src/include/access/transam.h | 42 +++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 4 +-
src/include/storage/standby.h | 2 +-
24 files changed, 180 insertions(+), 207 deletions(-)
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 2e959da5f85..3e1f3683734 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -1946,7 +1946,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
* Mark the page itself deleted. It can be recycled when all current
* transactions are gone. Storing GetTopTransactionId() would work, but
* we're in VACUUM and would not otherwise have an XID. Having already
- * updated links to the target, ReadNewTransactionId() suffices as an
+ * updated links to the target, ReadNextFullTransactionId() suffices as an
* upper bound. Any scan having retained a now-stale link is advertising
* in its PGXACT an xmin less than or equal to the value we read here. It
* will continue to do so, holding back RecentGlobalXmin, for the duration
@@ -1956,7 +1956,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
opaque->btpo_flags &= ~BTP_HALF_DEAD;
opaque->btpo_flags |= BTP_DELETED;
- opaque->btpo.xact = ReadNewTransactionId();
+ opaque->btpo.xact = XidFromFullTransactionId(ReadNextFullTransactionId());
/* And update the metapage, if needed */
if (BufferIsValid(metabuf))
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09e..549f9dae305 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromFullTransactionId(checkpoint->nextFullXid),
+ XidFromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 8b7ff5b0c24..4fccbc9516c 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -754,7 +754,7 @@ ZeroCLOGPage(int pageno, bool writeXlog)
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 73fac1ba81d..da85904b3d6 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -644,7 +644,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
@@ -671,7 +671,8 @@ ActivateCommitTs(void)
if (ShmemVariableCache->oldestCommitTsXid == InvalidTransactionId)
{
ShmemVariableCache->oldestCommitTsXid =
- ShmemVariableCache->newestCommitTsXid = ReadNewTransactionId();
+ ShmemVariableCache->newestCommitTsXid =
+ XidFromFullTransactionId(ReadNextFullTransactionId());
}
LWLockRelease(CommitTsLock);
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index a9a51055e96..7478beed44d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3283,14 +3283,7 @@ multixact_redo(XLogReaderState *record)
* process doesn't need to hold a lock while checking this. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..fa0847afc81 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -261,7 +261,7 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ endPage = TransactionIdToPage(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e8d4e37fe30..a6b1ca0e28d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1879,7 +1879,7 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ TransactionId origNextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2104,7 +2104,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ TransactionId origNextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2210,23 +2210,14 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
/* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
+ if (setNextXid)
{
/*
* We don't expect anyone else to modify nextXid, hence we don't
* need to hold a lock while examining it. We still acquire the
* lock to modify it, though, so we recheck.
*/
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
+ AdvanceNextFullTransactionIdPast(subxid, true);
}
if (setParent)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..13020f54d98 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
+ * Now advance the nextFullXid counter. This must not happen until after we
* have successfully completed ExtendCLOG() --- if that routine fails, we
* want the next incoming transaction to try it again. We cannot assign
* more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -244,18 +244,47 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId bigXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ bigXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return bigXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPast(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
/*
@@ -359,7 +388,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -435,7 +464,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1da1f13ef33..6ccccc760d5 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -468,7 +468,7 @@ GetStableLatestTransactionId(void)
lxid = MyProc->lxid;
stablexid = GetTopTransactionIdIfAny();
if (!TransactionIdIsValid(stablexid))
- stablexid = ReadNewTransactionId();
+ stablexid = XidFromFullTransactionId(ReadNextFullTransactionId());
}
Assert(TransactionIdIsValid(stablexid));
@@ -5529,14 +5529,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
* hold a lock while checking this. We still acquire the lock to modify
* it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5688,15 +5681,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20b23cb3609..8d52ca25977 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -579,8 +579,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextXID & epoch of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5102,8 +5101,7 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5116,7 +5114,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6700,7 +6698,8 @@ StartupXLOG(void)
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
(errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ EpochFromFullTransactionId(checkPoint.nextFullXid),
+ XidFromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6715,12 +6714,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6729,8 +6728,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -7000,7 +6998,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -7030,9 +7028,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7208,14 +7206,7 @@ StartupXLOG(void)
* don't need to hold a lock while examining it. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(record->xl_xid, true);
/*
* Before replaying this record, check if this record causes
@@ -7782,7 +7773,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8375,41 +8366,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8819,7 +8775,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8829,11 +8785,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8977,8 +8928,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9733,7 +9683,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9787,9 +9737,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9801,13 +9751,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9828,9 +9776,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9860,13 +9808,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d90cb9a9022..33f0d7ad299 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -647,7 +647,7 @@ vacuum_set_xid_limits(Relation rel,
* autovacuum_freeze_max_age / 2 XIDs old), complain and force a minimum
* freeze age of zero.
*/
- safeLimit = ReadNewTransactionId() - autovacuum_freeze_max_age;
+ safeLimit = XidFromFullTransactionId(ReadNextFullTransactionId()) - autovacuum_freeze_max_age;
if (!TransactionIdIsNormal(safeLimit))
safeLimit = FirstNormalTransactionId;
@@ -725,7 +725,7 @@ vacuum_set_xid_limits(Relation rel,
* Compute XID limit causing a full-table vacuum, being careful not to
* generate a "permanent" XID.
*/
- limit = ReadNewTransactionId() - freezetable;
+ limit = XidFromFullTransactionId(ReadNextFullTransactionId()) - freezetable;
if (!TransactionIdIsNormal(limit))
limit = FirstNormalTransactionId;
@@ -944,7 +944,7 @@ vac_update_relstats(Relation relation,
if (TransactionIdIsNormal(frozenxid) &&
pgcform->relfrozenxid != frozenxid &&
(TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid) ||
- TransactionIdPrecedes(ReadNewTransactionId(),
+ TransactionIdPrecedes(XidFromFullTransactionId(ReadNextFullTransactionId()),
pgcform->relfrozenxid)))
{
pgcform->relfrozenxid = frozenxid;
@@ -1021,7 +1021,7 @@ vac_update_datfrozenxid(void)
* validly see during the scan. These are conservative values, but it's
* not really worth trying to be more exact.
*/
- lastSaneFrozenXid = ReadNewTransactionId();
+ lastSaneFrozenXid = XidFromFullTransactionId(ReadNextFullTransactionId());
lastSaneMinMulti = ReadNextMultiXactId();
/*
@@ -1157,7 +1157,7 @@ vac_truncate_clog(TransactionId frozenXID,
TransactionId lastSaneFrozenXid,
MultiXactId lastSaneMinMulti)
{
- TransactionId nextXID = ReadNewTransactionId();
+ TransactionId nextXID = XidFromFullTransactionId(ReadNextFullTransactionId());
Relation relation;
HeapScanDesc scan;
HeapTuple tuple;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 02e6d8131e0..3a07a0c3530 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1172,7 +1172,7 @@ do_start_worker(void)
* pass without forcing a vacuum. (This limit can be tightened for
* particular tables, but not loosened.)
*/
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromFullTransactionId(ReadNextFullTransactionId());
xidForceLimit = recentXid - autovacuum_freeze_max_age;
/* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */
/* this can cause the limit to go backwards by 3, but that's OK */
@@ -1703,7 +1703,7 @@ AutoVacWorkerMain(int argc, char *argv[])
pg_usleep(PostAuthDelay * 1000000L);
/* And do an appropriate amount of work */
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromFullTransactionId(ReadNextFullTransactionId());
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 987bb84683c..6a11f8a06c1 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1194,6 +1194,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1272,7 +1273,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e47ddca6bca..6903dbc9ca1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1888,35 +1888,20 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
* Check that the provided xmin/epoch are sane, that is, not in the future
* and not so far back as to be already wrapped around.
*
- * Epoch of nextXid should be same as standby, or if the counter has
- * wrapped, then one greater than standby.
- *
* This check doesn't care about whether clog exists for these xids
* at all.
*/
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
- TransactionId nextXid;
- uint32 nextEpoch;
-
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
-
- if (xid <= nextXid)
- {
- if (epoch != nextEpoch)
- return false;
- }
- else
- {
- if (epoch + 1 != nextEpoch)
- return false;
- }
-
- if (!TransactionIdPrecedesOrEquals(xid, nextXid))
- return false; /* epoch OK, but it's wrapped around */
-
- return true;
+ FullTransactionId nextFullXid = ReadNextFullTransactionId();
+ FullTransactionId fullXid = MakeFullTransactionId(epoch, xid);
+
+ /* TODO: this is not nice */
+ return
+ FullTransactionIdPrecedesOrEquals(fullXid, nextFullXid) &&
+ U64FromFullTransactionId(nextFullXid) -
+ U64FromFullTransactionId(fullXid) < INT64CONST(1) << 32;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497d81a..0bf2a11e931 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -883,15 +883,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
* it, though.
*/
nextXid = latestObservedXid;
+ AdvanceNextFullTransactionIdPast(nextXid, true);
TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -1979,7 +1974,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2068,7 +2063,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2113,7 +2108,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2181,7 +2176,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3237,10 +3232,8 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
/* ShmemVariableCache->nextXid must be beyond any observed xid */
next_expected_xid = latestObservedXid;
+ AdvanceNextFullTransactionIdPast(next_expected_xid, false);
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e8390311d03..880e6c14ef1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3270,7 +3270,7 @@ ReleasePredicateLocks(bool isCommit)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's a rollback, and we can clear our locks
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 7974c0bd3d8..0fd35eda6b7 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId bigXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(bigXid);
+ state->epoch = EpochFromFullTransactionId(bigXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 3fc8b6a8a84..37fb18130f4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..f6731bfd28d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 8cff5356925..b19b429a54b 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +789,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +882,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +892,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..4f0053bfa92 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,15 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId lvalue, handling wraparound correctly */
+#define FullTransactionIdAdvance(dest) \
+ do { \
+ (dest).value++; \
+ if (XidFromFullTransactionId(dest) < FirstNormalTransactionId) \
+ (dest) = MakeFullTransactionId(EpochFromFullTransactionId(dest), \
+ FirstNormalTransactionId); \
+ } while (0)
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -114,12 +149,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -176,7 +211,8 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPast(TransactionId xid, bool lock_free_check);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d7755..3c9d3401df5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -276,7 +276,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 773d9e6ebae..fa7ff049403 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,6 +15,7 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free XID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..d1116454095 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
--
2.17.0
On 2018-07-10 13:20:52 +1200, Thomas Munro wrote:
defined in transam.h because c.h didn't feel right.
Yea, that looks better.
Client code lost the ability to use operator < directly. I needed to
use a static inline function as a constructor. I lost the
interchangeability with the wide xids in txid.c, so I provided
U64FromFullTransactionId() (I think that'll be useful for
serialisation over the wire too).
Should probably, at a later point, introduce a SQL type for it too.
I don't know what to think about the encoding or meaning of non-normal
xids in this thing.
I'm not following what you mean by this?
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..fa0847afc81 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -261,7 +261,7 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ endPage = TransactionIdToPage(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
These probably need an intermediate variable.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..13020f54d98 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
It's a bit over the top, I know, but I'd move the conversion to outside
the lock. Less because it makes a meaningful efficiency difference, and
more because I find it clearer, and easier to reason about if we ever go
to atomically incrementing nextFullXid.
@@ -6700,7 +6698,8 @@ StartupXLOG(void)
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
(errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ EpochFromFullTransactionId(checkPoint.nextFullXid),
+ XidFromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
I don't think we should continue to expose epochs, but rather go to full
xids where appropriate.
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..f6731bfd28d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
See above re exposing epoch.
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,15 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId lvalue, handling wraparound correctly */
+#define FullTransactionIdAdvance(dest) \
+ do { \
+ (dest).value++; \
+ if (XidFromFullTransactionId(dest) < FirstNormalTransactionId) \
+ (dest) = MakeFullTransactionId(EpochFromFullTransactionId(dest), \
+ FirstNormalTransactionId); \
+ } while (0)
Yikes. Why isn't this an inline function? Because it assigns to dest?
Greetings,
Andres Freund
On Tue, Jul 10, 2018 at 1:30 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-07-10 13:20:52 +1200, Thomas Munro wrote:
I don't know what to think about the encoding or meaning of non-normal
xids in this thing.
I'm not following what you mean by this?
While defining FullTransactionIdPrecedes() in the image of
TransactionIdPrecedes(), I wondered whether the xid component of a
FullTransactionId would ever be one of the special values below
FirstNormalTransactionId that require special treatment. The question
doesn't actually arise in this code, however, so I ignored that
thought and simply used x < y.
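To make the contrast concrete, here is a minimal standalone sketch of the two comparison styles. The `Sketch*` names are stand-ins invented for this illustration, not PostgreSQL identifiers, and the 32-bit function is only modeled on the behaviour of TransactionIdPrecedes() as described above (modulo-2^32 comparison for normal xids, plain comparison when a special xid is involved).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch compiles without PostgreSQL headers. */
typedef uint32_t SketchXid;
typedef struct
{
	uint64_t	value;			/* epoch in the high 32 bits, xid in the low */
} SketchFullXid;

#define SKETCH_FIRST_NORMAL_XID ((SketchXid) 3)

/*
 * 32-bit style: special xids (below the first normal one) compare as plain
 * unsigned integers; normal xids compare modulo 2^32, so a value just
 * before the wraparound point "precedes" a small value just after it.
 */
static inline bool
sketch_xid_precedes(SketchXid a, SketchXid b)
{
	if (a < SKETCH_FIRST_NORMAL_XID || b < SKETCH_FIRST_NORMAL_XID)
		return a < b;
	return (int32_t) (a - b) < 0;
}

/* Build a 64-bit id from an epoch and a 32-bit xid. */
static inline SketchFullXid
sketch_make_full_xid(uint32_t epoch, SketchXid xid)
{
	SketchFullXid result;

	result.value = ((uint64_t) epoch << 32) | xid;
	return result;
}

/* 64-bit style: no wraparound to worry about, so a plain "<" is enough. */
static inline bool
sketch_full_xid_precedes(SketchFullXid a, SketchFullXid b)
{
	return a.value < b.value;
}
```

With the epoch carried in the high bits, an id taken just before a wraparound sorts before one taken just after it without any modular arithmetic, which is why x < y suffices as long as no special xid values are stored.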
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..fa0847afc81 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -261,7 +261,7 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ endPage = TransactionIdToPage(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
These probably need an intermediate variable.
Ok, did that in a couple of places.
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..13020f54d98 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
It's a bit over the top, I know, but I'd move the conversion to outside
the lock. Less because it makes a meaningful efficiency difference, and
more because I find it clearer, and easier to reason about if we ever go
to atomically incrementing nextFullXid.
Erm -- I can't read nextFullXid until I have the lock, and then I need
it in a 32 bit format for the code that follows immediately (I don't
currently have xidVacLimit in FullTransactionId format). I'm not sure
what you mean.
@@ -6700,7 +6698,8 @@ StartupXLOG(void)
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
(errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ EpochFromFullTransactionId(checkPoint.nextFullXid),
+ XidFromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
I don't think we should continue to expose epochs, but rather go to full
xids where appropriate.
OK, done.
Hmm, that's going to hurt some eyeballs, because other
fields are shown in 32 bit xid format. Should we also go to hex
everywhere so that we feeble humans can see the epoch component and
compare the xid component by eye?
Here's what I see on my test cluster:
Latest checkpoint's NextXID: 184683724519
Latest checkpoint's oldestXID: 4294901760
In hexadecimal money that'd be (with extra spaces, to line up with a
monospace font):
Latest checkpoint's NextXID:   0000002b0001fee7
Latest checkpoint's oldestXID:         ffff0000
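As a rough illustration of why hex makes the two components readable, the following sketch splits a 64-bit value into epoch and xid and formats both widths. The helper names are hypothetical and exist only for this example; they are not part of the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers: high 32 bits are the epoch, low 32 bits the xid. */
static inline uint32_t
sketch_epoch_of(uint64_t full_xid)
{
	return (uint32_t) (full_xid >> 32);
}

static inline uint32_t
sketch_xid_of(uint64_t full_xid)
{
	return (uint32_t) full_xid;
}

/* Format a 64-bit xid as 16 zero-padded hex digits. */
static int
sketch_format_full_xid(char *buf, size_t buflen, uint64_t full_xid)
{
	return snprintf(buf, buflen, "%016" PRIx64, full_xid);
}

/* Format a 32-bit xid as 8 zero-padded hex digits. */
static int
sketch_format_xid(char *buf, size_t buflen, uint32_t xid)
{
	return snprintf(buf, buflen, "%08" PRIx32, xid);
}
```

For the values above, 184683724519 splits into epoch 0x2b and xid 0x0001fee7, so the last eight hex digits of NextXID can be compared directly against the eight digits of a 32-bit field such as oldestXID.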
I left pg_resetwal's arguments -x and -e alone, but I suppose someone
might want a new switch that does both together...
+/* advance a FullTransactionId lvalue, handling wraparound correctly */
+#define FullTransactionIdAdvance(dest) \
+ do { \
+ (dest).value++; \
+ if (XidFromFullTransactionId(dest) < FirstNormalTransactionId) \
+ (dest) = MakeFullTransactionId(EpochFromFullTransactionId(dest), \
+ FirstNormalTransactionId); \
+ } while (0)
Yikes. Why isn't this an inline function? Because it assigns to dest?
Following existing malpractice, and yeah because it requires an lvalue
so can't be changed without a different convention at the call site.
Fixed.
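For readers following along, one possible shape of an inline-function replacement is sketched below. Taking a pointer at the call site is an assumption made for this illustration and may not match what the v3 patch actually does; the `Sketch*` names are stand-ins, not PostgreSQL identifiers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit epoch-plus-xid struct. */
typedef struct
{
	uint64_t	value;			/* epoch in the high 32 bits, xid in the low */
} SketchFullXid;

#define SKETCH_FIRST_NORMAL_XID UINT64_C(3)

/*
 * Advance the 64-bit id in place.  When the low 32 bits wrap to zero, the
 * carry has already incremented the epoch in the high bits, so all that is
 * left to do is skip over the special xids 0, 1 and 2.
 */
static inline void
sketch_full_xid_advance(SketchFullXid *dest)
{
	dest->value++;
	if ((uint32_t) dest->value < SKETCH_FIRST_NORMAL_XID)
		dest->value = (dest->value & ~UINT64_C(0xFFFFFFFF)) |
			SKETCH_FIRST_NORMAL_XID;
}
```

The call site changes from passing an lvalue to passing its address, which is the "different convention" mentioned above.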
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
0001-Track-the-next-xid-using-64-bits-v3.patch (application/octet-stream)
From 2f893e10dcf9ec82d9cabcdaeedce2681632b87d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 9 Jul 2018 21:54:03 +1200
Subject: [PATCH] Track the next xid using 64 bits.
Instead of tracking the epoch independently, start using a 64 bit transaction
ID in several places. This fixes an unlikely bug where an epoch increment could
be missed if you managed to consume more than 2^32 transactions between
checkpoints.
Work in progress!
Author: Thomas Munro
Reviewed-by: Andres Freund
Diagnosis-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/nbtree/nbtpage.c | 4 +-
src/backend/access/rmgrdesc/xlogdesc.c | 6 +-
src/backend/access/transam/clog.c | 4 +-
src/backend/access/transam/commit_ts.c | 5 +-
src/backend/access/transam/multixact.c | 9 +-
src/backend/access/transam/subtrans.c | 4 +-
src/backend/access/transam/twophase.c | 19 ++---
src/backend/access/transam/varsup.c | 53 +++++++++---
src/backend/access/transam/xact.c | 21 +----
src/backend/access/transam/xlog.c | 105 ++++++------------------
src/backend/commands/vacuum.c | 10 +--
src/backend/postmaster/autovacuum.c | 4 +-
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 31 ++-----
src/backend/storage/ipc/procarray.c | 21 ++---
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 6 +-
src/bin/pg_resetwal/pg_resetwal.c | 20 +++--
src/include/access/transam.h | 43 +++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 4 +-
src/include/storage/standby.h | 2 +-
24 files changed, 186 insertions(+), 212 deletions(-)
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 2e959da5f85..3e1f3683734 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -1946,7 +1946,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
* Mark the page itself deleted. It can be recycled when all current
* transactions are gone. Storing GetTopTransactionId() would work, but
* we're in VACUUM and would not otherwise have an XID. Having already
- * updated links to the target, ReadNewTransactionId() suffices as an
+ * updated links to the target, ReadNextFullTransactionId() suffices as an
* upper bound. Any scan having retained a now-stale link is advertising
* in its PGXACT an xmin less than or equal to the value we read here. It
* will continue to do so, holding back RecentGlobalXmin, for the duration
@@ -1956,7 +1956,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
opaque->btpo_flags &= ~BTP_HALF_DEAD;
opaque->btpo_flags |= BTP_DELETED;
- opaque->btpo.xact = ReadNewTransactionId();
+ opaque->btpo.xact = XidFromFullTransactionId(ReadNextFullTransactionId());
/* And update the metapage, if needed */
if (BufferIsValid(metabuf))
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09e..3655bac9ef1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -44,7 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
+ "tli %u; prev tli %u; fpw %s; xid " UINT64_FORMAT ";"
+ " oid %u; multi %u; offset %u; "
"oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
@@ -52,7 +54,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ U64FromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 8b7ff5b0c24..4fccbc9516c 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -754,7 +754,7 @@ ZeroCLOGPage(int pageno, bool writeXlog)
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 73fac1ba81d..da85904b3d6 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -644,7 +644,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
@@ -671,7 +671,8 @@ ActivateCommitTs(void)
if (ShmemVariableCache->oldestCommitTsXid == InvalidTransactionId)
{
ShmemVariableCache->oldestCommitTsXid =
- ShmemVariableCache->newestCommitTsXid = ReadNewTransactionId();
+ ShmemVariableCache->newestCommitTsXid =
+ XidFromFullTransactionId(ReadNextFullTransactionId());
}
LWLockRelease(CommitTsLock);
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index a9a51055e96..7478beed44d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3283,14 +3283,7 @@ multixact_redo(XLogReaderState *record)
* process doesn't need to hold a lock while checking this. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..0cded8ac3c6 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -249,6 +249,7 @@ ZeroSUBTRANSPage(int pageno)
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e8d4e37fe30..36603a9a34c 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1879,7 +1879,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2104,7 +2105,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2210,23 +2212,14 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
/* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
+ if (setNextXid)
{
/*
* We don't expect anyone else to modify nextXid, hence we don't
* need to hold a lock while examining it. We still acquire the
* lock to modify it, though, so we recheck.
*/
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
+ AdvanceNextFullTransactionIdPast(subxid, true);
}
if (setParent)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..d1396f3f0e1 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
+ * Now advance the nextFullXid counter. This must not happen until after we
* have successfully completed ExtendCLOG() --- if that routine fails, we
* want the next incoming transaction to try it again. We cannot assign
* more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -244,18 +244,47 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (i.e. during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPast(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
/*
@@ -359,7 +388,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -435,7 +464,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
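For readers following the varsup.c changes, here is a minimal standalone sketch of how the epoch is inferred in AdvanceNextFullTransactionIdPast. It is not the patch's code: all demo_* names are hypothetical stand-ins, locking is left out, and the comparison helper assumes both XIDs are normal (>= 3), which the real TransactionIdFollowsOrEquals does not require.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId and FullTransactionId. */
typedef uint32_t demo_xid;
typedef struct { uint64_t value; } demo_full_xid;

#define DEMO_FIRST_NORMAL_XID ((demo_xid) 3)

static demo_full_xid
demo_make_full_xid(uint32_t epoch, demo_xid xid)
{
    demo_full_xid result;

    result.value = ((uint64_t) epoch << 32) | xid;
    return result;
}

static uint32_t demo_epoch(demo_full_xid f) { return (uint32_t) (f.value >> 32); }
static demo_xid demo_xid_of(demo_full_xid f) { return (demo_xid) f.value; }

/* Modulo-2^32 "a is at or after b"; assumes both are normal XIDs. */
static bool
demo_follows_or_equals(demo_xid a, demo_xid b)
{
    return (int32_t) (a - b) >= 0;
}

/* Increment, stepping over the reserved XIDs 0..2 after a wrap. */
static void
demo_advance(demo_full_xid *dest)
{
    dest->value++;
    if (demo_xid_of(*dest) < DEMO_FIRST_NORMAL_XID)
        *dest = demo_make_full_xid(demo_epoch(*dest), DEMO_FIRST_NORMAL_XID);
}

/* Move *next to just past xid, inferring the epoch (no locking here). */
static void
demo_advance_past(demo_full_xid *next, demo_xid xid)
{
    demo_xid    current = demo_xid_of(*next);
    uint32_t    epoch = demo_epoch(*next);

    if (!demo_follows_or_equals(xid, current))
        return;             /* xid is already behind the counter */
    if (xid < current)
        epoch++;            /* logically later but numerically smaller */
    *next = demo_make_full_xid(epoch, xid);
    demo_advance(next);
}
```

Starting from (epoch 0, xid 0xFFFFFFF0), advancing past xid 5 yields (epoch 1, xid 6), while advancing past an xid that is behind the counter leaves it unchanged.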
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1da1f13ef33..6ccccc760d5 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -468,7 +468,7 @@ GetStableLatestTransactionId(void)
lxid = MyProc->lxid;
stablexid = GetTopTransactionIdIfAny();
if (!TransactionIdIsValid(stablexid))
- stablexid = ReadNewTransactionId();
+ stablexid = XidFromFullTransactionId(ReadNextFullTransactionId());
}
Assert(TransactionIdIsValid(stablexid));
@@ -5529,14 +5529,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
* hold a lock while checking this. We still acquire the lock to modify
* it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5688,15 +5681,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20b23cb3609..101a03d7c45 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -579,8 +579,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextXID & epoch of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5102,8 +5101,7 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5116,7 +5114,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6699,8 +6697,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6715,12 +6713,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6729,8 +6727,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -7000,7 +6997,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -7030,9 +7027,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7208,14 +7205,7 @@ StartupXLOG(void)
* don't need to hold a lock while examining it. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(record->xl_xid, true);
/*
* Before replaying this record, check if this record causes
@@ -7782,7 +7772,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8375,41 +8365,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8819,7 +8774,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8829,11 +8784,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8977,8 +8927,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9733,7 +9682,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9787,9 +9736,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9801,13 +9750,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9828,9 +9775,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9860,13 +9807,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
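To make the motivation for dropping GetNextXidAndEpoch concrete, the following sketch reproduces only its arithmetic: the epoch is taken from the last checkpoint and bumped by one if nextXid is numerically below the checkpoint's XID. The struct and function names are hypothetical, and the scenario (a full 2^32 XIDs consumed between two checkpoints) is the theoretical case raised at the top of this thread, not something claimed to be common.

```c
#include <stdint.h>

/* Hypothetical stand-in for the checkpoint's saved (epoch, xid) pair. */
typedef struct
{
    uint32_t    ckpt_epoch;
    uint32_t    ckpt_xid;
} demo_ckpt;

/*
 * Old-style inference: only a single wraparound since the checkpoint can
 * be detected, and only while next_xid is still below ckpt_xid.
 */
static uint64_t
demo_txid_from_checkpoint(demo_ckpt ckpt, uint32_t next_xid)
{
    uint32_t    epoch = ckpt.ckpt_epoch;

    if (next_xid < ckpt.ckpt_xid)
        epoch++;
    return ((uint64_t) epoch << 32) | next_xid;
}
```

With the checkpoint at (0, 1000), a true 64-bit counter of 2^32 + 10 is reconstructed correctly, but 2^32 + 1500 comes back as plain 1500, a value that was already handed out. Keeping the epoch inside nextFullXid avoids the inference altogether.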
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d90cb9a9022..33f0d7ad299 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -647,7 +647,7 @@ vacuum_set_xid_limits(Relation rel,
* autovacuum_freeze_max_age / 2 XIDs old), complain and force a minimum
* freeze age of zero.
*/
- safeLimit = ReadNewTransactionId() - autovacuum_freeze_max_age;
+ safeLimit = XidFromFullTransactionId(ReadNextFullTransactionId()) - autovacuum_freeze_max_age;
if (!TransactionIdIsNormal(safeLimit))
safeLimit = FirstNormalTransactionId;
@@ -725,7 +725,7 @@ vacuum_set_xid_limits(Relation rel,
* Compute XID limit causing a full-table vacuum, being careful not to
* generate a "permanent" XID.
*/
- limit = ReadNewTransactionId() - freezetable;
+ limit = XidFromFullTransactionId(ReadNextFullTransactionId()) - freezetable;
if (!TransactionIdIsNormal(limit))
limit = FirstNormalTransactionId;
@@ -944,7 +944,7 @@ vac_update_relstats(Relation relation,
if (TransactionIdIsNormal(frozenxid) &&
pgcform->relfrozenxid != frozenxid &&
(TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid) ||
- TransactionIdPrecedes(ReadNewTransactionId(),
+ TransactionIdPrecedes(XidFromFullTransactionId(ReadNextFullTransactionId()),
pgcform->relfrozenxid)))
{
pgcform->relfrozenxid = frozenxid;
@@ -1021,7 +1021,7 @@ vac_update_datfrozenxid(void)
* validly see during the scan. These are conservative values, but it's
* not really worth trying to be more exact.
*/
- lastSaneFrozenXid = ReadNewTransactionId();
+ lastSaneFrozenXid = XidFromFullTransactionId(ReadNextFullTransactionId());
lastSaneMinMulti = ReadNextMultiXactId();
/*
@@ -1157,7 +1157,7 @@ vac_truncate_clog(TransactionId frozenXID,
TransactionId lastSaneFrozenXid,
MultiXactId lastSaneMinMulti)
{
- TransactionId nextXID = ReadNewTransactionId();
+ TransactionId nextXID = XidFromFullTransactionId(ReadNextFullTransactionId());
Relation relation;
HeapScanDesc scan;
HeapTuple tuple;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 02e6d8131e0..3a07a0c3530 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1172,7 +1172,7 @@ do_start_worker(void)
* pass without forcing a vacuum. (This limit can be tightened for
* particular tables, but not loosened.)
*/
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromFullTransactionId(ReadNextFullTransactionId());
xidForceLimit = recentXid - autovacuum_freeze_max_age;
/* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */
/* this can cause the limit to go backwards by 3, but that's OK */
@@ -1703,7 +1703,7 @@ AutoVacWorkerMain(int argc, char *argv[])
pg_usleep(PostAuthDelay * 1000000L);
/* And do an appropriate amount of work */
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromFullTransactionId(ReadNextFullTransactionId());
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 987bb84683c..6a11f8a06c1 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1194,6 +1194,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1272,7 +1273,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e47ddca6bca..6903dbc9ca1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1888,35 +1888,20 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
* Check that the provided xmin/epoch are sane, that is, not in the future
* and not so far back as to be already wrapped around.
*
- * Epoch of nextXid should be same as standby, or if the counter has
- * wrapped, then one greater than standby.
- *
* This check doesn't care about whether clog exists for these xids
* at all.
*/
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
- TransactionId nextXid;
- uint32 nextEpoch;
-
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
-
- if (xid <= nextXid)
- {
- if (epoch != nextEpoch)
- return false;
- }
- else
- {
- if (epoch + 1 != nextEpoch)
- return false;
- }
-
- if (!TransactionIdPrecedesOrEquals(xid, nextXid))
- return false; /* epoch OK, but it's wrapped around */
-
- return true;
+ FullTransactionId nextFullXid = ReadNextFullTransactionId();
+ FullTransactionId fullXid = MakeFullTransactionId(epoch, xid);
+
+ /* TODO: this is not nice */
+ return
+ FullTransactionIdPrecedesOrEquals(fullXid, nextFullXid) &&
+ (U64FromFullTransactionId(nextFullXid) -
+ U64FromFullTransactionId(fullXid)) < (UINT64CONST(1) << 32);
}
/*
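The rewritten walsender check boils down to two 64-bit comparisons. A minimal sketch with plain integers, assuming the same 2^32 window the patch uses (the function name is hypothetical, and the 64-bit next-XID value is passed in rather than read from shared memory):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if (epoch, xid) is not in the future and lies less than
 * 2^32 XIDs behind next_full_xid.
 */
static bool
demo_in_recent_past(uint32_t epoch, uint32_t xid, uint64_t next_full_xid)
{
    uint64_t    full_xid = ((uint64_t) epoch << 32) | xid;

    return full_xid <= next_full_xid &&
        (next_full_xid - full_xid) < (UINT64_C(1) << 32);
}
```

Note that the removed code ended with TransactionIdPrecedesOrEquals, which only accepts XIDs within 2^31 of nextXid, so the window in this form is wider than before. That may be part of what the TODO is about.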
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497d81a..0bf2a11e931 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -883,15 +883,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
* it, though.
*/
nextXid = latestObservedXid;
+ AdvanceNextFullTransactionIdPast(nextXid, true);
TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -1979,7 +1974,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2068,7 +2063,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2113,7 +2108,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2181,7 +2176,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3237,10 +3232,8 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
/* ShmemVariableCache->nextXid must be beyond any observed xid */
next_expected_xid = latestObservedXid;
+ AdvanceNextFullTransactionIdPast(next_expected_xid, false);
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e8390311d03..880e6c14ef1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3270,7 +3270,7 @@ ReleasePredicateLocks(bool isCommit)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's a rollback, and we can clear our locks
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 7974c0bd3d8..4c34e215d26 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 3fc8b6a8a84..51a200544e5 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -163,9 +164,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[5] = BoolGetDatum(ControlFile->checkPointCopy.fullPageWrites);
nulls[5] = false;
- values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ values[6] = CStringGetTextDatum(psprintf(UINT64_FORMAT,
+ U64FromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..a1c55e0c0d5 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -255,9 +256,8 @@ main(int argc, char *argv[])
ControlFile->checkPointCopy.PrevTimeLineID);
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
- printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ printf(_("Latest checkpoint's NextXID: " UINT64_FORMAT "\n"),
+ U64FromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 8cff5356925..3fb15789d62 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -785,9 +788,8 @@ PrintControlValues(bool guessed)
ControlFile.checkPointCopy.ThisTimeLineID);
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
- printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ printf(_("Latest checkpoint's NextXID: " UINT64_FORMAT "\n"),
+ U64FromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +881,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +891,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..070f3bfdc74 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,16 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId lvalue, handling wraparound correctly */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
+ FirstNormalTransactionId);
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -114,12 +150,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -176,7 +212,8 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPast(TransactionId xid, bool lock_free_check);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d7755..3c9d3401df5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -276,7 +276,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 773d9e6ebae..fa7ff049403 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,6 +15,7 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free XID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..d1116454095 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
--
2.17.0
On 10 July 2018 at 07:35, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:
I played around with this idea yesterday. Experiment-grade patch
attached. Approach:
1. Introduce a new type BigTransactionId (better names welcome).
txid_current() should be changed to BigTransactionId too. It's currently
bigint.
Users are currently left in a real mess when it comes to pg_stat_activity
xids, etc, which are epoch-unqualified. txid_current() reports an
epoch-qualified xid, but there's no sensible and safe conversion from
txid_current()'s bigint to an epoch-qualified ID. age() takes 'xid',
everything takes 'xid', but is completely oblivious to epoch.
IMO all user-facing xids should be moved to BigTransactionId, with the
'xid' type ceasing to appear in any user facing role.
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the including of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent).
The only way I know of to prevent it is to use a wrapper by-value struct.
I've used this technique in the past where it's critical that implicit
conversions don't occur, but it sure isn't pretty.
typedef struct BigTransactionId
{
uint64 value;
} BigTransactionId;
instead of
typedef uint64 BigTransactionId;
It's annoying having to get the value member, but it's definitely highly
protective against errors. Pass-by-value isn't a problem, neither is
return-by-value.
BigTransactionId
add_one(BigTransactionId xid)
{
BigTransactionId ret;
ret.value = xid.value + 1;
return ret;
}
Personally I think it's potentially worth the required gymnastics if older
compilers do a sane job of code generation with this.
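To make the wrapper idea a bit more concrete, here is a minimal sketch of what explicit conversions around such a struct might look like. The helper names (big_xid_from_u64, big_xid_to_small, big_xid_precedes) are hypothetical and are not taken from any posted patch; the typedefs are local stand-ins so the fragment compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId; /* stand-in for PostgreSQL's typedef */

typedef struct BigTransactionId
{
    uint64_t    value;
} BigTransactionId;

/* Explicit construction from a raw 64-bit value. */
static inline BigTransactionId
big_xid_from_u64(uint64_t v)
{
    BigTransactionId ret;

    ret.value = v;
    return ret;
}

/* Explicit, visible truncation to the 32-bit xid (drops the epoch). */
static inline TransactionId
big_xid_to_small(BigTransactionId big)
{
    return (TransactionId) big.value;
}

/* 64-bit ids never wrap in practice, so a plain comparison is enough. */
static inline int
big_xid_precedes(BigTransactionId a, BigTransactionId b)
{
    return a.value < b.value;
}

/* Pass-by-value and return-by-value both work as usual. */
static inline BigTransactionId
big_xid_add_one(BigTransactionId xid)
{
    assert(xid.value != UINT64_MAX);    /* guard against overflow */
    return big_xid_from_u64(xid.value + 1);
}

/*
 * With the struct wrapper, neither of these compiles, which is the point:
 *
 *     BigTransactionId big = big_xid_from_u64(42);
 *     TransactionId small = big;       (struct assigned to an integer)
 *     big = small;                     (integer assigned to a struct)
 */
```

With a plain "typedef uint64 BigTransactionId" both of the commented-out assignments would compile silently, and the first one would discard the epoch.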
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10 July 2018 at 10:32, Craig Ringer <craig@2ndquadrant.com> wrote:
I think it's probably a good idea to make it very explicit when moving
between big and small transaction IDs, hence the including of the word
'big' in variable and function names and the use of a function-like
macro (rather than implicit conversion, which C doesn't give me a good
way to prevent).
The only way I know of to prevent it is to use a wrapper by-value struct.
... and that's what I get for not finishing the thread before replying.
Anyway, please consider the other point re txid_current() and the user
facing xid type.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018-07-10 10:32:44 +0800, Craig Ringer wrote:
On 10 July 2018 at 07:35, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:
I played around with this idea yesterday. Experiment-grade patch
attached. Approach:
1. Introduce a new type BigTransactionId (better names welcome).
txid_current() should be changed to BigTransactionId too. It's currently
bigint.
That's quite a bit more work though, no? We'd have to introduce a new
SQL level type (including operators). As I said nearby, I think that's
a good idea, but I don't see any sort of benefit of forcing those
patches to be done at once.
Users are currently left in a real mess when it comes to pg_stat_activity
xids, etc, which are epoch-unqualified. txid_current() reports an
epoch-qualified xid, but there's no sensible and safe conversion from
txid_current()'s bigint to an epoch-qualified ID.
I am confused what you mean by "to an epoch-qualified ID". You want to
split txid_current()'s return value into epoch and xid? Isn't that
fairly trivial?
SELECT bigxid & x'ffffffff'::int8 AS xid, bigxid >> 32 AS epoch FROM txid_current() bigxid;
Now I'm not saying that's pretty and couldn't be made easier.
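For anyone doing the same split on the client side, the arithmetic is just a 32-bit mask and a 32-bit shift. A minimal sketch in C, assuming the value arrives as the signed 64-bit integer that txid_current() returns; the function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the epoch is everything above the low 32 bits. */
static inline uint32_t
txid_epoch(int64_t txid)
{
    assert(txid >= 0);          /* txid_current() never returns a negative value */
    return (uint32_t) ((uint64_t) txid >> 32);
}

/* Hypothetical helper: the 32-bit xid is the low 32 bits. */
static inline uint32_t
txid_xid(int64_t txid)
{
    assert(txid >= 0);
    return (uint32_t) ((uint64_t) txid & UINT64_C(0xFFFFFFFF));
}
```

Note that the mask has to cover all 32 low bits (eight hex digits); a shorter mask would silently drop the top bits of the xid.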
Greetings,
Andres Freund
On 10 July 2018 at 10:40, Andres Freund <andres@anarazel.de> wrote:
On 2018-07-10 10:32:44 +0800, Craig Ringer wrote:
On 10 July 2018 at 07:35, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:
I played around with this idea yesterday. Experiment-grade patch
attached. Approach:
1. Introduce a new type BigTransactionId (better names welcome).
txid_current() should be changed to BigTransactionId too. It's currently
bigint.
That's quite a bit more work though, no? We'd have to introduce a new
SQL level type (including operators). As I said nearby, I think that's
a good idea, but I don't see any sort of benefit of forcing those
patches to be done at once.
Yeah. You're right. One step at a time.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Jul 10, 2018 at 2:15 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On Tue, Jul 10, 2018 at 1:30 PM, Andres Freund <andres@anarazel.de> wrote:
(errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ EpochFromFullTransactionId(checkPoint.nextFullXid),
+ XidFromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
I don't think we should continue to expose epochs, but rather go to full
xids where appropriate.
OK, done.
Ugh. When I changed pg_resetwal.c to print out the 64 bit value, I
broke controldata.c's code that reads those strings back in, as
revealed by a failing pg_upgrade test in check-world that I should
have waited for. Perhaps we shouldn't change that output format...
though I think it does make sense to move forwards here (and probably
eventually for the other values too). So here is a version that fixes
the parsing problem. Unfortunately pg_upgrade is not allowed to use
pg_strtouint64() so for now I jammed the same macrology into
controldata.c, but I assume that could be sorted out.
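To illustrate the parsing problem described above, here is a rough sketch of reading the NextXID value back from a line of pg_controldata-style output using only the C standard library. It is not the code from the attached patch. The function name is hypothetical, and accepting both the new single 64-bit number and the older "epoch:xid" form is an assumption made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse the value that follows the first ':' of a line
 * such as "Latest checkpoint's NextXID:          4294967299".
 * Returns 1 and stores the 64-bit value on success, 0 on any parse error.
 */
static int
parse_next_full_xid(const char *line, uint64_t *out)
{
    const char *p;
    char       *end;
    unsigned long long first;
    unsigned long long second;

    assert(line != NULL && out != NULL);

    /* Skip the label and any padding after the colon. */
    p = strchr(line, ':');
    if (p == NULL)
        return 0;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    if (!isdigit((unsigned char) *p))
        return 0;               /* rejects empty values and signs */

    errno = 0;
    first = strtoull(p, &end, 10);
    if (errno == ERANGE)
        return 0;

    /* New format: a single 64-bit number. */
    if (*end != ':')
    {
        *out = (uint64_t) first;
        return 1;
    }

    /* Old format: "epoch:xid", each of which must fit in 32 bits. */
    p = end + 1;
    if (!isdigit((unsigned char) *p))
        return 0;
    errno = 0;
    second = strtoull(p, &end, 10);
    if (errno == ERANGE || first > UINT32_MAX || second > UINT32_MAX)
        return 0;

    *out = ((uint64_t) first << 32) | (uint64_t) second;
    return 1;
}
```

Under these assumptions the lines "NextXID: 4294967299" and "NextXID: 1:3" yield the same 64-bit value, which matters for a tool that has to read the output of both old and new binaries.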
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
0001-Track-the-next-xid-using-64-bits-v4.patch (application/octet-stream)
From 015fe158be68464191cdcfb57c0441c61c8050a4 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 9 Jul 2018 21:54:03 +1200
Subject: [PATCH] Track the next xid using 64 bits.
Instead of tracking the epoch independently, start using a 64 bit transaction
ID in several places. This fixes an unlikely bug where an epoch increment could
be missed if you managed to consume more than 2^32 transactions between
checkpoints.
Work in progress!
Author: Thomas Munro
Reviewed-by: Andres Freund
Diagnosis-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/nbtree/nbtpage.c | 4 +-
src/backend/access/rmgrdesc/xlogdesc.c | 6 +-
src/backend/access/transam/clog.c | 4 +-
src/backend/access/transam/commit_ts.c | 5 +-
src/backend/access/transam/multixact.c | 9 +-
src/backend/access/transam/subtrans.c | 4 +-
src/backend/access/transam/twophase.c | 19 ++---
src/backend/access/transam/varsup.c | 53 +++++++++---
src/backend/access/transam/xact.c | 21 +----
src/backend/access/transam/xlog.c | 105 ++++++------------------
src/backend/commands/vacuum.c | 10 +--
src/backend/postmaster/autovacuum.c | 4 +-
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 31 ++-----
src/backend/storage/ipc/procarray.c | 21 ++---
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 6 +-
src/bin/pg_controldata/pg_controldata.c | 6 +-
src/bin/pg_resetwal/pg_resetwal.c | 20 +++--
src/bin/pg_upgrade/controldata.c | 22 +++++
src/include/access/transam.h | 43 +++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 4 +-
src/include/storage/standby.h | 2 +-
25 files changed, 208 insertions(+), 212 deletions(-)
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 2e959da5f85..3e1f3683734 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -1946,7 +1946,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
* Mark the page itself deleted. It can be recycled when all current
* transactions are gone. Storing GetTopTransactionId() would work, but
* we're in VACUUM and would not otherwise have an XID. Having already
- * updated links to the target, ReadNewTransactionId() suffices as an
+ * updated links to the target, ReadNextFullTransactionId() suffices as an
* upper bound. Any scan having retained a now-stale link is advertising
* in its PGXACT an xmin less than or equal to the value we read here. It
* will continue to do so, holding back RecentGlobalXmin, for the duration
@@ -1956,7 +1956,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, bool *rightsib_empty)
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
opaque->btpo_flags &= ~BTP_HALF_DEAD;
opaque->btpo_flags |= BTP_DELETED;
- opaque->btpo.xact = ReadNewTransactionId();
+ opaque->btpo.xact = XidFromFullTransactionId(ReadNextFullTransactionId());
/* And update the metapage, if needed */
if (BufferIsValid(metabuf))
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09e..3655bac9ef1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -44,7 +45,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
CheckPoint *checkpoint = (CheckPoint *) rec;
appendStringInfo(buf, "redo %X/%X; "
- "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
+ "tli %u; prev tli %u; fpw %s; xid " UINT64_FORMAT ";"
+ " oid %u; multi %u; offset %u; "
"oldest xid %u in DB %u; oldest multi %u in DB %u; "
"oldest/newest commit timestamp xid: %u/%u; "
"oldest running xid %u; %s",
@@ -52,7 +54,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ U64FromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 8b7ff5b0c24..4fccbc9516c 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -754,7 +754,7 @@ ZeroCLOGPage(int pageno, bool writeXlog)
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 73fac1ba81d..da85904b3d6 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -644,7 +644,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
@@ -671,7 +671,8 @@ ActivateCommitTs(void)
if (ShmemVariableCache->oldestCommitTsXid == InvalidTransactionId)
{
ShmemVariableCache->oldestCommitTsXid =
- ShmemVariableCache->newestCommitTsXid = ReadNewTransactionId();
+ ShmemVariableCache->newestCommitTsXid =
+ XidFromFullTransactionId(ReadNextFullTransactionId());
}
LWLockRelease(CommitTsLock);
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index a9a51055e96..7478beed44d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3283,14 +3283,7 @@ multixact_redo(XLogReaderState *record)
* process doesn't need to hold a lock while checking this. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..0cded8ac3c6 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -249,6 +249,7 @@ ZeroSUBTRANSPage(int pageno)
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e8d4e37fe30..36603a9a34c 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1879,7 +1879,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2104,7 +2105,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2210,23 +2212,14 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
/* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
+ if (setNextXid)
{
/*
* We don't expect anyone else to modify nextXid, hence we don't
* need to hold a lock while examining it. We still acquire the
* lock to modify it, though, so we recheck.
*/
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
+ AdvanceNextFullTransactionIdPast(subxid, true);
}
if (setParent)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..d1396f3f0e1 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
+ * Now advance the nextFullXid counter. This must not happen until after we
* have successfully completed ExtendCLOG() --- if that routine fails, we
* want the next incoming transaction to try it again. We cannot assign
* more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -244,18 +244,47 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPast(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
/*
@@ -359,7 +388,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -435,7 +464,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1da1f13ef33..6ccccc760d5 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -468,7 +468,7 @@ GetStableLatestTransactionId(void)
lxid = MyProc->lxid;
stablexid = GetTopTransactionIdIfAny();
if (!TransactionIdIsValid(stablexid))
- stablexid = ReadNewTransactionId();
+ stablexid = XidFromFullTransactionId(ReadNextFullTransactionId());
}
Assert(TransactionIdIsValid(stablexid));
@@ -5529,14 +5529,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
* hold a lock while checking this. We still acquire the lock to modify
* it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5688,15 +5681,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(max_xid, true);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20b23cb3609..101a03d7c45 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -579,8 +579,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextXID & epoch of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5102,8 +5101,7 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5116,7 +5114,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6699,8 +6697,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6715,12 +6713,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6729,8 +6727,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -7000,7 +6997,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -7030,9 +7027,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7208,14 +7205,7 @@ StartupXLOG(void)
* don't need to hold a lock while examining it. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPast(record->xl_xid, true);
/*
* Before replaying this record, check if this record causes
@@ -7782,7 +7772,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8375,41 +8365,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8819,7 +8774,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8829,11 +8784,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8977,8 +8927,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9733,7 +9682,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9787,9 +9736,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9801,13 +9750,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9828,9 +9775,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9860,13 +9807,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d90cb9a9022..33f0d7ad299 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -647,7 +647,7 @@ vacuum_set_xid_limits(Relation rel,
* autovacuum_freeze_max_age / 2 XIDs old), complain and force a minimum
* freeze age of zero.
*/
- safeLimit = ReadNewTransactionId() - autovacuum_freeze_max_age;
+ safeLimit = XidFromFullTransactionId(ReadNextFullTransactionId()) - autovacuum_freeze_max_age;
if (!TransactionIdIsNormal(safeLimit))
safeLimit = FirstNormalTransactionId;
@@ -725,7 +725,7 @@ vacuum_set_xid_limits(Relation rel,
* Compute XID limit causing a full-table vacuum, being careful not to
* generate a "permanent" XID.
*/
- limit = ReadNewTransactionId() - freezetable;
+ limit = XidFromFullTransactionId(ReadNextFullTransactionId()) - freezetable;
if (!TransactionIdIsNormal(limit))
limit = FirstNormalTransactionId;
@@ -944,7 +944,7 @@ vac_update_relstats(Relation relation,
if (TransactionIdIsNormal(frozenxid) &&
pgcform->relfrozenxid != frozenxid &&
(TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid) ||
- TransactionIdPrecedes(ReadNewTransactionId(),
+ TransactionIdPrecedes(XidFromFullTransactionId(ReadNextFullTransactionId()),
pgcform->relfrozenxid)))
{
pgcform->relfrozenxid = frozenxid;
@@ -1021,7 +1021,7 @@ vac_update_datfrozenxid(void)
* validly see during the scan. These are conservative values, but it's
* not really worth trying to be more exact.
*/
- lastSaneFrozenXid = ReadNewTransactionId();
+ lastSaneFrozenXid = XidFromFullTransactionId(ReadNextFullTransactionId());
lastSaneMinMulti = ReadNextMultiXactId();
/*
@@ -1157,7 +1157,7 @@ vac_truncate_clog(TransactionId frozenXID,
TransactionId lastSaneFrozenXid,
MultiXactId lastSaneMinMulti)
{
- TransactionId nextXID = ReadNewTransactionId();
+ TransactionId nextXID = XidFromFullTransactionId(ReadNextFullTransactionId());
Relation relation;
HeapScanDesc scan;
HeapTuple tuple;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 02e6d8131e0..3a07a0c3530 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1172,7 +1172,7 @@ do_start_worker(void)
* pass without forcing a vacuum. (This limit can be tightened for
* particular tables, but not loosened.)
*/
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromFullTransactionId(ReadNextFullTransactionId());
xidForceLimit = recentXid - autovacuum_freeze_max_age;
/* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */
/* this can cause the limit to go backwards by 3, but that's OK */
@@ -1703,7 +1703,7 @@ AutoVacWorkerMain(int argc, char *argv[])
pg_usleep(PostAuthDelay * 1000000L);
/* And do an appropriate amount of work */
- recentXid = ReadNewTransactionId();
+ recentXid = XidFromFullTransactionId(ReadNextFullTransactionId());
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 987bb84683c..6a11f8a06c1 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1194,6 +1194,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1272,7 +1273,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e47ddca6bca..6903dbc9ca1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1888,35 +1888,20 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
* Check that the provided xmin/epoch are sane, that is, not in the future
* and not so far back as to be already wrapped around.
*
- * Epoch of nextXid should be same as standby, or if the counter has
- * wrapped, then one greater than standby.
- *
* This check doesn't care about whether clog exists for these xids
* at all.
*/
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
- TransactionId nextXid;
- uint32 nextEpoch;
-
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
-
- if (xid <= nextXid)
- {
- if (epoch != nextEpoch)
- return false;
- }
- else
- {
- if (epoch + 1 != nextEpoch)
- return false;
- }
-
- if (!TransactionIdPrecedesOrEquals(xid, nextXid))
- return false; /* epoch OK, but it's wrapped around */
-
- return true;
+ FullTransactionId nextFullXid = ReadNextFullTransactionId();
+ FullTransactionId fullXid = MakeFullTransactionId(epoch, xid);
+
+ /* TODO: this is not nice */
+ return
+ FullTransactionIdPrecedesOrEquals(fullXid, nextFullXid) &&
+ U64FromFullTransactionId(nextFullXid) -
+ U64FromFullTransactionId(fullXid) < INT64CONST(1) << 32;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497d81a..0bf2a11e931 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -883,15 +883,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
* it, though.
*/
nextXid = latestObservedXid;
+ AdvanceNextFullTransactionIdPast(nextXid, true);
TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -1979,7 +1974,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2068,7 +2063,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2113,7 +2108,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2181,7 +2176,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3237,10 +3232,8 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
/* ShmemVariableCache->nextXid must be beyond any observed xid */
next_expected_xid = latestObservedXid;
+ AdvanceNextFullTransactionIdPast(next_expected_xid, false);
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e8390311d03..880e6c14ef1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3270,7 +3270,7 @@ ReleasePredicateLocks(bool isCommit)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's a rollback, and we can clear our locks
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 7974c0bd3d8..4c34e215d26 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 3fc8b6a8a84..51a200544e5 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -163,9 +164,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
values[5] = BoolGetDatum(ControlFile->checkPointCopy.fullPageWrites);
nulls[5] = false;
- values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ values[6] = CStringGetTextDatum(psprintf(UINT64_FORMAT,
+ U64FromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..a1c55e0c0d5 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -255,9 +256,8 @@ main(int argc, char *argv[])
ControlFile->checkPointCopy.PrevTimeLineID);
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
- printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ printf(_("Latest checkpoint's NextXID: " UINT64_FORMAT "\n"),
+ U64FromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 8cff5356925..3fb15789d62 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -785,9 +788,8 @@ PrintControlValues(bool guessed)
ControlFile.checkPointCopy.ThisTimeLineID);
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
- printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ printf(_("Latest checkpoint's NextXID: " UINT64_FORMAT "\n"),
+ U64FromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +881,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +891,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 0fe98a550e1..da4e8328df8 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -201,6 +201,27 @@ get_control_data(ClusterInfo *cluster, bool live_check)
pg_fatal("%d: controldata retrieval problem\n", __LINE__);
p++; /* remove ':' char */
+
+ /* Changed to 64 bit FullTransactionId in 12. */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1200)
+ {
+ uint64 fullXid;
+
+ /* Like pg_strtouint64 (which we can't call from here) */
+ /* TODO: FIXME */
+#ifdef _MSC_VER /* MSVC only */
+ fullXid = _strtoui64(p, NULL, 10);
+#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
+ fullXid = strtoull(p, NULL, 10);
+#else
+ fullXid = strtoul(p, NULL, 10);
+#endif
+
+ cluster->controldata.chkpnt_nxtepoch = fullXid >> 32;
+ cluster->controldata.chkpnt_nxtxid = (uint32) fullXid;
+ goto parsed_xid;
+ }
+
cluster->controldata.chkpnt_nxtepoch = str2uint(p);
/*
@@ -221,6 +242,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
p++; /* remove '/' or ':' char */
cluster->controldata.chkpnt_nxtxid = str2uint(p);
+parsed_xid:
got_xid = true;
}
else if ((p = strstr(bufin, "Latest checkpoint's NextOID:")) != NULL)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..070f3bfdc74 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,16 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId lvalue, handling wraparound correctly */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
+ FirstNormalTransactionId);
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -114,12 +150,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -176,7 +212,8 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPast(TransactionId xid, bool lock_free_check);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d7755..3c9d3401df5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -276,7 +276,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 773d9e6ebae..fa7ff049403 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,6 +15,7 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free XID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..d1116454095 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
--
2.17.0
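For readers skimming the patch, here is a minimal standalone sketch of the packing and wraparound behaviour introduced in the transam.h hunk. The typedef and constant at the top are stand-ins so that it compiles outside the PostgreSQL tree, and advance_from_last_xid_of_epoch() is a hypothetical helper added only for illustration; the rest mirrors the definitions in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's own definitions (assumed values). */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/* Epoch in the high 32 bits, xid in the low 32 bits. */
typedef struct FullTransactionId
{
    uint64_t value;
} FullTransactionId;

#define EpochFromFullTransactionId(x) ((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)   ((uint32_t) (x).value)
#define U64FromFullTransactionId(x)   ((x).value)

static inline FullTransactionId
MakeFullTransactionId(uint32_t epoch, TransactionId xid)
{
    FullTransactionId result;

    result.value = ((uint64_t) epoch) << 32 | xid;
    return result;
}

/* Advance by one, skipping the special xids 0..2 after a wrap. */
static inline void
FullTransactionIdAdvance(FullTransactionId *dest)
{
    dest->value++;
    if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
        *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
                                      FirstNormalTransactionId);
}

/*
 * Hypothetical helper, not part of the patch: start at the last xid of
 * the given epoch and advance once, so the carry into the epoch is visible.
 */
static inline FullTransactionId
advance_from_last_xid_of_epoch(uint32_t epoch)
{
    FullTransactionId fxid = MakeFullTransactionId(epoch, UINT32_MAX);

    assert(XidFromFullTransactionId(fxid) == UINT32_MAX);
    FullTransactionIdAdvance(&fxid);
    return fxid;
}
```

Advancing from the last xid of epoch 0 carries into the high half and then skips the reserved xids, so the result is epoch 1 with xid FirstNormalTransactionId. No separate epoch bookkeeping is involved.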
Andres Freund <andres@anarazel.de> writes:
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
That should be doable. But I'd like to check if it's necessary
first. Optimizing passing an 8 byte struct shouldn't be hard for
compilers these days - especially when most things dealing with them are
inline functions. If we find that it's not a problem on contemporary
compilers, it might be worthwhile to use a bit more type safety in other
places too.
I checked your example program on hardware I have lying around:
x86_64, gcc 4.4.7 (RHEL6): identical code, confirms your result
x86_64, LLVM 9.1.0 (macOS High Sierra): also identical
x86, gcc 4.2.1 (old macOS --- dromedary's host): also identical code;
this surprised me a bit. It looks like the ABI convention is that
64-bit values must be passed on the stack but can be returned in a
register pair, and it doesn't matter whether scalar or struct.
PPC, gcc 4.0.1 (ancient macOS --- prairiedog's host): *not* identical.
It looks like 64-bit arguments are passed in registers either way, but
a struct function result is returned in memory not a register.
ARM64, gcc 8.1.1 (Fedora 28): identical code
ARM64, clang 6.0.0 (FreeBSD 12): identical code
ARMv7, gcc 6.3.0 (Raspbian): *not* identical.
Looks like both pass and return conventions are memory-based for structs.
Offhand it would seem that we can get away with struct wrappers
on any platform where performance is really of concern today.
regards, tom lane
Hi,
On 2018-07-15 16:41:35 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
That should be doable. But I'd like to check if it's necessary
first. Optimizing passing an 8 byte struct shouldn't be hard for
compilers these days - especially when most things dealing with them are
inline functions. If we find that it's not a problem on contemporary
compilers, it might be worthwhile to use a bit more type safety in other
places too.
[ bunch of test results ]
Offhand it would seem that we can get away with struct wrappers
on any platform where performance is really of concern today.
Cool, thanks for checking!
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-07-15 16:41:35 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
[ bunch of test results ]
Offhand it would seem that we can get away with struct wrappers
on any platform where performance is really of concern today.
Cool, thanks for checking!
BTW, independently of any performance questions, these results show
that my idea above was untenable anyway. On those platforms where
there is a codegen difference, doing it like that would have resulted
in an ABI difference between regular and assert builds. That's something
to avoid, at least in any API that's visible to extension modules ---
we've had project policy for some time that it should be possible to
use non-assert extensions with assert-enabled core and vice versa.
Conceivably we could have used the struct API only under a special
devel flag that few people use except a buildfarm animal or two.
But that's just a pain in the rear.
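To make the rejected idea concrete, here is roughly what it would have looked like. This is a sketch with invented names (FullXid, FullXidGetU64, FullXidFromU64, full_xid_next); only USE_ASSERT_CHECKING is the real configure symbol.

```c
#include <assert.h>
#include <stdint.h>

#ifdef USE_ASSERT_CHECKING
/* Assert builds: a struct, so accidental mixing with plain integers fails. */
typedef struct FullXid
{
    uint64_t value;
} FullXid;
#define FullXidGetU64(x) ((x).value)
static inline FullXid
FullXidFromU64(uint64_t v)
{
    FullXid r;

    r.value = v;
    return r;
}
#else
/* Non-assert builds: a bare integer. */
typedef uint64_t FullXid;
#define FullXidGetU64(x) (x)
static inline FullXid
FullXidFromU64(uint64_t v)
{
    return v;
}
#endif

/*
 * An exported function using the type.  Its source is identical in both
 * build modes, but its calling convention is not on platforms that pass
 * structs differently from scalars.
 */
FullXid
full_xid_next(FullXid x)
{
    assert(FullXidGetU64(x) != UINT64_MAX);
    return FullXidFromU64(FullXidGetU64(x) + 1);
}
```

The macros hide the notational difference well enough, but an extension compiled against one definition of FullXid and loaded into a server built with the other would call full_xid_next with the wrong convention on the platforms where codegen differs.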
regards, tom lane
On Tue, Jul 17, 2018 at 1:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-15 16:41:35 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-07-09 19:56:25 -0400, Tom Lane wrote:
Or, perhaps, use a struct in assert builds and int64 otherwise?
You could hide the ensuing notational differences in macros.
[ bunch of test results ]
Offhand it would seem that we can get away with struct wrappers
on any platform where performance is really of concern today.
Cool, thanks for checking!
+1
Here's a new version. I did some cosmetic clean-up, and I dropped the
changes to pg_controldata output, replication epoch/xid processing
code and various similar non-essential changes. For this patch I want
just the new type + next xid generator + appropriate conversions.
I propose that we get this committed early in the cycle. That'd
maximise testing time in the tree, fix the bug identified by Amit, and
leave plenty of time for later patches to use FullTransactionId in
more places as appropriate.
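For reference, the bug can be modelled in a few lines. This is a simplified sketch, not code from the tree: it ignores the special xids 0..2 and the reader-side adjustment that GetNextXidAndEpoch used to apply, and OldCheckpoint, old_checkpoint, epoch_old_scheme and epoch_new_scheme are names made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Old scheme: the epoch is stored with the checkpoint's 32-bit next xid. */
typedef struct OldCheckpoint
{
    uint32_t epoch;
    uint32_t next_xid;
} OldCheckpoint;

/* At checkpoint time, bump the epoch only if next xid went numerically backwards. */
static OldCheckpoint
old_checkpoint(OldCheckpoint prev, uint32_t next_xid_now)
{
    OldCheckpoint cp;

    cp.epoch = prev.epoch;
    if (next_xid_now < prev.next_xid)
        cp.epoch++;
    cp.next_xid = next_xid_now;
    return cp;
}

/* Epoch recorded by the old scheme after `consumed` xids between two checkpoints. */
static uint32_t
epoch_old_scheme(uint32_t start_xid, uint64_t consumed)
{
    OldCheckpoint prev = {0, start_xid};
    uint32_t now = (uint32_t) (start_xid + consumed);  /* 32-bit counter wraps */

    return old_checkpoint(prev, now).epoch;
}

/* New scheme: the counter itself is 64 bits wide, so the epoch is its high half. */
static uint32_t
epoch_new_scheme(uint32_t start_xid, uint64_t consumed)
{
    uint64_t next_full_xid;

    assert(consumed <= UINT64_MAX - start_xid);
    next_full_xid = (uint64_t) start_xid + consumed;
    return (uint32_t) (next_full_xid >> 32);
}
```

With start_xid = 100, consuming 2^32 - 50 xids between checkpoints gives epoch 1 under both schemes. Consuming 2^32 + 10 leaves the 32-bit value at 110, which is not less than 100, so the old scheme stays at epoch 0 while the 64-bit counter correctly reports 1.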
Does anyone have specific kinds of validation or testing they'd like to see?
--
Thomas Munro
http://www.enterprisedb.com
Attachments:
0001-Track-the-next-xid-using-64-bits-v5.patch (application/octet-stream)
From 810ef19ba2ec39a34e6b277cc0fd38d8560e180d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 9 Jul 2018 21:54:03 +1200
Subject: [PATCH] Track the next xid using 64 bits.
Instead of tracking the epoch independently, introduce a 64 bit
FullTransactionId type and use it to track xid generation. This fixes
an unlikely bug where an epoch increment could be missed if you managed
to consume more than 2^32 transactions between checkpoints.
This patch creates the basic infrastructure for later patches to adopt
64 bit transaction IDs as appropriate.
Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 8 +-
src/backend/access/transam/commit_ts.c | 4 +-
src/backend/access/transam/multixact.c | 13 +--
src/backend/access/transam/subtrans.c | 8 +-
src/backend/access/transam/twophase.c | 35 ++++----
src/backend/access/transam/varsup.c | 53 +++++++++---
src/backend/access/transam/xact.c | 19 +----
src/backend/access/transam/xlog.c | 105 ++++++------------------
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 5 +-
src/backend/storage/ipc/procarray.c | 26 ++----
src/backend/storage/ipc/standby.c | 2 +-
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 19 +++--
src/include/access/transam.h | 53 +++++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 4 +-
src/include/storage/standby.h | 2 +-
22 files changed, 196 insertions(+), 195 deletions(-)
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09e..549f9dae305 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromFullTransactionId(checkpoint->nextFullXid),
+ XidFromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 8b7ff5b0c24..80da006138d 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -749,12 +749,12 @@ ZeroCLOGPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -792,7 +792,7 @@ TrimCLOG(void)
* but makes no WAL entry). Let's just be safe. (We need not worry about
* pages beyond the current one, since those will be zeroed when first
* used. For the same reason, there is no need to do anything when
- * nextXid is exactly at a page boundary; and it's likely that the
+ * nextFullXid is exactly at a page boundary; and it's likely that the
* "current" page doesn't exist yet in that case.)
*/
if (TransactionIdToPgIndex(xid) != 0)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 73fac1ba81d..3815ccbec65 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -553,7 +553,7 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCommitTs(void)
@@ -644,7 +644,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 365daf153ab..c3188eb3de2 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3267,7 +3267,7 @@ multixact_redo(XLogReaderState *record)
xlrec->moff + xlrec->nmembers);
/*
- * Make sure nextXid is beyond any XID mentioned in the record. This
+ * Make sure nextFullXid is beyond any XID mentioned in the record. This
* should be unnecessary, since any XID found here ought to have other
* evidence in the XLOG, but let's be safe.
*/
@@ -3279,18 +3279,11 @@ multixact_redo(XLogReaderState *record)
}
/*
- * We don't expect anyone else to modify nextXid, hence startup
+ * We don't expect anyone else to modify nextFullXid, hence startup
* process doesn't need to hold a lock while checking this. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid, true);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..91ea77aee82 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -241,14 +241,15 @@ ZeroSUBTRANSPage(int pageno)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*
- * oldestActiveXID is the oldest XID of any prepared transaction, or nextXid
+ * oldestActiveXID is the oldest XID of any prepared transaction, or nextFullXid
* if there are none.
*/
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 306861bb793..8edd3a55619 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1858,22 +1858,22 @@ restoreTwoPhaseData(void)
*
* Scan the shared memory entries of TwoPhaseState and determine the range
* of valid XIDs present. This is run during database startup, after we
- * have completed reading WAL. ShmemVariableCache->nextXid has been set to
+ * have completed reading WAL. ShmemVariableCache->nextFullXid has been set to
* one more than the highest XID for which evidence exists in WAL.
*
- * We throw away any prepared xacts with main XID beyond nextXid --- if any
+ * We throw away any prepared xacts with main XID beyond nextFullXid --- if any
* are present, it suggests that the DBA has done a PITR recovery to an
* earlier point in time without cleaning out pg_twophase. We dare not
* try to recover such prepared xacts since they likely depend on database
* state that doesn't exist now.
*
- * However, we will advance nextXid beyond any subxact XIDs belonging to
+ * However, we will advance nextFullXid beyond any subxact XIDs belonging to
* valid prepared xacts. We need to do this since subxact commit doesn't
* write a WAL entry, and so there might be no evidence in WAL of those
* subxact XIDs.
*
* Our other responsibility is to determine and return the oldest valid XID
- * among the prepared xacts (if none, return ShmemVariableCache->nextXid).
+ * among the prepared xacts (if none, return ShmemVariableCache->nextFullXid).
* This is needed to synchronize pg_subtrans startup properly.
*
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
@@ -1883,7 +1883,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2099,7 +2100,7 @@ RecoverPreparedTransactions(void)
*
* If setParent is true, set up subtransaction parent linkages.
*
- * If setNextXid is true, set ShmemVariableCache->nextXid to the newest
+ * If setNextXid is true, set ShmemVariableCache->nextFullXid to the newest
* value scanned.
*/
static char *
@@ -2108,7 +2109,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2202,7 +2204,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
/*
* Examine subtransaction XIDs ... they should all follow main XID, and
- * they may force us to advance nextXid.
+ * they may force us to advance nextFullXid.
*/
subxids = (TransactionId *) (buf +
MAXALIGN(sizeof(TwoPhaseFileHeader)) +
@@ -2213,24 +2215,15 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
- /* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
+ /* update nextFullXid if needed */
+ if (setNextXid)
{
/*
- * We don't expect anyone else to modify nextXid, hence we don't
+ * We don't expect anyone else to modify nextFullXid, hence we don't
* need to hold a lock while examining it. We still acquire the
* lock to modify it, though, so we recheck.
*/
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
+ AdvanceNextFullTransactionIdPastXid(subxid, true);
}
if (setParent)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 394843f7e91..24a10f82eaf 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
+ * Now advance the nextFullXid counter. This must not happen until after we
* have successfully completed ExtendCLOG() --- if that routine fails, we
* want the next incoming transaction to try it again. We cannot assign
* more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -244,18 +244,47 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* xid wrapped around into the next epoch */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
/*
@@ -359,7 +388,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -435,7 +464,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
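Reviewer's sketch (not part of the patch): the epoch inference in AdvanceNextFullTransactionIdPastXid() above can be tried in isolation. The code below is a minimal standalone model, assuming FullTransactionId is a plain 64-bit value with the epoch in the high 32 bits. The helper names (make_full, epoch_of, xid_of, follows_or_equals, full_advance, advance_past_xid) are hypothetical stand-ins rather than the definitions the patch adds to transam.h, and locking is left out entirely.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumed layout: epoch in the high 32 bits, xid in the low 32 bits. */
typedef struct
{
	uint64_t	value;
} FullTransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

static uint32_t
epoch_of(FullTransactionId f)
{
	return (uint32_t) (f.value >> 32);
}

static TransactionId
xid_of(FullTransactionId f)
{
	return (TransactionId) f.value;
}

static FullTransactionId
make_full(uint32_t epoch, TransactionId xid)
{
	FullTransactionId f;

	f.value = ((uint64_t) epoch << 32) | xid;
	return f;
}

/*
 * Modulo-2^32 comparison, as used for normal xids.  Simplification: the
 * special handling of non-normal xids is omitted here.
 */
static int
follows_or_equals(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) >= 0;
}

/* Step to the next value, skipping the special xids 0..2 after a wrap. */
static void
full_advance(FullTransactionId *f)
{
	f->value++;
	while (xid_of(*f) < FirstNormalTransactionId)
		f->value++;
}

/*
 * Move *next to the value after xid.  If xid is logically at or after the
 * current xid but numerically smaller, the 32-bit counter must have wrapped,
 * so the epoch is bumped.
 */
static void
advance_past_xid(FullTransactionId *next, TransactionId xid)
{
	TransactionId cur = xid_of(*next);

	if (follows_or_equals(xid, cur))
	{
		uint32_t	epoch = epoch_of(*next);

		if (xid < cur)
			++epoch;
		*next = make_full(epoch, xid);
		full_advance(next);
	}
}
```

Starting from epoch 5, xid 0xFFFFFFF0, advancing past xid 10 should land on epoch 6, xid 11, while an xid that is logically older than the counter leaves it unchanged.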
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 9aa63c8792b..46c5de6a8de 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5531,14 +5531,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
* hold a lock while checking this. We still acquire the lock to modify
* it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid, true);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5690,15 +5683,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid, true);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 493f1db7b97..5f53cf83951 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -579,8 +579,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextXID & epoch of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5143,8 +5142,7 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5157,7 +5155,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6757,8 +6755,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6773,12 +6771,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6787,8 +6785,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -7058,7 +7055,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -7088,9 +7085,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7266,14 +7263,7 @@ StartupXLOG(void)
* don't need to hold a lock while examining it. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(record->xl_xid, true);
/*
* Before replaying this record, check if this record causes
@@ -7853,7 +7843,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8446,41 +8436,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8900,7 +8855,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8910,11 +8865,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -9058,8 +9008,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9810,7 +9759,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9864,9 +9813,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9878,13 +9827,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9905,9 +9852,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9937,13 +9884,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
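Reviewer's sketch (not part of the patch): the problem raised at the top of this thread can be reproduced with the arithmetic of the removed GetNextXidAndEpoch(). The function below is a hypothetical standalone model of that old inference, with the checkpoint's epoch and xid passed in as plain arguments instead of being read from XLogCtl.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Old scheme: take the epoch recorded by the latest checkpoint and add one
 * if the current nextXid is numerically below the checkpoint's nextXid.
 */
static uint64_t
old_txid(uint32_t ckpt_epoch, TransactionId ckpt_xid, TransactionId next_xid)
{
	uint32_t	epoch = ckpt_epoch;

	if (next_xid < ckpt_xid)
		epoch++;				/* assumed to have wrapped exactly once */
	return ((uint64_t) epoch << 32) | next_xid;
}
```

With the last checkpoint at epoch 0, xid 100, a counter that wraps to xid 50 is reported in epoch 1. If it then advances to xid 300 with no checkpoint in between, old_txid(0, 100, 300) reports epoch 0 again, so the 64-bit value goes backwards. Keeping the full 64-bit counter in shared memory, as this patch does, avoids depending on checkpoint timing.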
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 7c292d8071b..c56c2516291 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1188,6 +1188,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1266,7 +1267,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
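Reviewer's sketch (not part of the patch): the epoch adjustment for xmin in the hunk above can be written as a small helper. The name epoch_for_older_xid is hypothetical, and the code assumes that the older xid is never logically after the next xid and is less than one full wraparound behind it.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Given the 64-bit next xid (epoch in the high 32 bits) and an older 32-bit
 * xid, return the epoch the older xid belongs to.  If the older xid is
 * numerically larger than the low half of next_full, it was assigned before
 * the most recent wraparound and so sits in the previous epoch.
 */
static uint32_t
epoch_for_older_xid(uint64_t next_full, TransactionId older)
{
	uint32_t	epoch = (uint32_t) (next_full >> 32);
	TransactionId next_xid = (TransactionId) next_full;

	if (next_xid < older)
		epoch--;
	return epoch;
}
```

For a next xid at epoch 3, xid 100, an xmin of 0xFFFFFF00 belongs to epoch 2, while an xmin of 50 belongs to epoch 3.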
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index d60026dfd1a..248f401766c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1904,10 +1904,13 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 nextEpoch;
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ nextEpoch = EpochFromFullTransactionId(nextFullXid);
if (xid <= nextXid)
{
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497d81a..b8e6721c960 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -669,7 +669,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
{
TransactionId *xids;
int nxids;
- TransactionId nextXid;
int i;
Assert(standbyState >= STANDBY_INITIALIZED);
@@ -882,16 +881,9 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
* hold a lock while examining it. We still acquire the lock to modify
* it, though.
*/
- nextXid = latestObservedXid;
- TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid, true);
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -1979,7 +1971,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2068,7 +2060,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2113,7 +2105,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2181,7 +2173,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3235,12 +3227,10 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
*/
latestObservedXid = xid;
- /* ShmemVariableCache->nextXid must be beyond any observed xid */
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid, false);
next_expected_xid = latestObservedXid;
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 2e077028951..dafa4b792fd 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -867,7 +867,7 @@ standby_redo(XLogReaderState *record)
* up from a checkpoint and are immediately at our starting point, we
* unconditionally move to STANDBY_INITIALIZED. After this point we
* must do 4 things:
- * * move shared nextXid forwards as we see new xids
+ * * move shared nextFullXid forwards as we see new xids
* * extend the clog and subtrans with each new xid
* * keep track of uncommitted known assigned xids
* * keep track of uncommitted AccessExclusiveLocks
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e8390311d03..880e6c14ef1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3270,7 +3270,7 @@ ReleasePredicateLocks(bool isCommit)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's a rollback, and we can clear our locks
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 7974c0bd3d8..4c34e215d26 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 3fc8b6a8a84..37fb18130f4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..f6731bfd28d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index d63a3a27f60..1de792e3944 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +789,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +882,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +892,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..814becf96d7 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,16 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId variable, stepping over special XIDs */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
+ FirstNormalTransactionId);
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -114,12 +150,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -145,6 +181,7 @@ typedef struct VariableCacheData
typedef VariableCacheData *VariableCache;
+
/* ----------------
* extern declarations
* ----------------
@@ -176,11 +213,21 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+/*
+ * For callers that just need the XID part of the next transaction ID.
+ */
+static inline TransactionId
+ReadNewTransactionId(void)
+{
+ return XidFromFullTransactionId(ReadNextFullTransactionId());
+}
+
#endif /* TRAMSAM_H */
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d7755..3c9d3401df5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -276,7 +276,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 773d9e6ebae..f3ee525cec3 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,6 +15,7 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..d1116454095 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
--
2.17.0
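For readers who want to try the new transam.h helpers outside the tree, here is a minimal standalone sketch. It copies MakeFullTransactionId, the two accessor macros and FullTransactionIdAdvance from the patch above. The stdint-based typedefs, the local FirstNormalTransactionId definition and the demo_wraparound() function are stand-ins introduced only for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's own definitions, so this compiles alone. */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/* 64-bit value: epoch in the upper 32 bits, xid in the lower 32 bits. */
typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

#define EpochFromFullTransactionId(x)	((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)		((uint32_t) (x).value)

static inline FullTransactionId
MakeFullTransactionId(uint32_t epoch, TransactionId xid)
{
	FullTransactionId result;

	result.value = ((uint64_t) epoch) << 32 | xid;
	return result;
}

/* Advance by one; the carry bumps the epoch, then special XIDs are skipped. */
static inline void
FullTransactionIdAdvance(FullTransactionId *dest)
{
	dest->value++;
	if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
		*dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
									  FirstNormalTransactionId);
}

/* Hypothetical demo: advance from the last xid of epoch 0. */
static inline FullTransactionId
demo_wraparound(void)
{
	FullTransactionId fxid = MakeFullTransactionId(0, UINT32_MAX);

	assert(EpochFromFullTransactionId(fxid) == 0);
	FullTransactionIdAdvance(&fxid);
	return fxid;
}
```

Advancing from 0:4294967295 carries into the upper 32 bits, and the special XIDs 0 to 2 are stepped over, so demo_wraparound() yields 1:3.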
On Tue, Jul 24, 2018 at 5:24 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
Here's a new version.
Bitrot, rebased.
--
Thomas Munro
http://www.enterprisedb.com
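The failure mode this thread is about can be reproduced with a small simulation. The sketch below models the old scheme, where the epoch is remembered only as of the last checkpoint and inferred afterwards from a numeric comparison, the way the GetNextXidAndEpoch() function removed by this patch did. OldState, old_consume(), old_txid_current() and the demo_* functions are hypothetical names used for illustration only; skipping of special XIDs is ignored for brevity.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Old scheme: the epoch is only known as of the last checkpoint. */
typedef struct OldState
{
	uint32_t	ckptEpoch;		/* epoch recorded by the last checkpoint */
	TransactionId ckptXid;		/* nextXid recorded by the last checkpoint */
	TransactionId nextXid;		/* live 32-bit counter */
} OldState;

/* Consume n XIDs without a checkpoint; the 32-bit counter wraps silently. */
static inline void
old_consume(OldState *s, uint64_t n)
{
	s->nextXid = (TransactionId) (s->nextXid + n);
}

/* Same inference as the removed GetNextXidAndEpoch(), packed into 64 bits. */
static inline uint64_t
old_txid_current(const OldState *s)
{
	uint32_t	epoch = s->ckptEpoch;

	if (s->nextXid < s->ckptXid)
		epoch++;				/* assumes at most one wrap since checkpoint */
	return ((uint64_t) epoch << 32) | s->nextXid;
}

/* Old scheme after 2^32 + 100 XIDs with no checkpoint in between. */
static inline uint64_t
demo_old_after_full_wrap(void)
{
	OldState	s = {0, 100, 100};

	old_consume(&s, ((uint64_t) 1 << 32) + 100);
	return old_txid_current(&s);
}

/* Same workload with a 64-bit counter, as the patch tracks it. */
static inline uint64_t
demo_new_after_full_wrap(void)
{
	uint64_t	nextFullXid = 100;

	nextFullXid += ((uint64_t) 1 << 32) + 100;
	return nextFullXid;
}
```

After consuming 2^32 + 100 XIDs with no checkpoint in between, the old computation reports 200 (epoch 0), the same value it would have reported one full wraparound earlier, while the 64-bit counter reports 2^32 + 200.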
Attachments:
0001-Track-the-next-xid-using-64-bits-v6.patch (application/octet-stream)
From 8807aff050d1ecce9c85c46b6a14ae144e381b86 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 9 Jul 2018 21:54:03 +1200
Subject: [PATCH] Track the next xid using 64 bits.
Instead of tracking the epoch independently, introduce a 64 bit
FullTransactionId type and use it to track xid generation. This fixes
an unlikely bug where an epoch increment could be missed if you managed
to consume more than 2^32 transactions between checkpoints.
This patch creates the basic infrastructure for later patches to adopt
64 bit transaction IDs as appropriate.
Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 8 +-
src/backend/access/transam/commit_ts.c | 4 +-
src/backend/access/transam/multixact.c | 13 +--
src/backend/access/transam/subtrans.c | 8 +-
src/backend/access/transam/twophase.c | 35 ++++----
src/backend/access/transam/varsup.c | 53 +++++++++---
src/backend/access/transam/xact.c | 19 +----
src/backend/access/transam/xlog.c | 105 ++++++------------------
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 5 +-
src/backend/storage/ipc/procarray.c | 26 ++----
src/backend/storage/ipc/standby.c | 2 +-
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 19 +++--
src/include/access/transam.h | 53 +++++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 4 +-
src/include/storage/standby.h | 2 +-
22 files changed, 196 insertions(+), 195 deletions(-)
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09e..549f9dae305 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromFullTransactionId(checkpoint->nextFullXid),
+ XidFromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 8b7ff5b0c24..80da006138d 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -749,12 +749,12 @@ ZeroCLOGPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -792,7 +792,7 @@ TrimCLOG(void)
* but makes no WAL entry). Let's just be safe. (We need not worry about
* pages beyond the current one, since those will be zeroed when first
* used. For the same reason, there is no need to do anything when
- * nextXid is exactly at a page boundary; and it's likely that the
+ * nextFullXid is exactly at a page boundary; and it's likely that the
* "current" page doesn't exist yet in that case.)
*/
if (TransactionIdToPgIndex(xid) != 0)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 73fac1ba81d..3815ccbec65 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -553,7 +553,7 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCommitTs(void)
@@ -644,7 +644,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 365daf153ab..c3188eb3de2 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3267,7 +3267,7 @@ multixact_redo(XLogReaderState *record)
xlrec->moff + xlrec->nmembers);
/*
- * Make sure nextXid is beyond any XID mentioned in the record. This
+ * Make sure nextFullXid is beyond any XID mentioned in the record. This
* should be unnecessary, since any XID found here ought to have other
* evidence in the XLOG, but let's be safe.
*/
@@ -3279,18 +3279,11 @@ multixact_redo(XLogReaderState *record)
}
/*
- * We don't expect anyone else to modify nextXid, hence startup
+ * We don't expect anyone else to modify nextFullXid, hence startup
* process doesn't need to hold a lock while checking this. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid, true);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..91ea77aee82 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -241,14 +241,15 @@ ZeroSUBTRANSPage(int pageno)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*
- * oldestActiveXID is the oldest XID of any prepared transaction, or nextXid
+ * oldestActiveXID is the oldest XID of any prepared transaction, or nextFullXid
* if there are none.
*/
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 3942734e5ae..7df60bca178 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1849,16 +1849,16 @@ restoreTwoPhaseData(void)
*
* Scan the shared memory entries of TwoPhaseState and determine the range
* of valid XIDs present. This is run during database startup, after we
- * have completed reading WAL. ShmemVariableCache->nextXid has been set to
+ * have completed reading WAL. ShmemVariableCache->nextFullXid has been set to
* one more than the highest XID for which evidence exists in WAL.
*
- * We throw away any prepared xacts with main XID beyond nextXid --- if any
+ * We throw away any prepared xacts with main XID beyond nextFullXid --- if any
* are present, it suggests that the DBA has done a PITR recovery to an
* earlier point in time without cleaning out pg_twophase. We dare not
* try to recover such prepared xacts since they likely depend on database
* state that doesn't exist now.
*
- * However, we will advance nextXid beyond any subxact XIDs belonging to
+ * However, we will advance nextFullXid beyond any subxact XIDs belonging to
* valid prepared xacts. We need to do this since subxact commit doesn't
* write a WAL entry, and so there might be no evidence in WAL of those
* subxact XIDs.
@@ -1868,7 +1868,7 @@ restoreTwoPhaseData(void)
* backup should be rolled in.
*
* Our other responsibility is to determine and return the oldest valid XID
- * among the prepared xacts (if none, return ShmemVariableCache->nextXid).
+ * among the prepared xacts (if none, return ShmemVariableCache->nextFullXid).
* This is needed to synchronize pg_subtrans startup properly.
*
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
@@ -1878,7 +1878,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2094,7 +2095,7 @@ RecoverPreparedTransactions(void)
*
* If setParent is true, set up subtransaction parent linkages.
*
- * If setNextXid is true, set ShmemVariableCache->nextXid to the newest
+ * If setNextXid is true, set ShmemVariableCache->nextFullXid to the newest
* value scanned.
*/
static char *
@@ -2103,7 +2104,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2183,7 +2185,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
/*
* Examine subtransaction XIDs ... they should all follow main XID, and
- * they may force us to advance nextXid.
+ * they may force us to advance nextFullXid.
*/
subxids = (TransactionId *) (buf +
MAXALIGN(sizeof(TwoPhaseFileHeader)) +
@@ -2194,24 +2196,15 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
- /* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
+ /* update nextFullXid if needed */
+ if (setNextXid)
{
/*
- * We don't expect anyone else to modify nextXid, hence we don't
+ * We don't expect anyone else to modify nextFullXid, hence we don't
* need to hold a lock while examining it. We still acquire the
* lock to modify it, though, so we recheck.
*/
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
+ AdvanceNextFullTransactionIdPastXid(subxid, true);
}
if (setParent)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index cede579d731..8d45dbc699d 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
+ * Now advance the nextFullXid counter. This must not happen until after we
* have successfully completed ExtendCLOG() --- if that routine fails, we
* want the next incoming transaction to try it again. We cannot assign
* more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -244,18 +244,47 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
/*
@@ -359,7 +388,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -435,7 +464,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 875be180fe4..b6cac77cae0 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5512,14 +5512,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
* hold a lock while checking this. We still acquire the lock to modify
* it, though.
*/
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid, true);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5671,15 +5664,7 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid, true);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3025d0badb8..599d8fd8674 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -579,8 +579,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextXID & epoch of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5164,8 +5163,7 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5178,7 +5176,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6781,8 +6779,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6797,12 +6795,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6811,8 +6809,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -7082,7 +7079,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -7112,9 +7109,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7290,14 +7287,7 @@ StartupXLOG(void)
* don't need to hold a lock while examining it. We still
* acquire the lock to modify it, though.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(record->xl_xid, true);
/*
* Before replaying this record, check if this record causes
@@ -7877,7 +7867,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8470,41 +8460,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8924,7 +8879,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8934,11 +8889,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -9082,8 +9032,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9834,7 +9783,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9888,9 +9837,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9902,13 +9851,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9929,9 +9876,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9961,13 +9908,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 6f4b3538ac4..f94ba25e900 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1182,6 +1182,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1260,7 +1261,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 370429d746c..0f44ef4a173 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1907,10 +1907,13 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 nextEpoch;
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ nextEpoch = EpochFromFullTransactionId(nextFullXid);
if (xid <= nextXid)
{
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497d81a..b8e6721c960 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -669,7 +669,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
{
TransactionId *xids;
int nxids;
- TransactionId nextXid;
int i;
Assert(standbyState >= STANDBY_INITIALIZED);
@@ -882,16 +881,9 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
* hold a lock while examining it. We still acquire the lock to modify
* it, though.
*/
- nextXid = latestObservedXid;
- TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid, true);
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -1979,7 +1971,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2068,7 +2060,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2113,7 +2105,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2181,7 +2173,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3235,12 +3227,10 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
*/
latestObservedXid = xid;
- /* ShmemVariableCache->nextXid must be beyond any observed xid */
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid, false);
next_expected_xid = latestObservedXid;
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index f9c12e603b1..f69a58d84fd 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -868,7 +868,7 @@ standby_redo(XLogReaderState *record)
* up from a checkpoint and are immediately at our starting point, we
* unconditionally move to STANDBY_INITIALIZED. After this point we
* must do 4 things:
- * * move shared nextXid forwards as we see new xids
+ * * move shared nextFullXid forwards as we see new xids
* * extend the clog and subtrans with each new xid
* * keep track of uncommitted known assigned xids
* * keep track of uncommitted AccessExclusiveLocks
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e8390311d03..880e6c14ef1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3270,7 +3270,7 @@ ReleasePredicateLocks(bool isCommit)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's a rollback, and we can clear our locks
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 7974c0bd3d8..4c34e215d26 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index 3fc8b6a8a84..37fb18130f4 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 895a51f89d5..f6731bfd28d 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 6fb403a5a8a..a68e72eb158 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +789,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +882,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +892,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..814becf96d7 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,16 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId variable, stepping over special XIDs */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
+ FirstNormalTransactionId);
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -114,12 +150,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -145,6 +181,7 @@ typedef struct VariableCacheData
typedef VariableCacheData *VariableCache;
+
/* ----------------
* extern declarations
* ----------------
@@ -176,11 +213,21 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+/*
+ * For callers that just need the XID part of the next transaction ID.
+ */
+static inline TransactionId
+ReadNewTransactionId(void)
+{
+ return XidFromFullTransactionId(ReadNextFullTransactionId());
+}
+
#endif /* TRAMSAM_H */
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d7755..3c9d3401df5 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -276,7 +276,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 773d9e6ebae..f3ee525cec3 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,6 +15,7 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..d1116454095 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
--
2.17.0
On Tue, Jul 24, 2018 at 7:25 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
Here's a new version. I did some cosmetic clean-up, and I dropped the
changes to pg_controldata output, replication epoch/xid processing
code and various similar non-essential changes. For this patch I want
just the new type + next xid generator + appropriate conversions.
I propose that we get this committed early in the cycle. That'd
maximise testing time in the tree, fix the bug identified by Amit, and
leave plenty of time for later patches to use FullTransactionId in
more places as appropriate.
Then it's probably a good time to do so. Any opinions or more reviews here?
I'll move it to the next CF for now.
On Fri, Nov 30, 2018 at 12:12 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
On Tue, Jul 24, 2018 at 7:25 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
Here's a new version. I did some cosmetic clean-up, and I dropped the
changes to pg_controldata output, replication epoch/xid processing
code and various similar non-essential changes. For this patch I want
just the new type + next xid generator + appropriate conversions.
I propose that we get this committed early in the cycle. That'd
maximise testing time in the tree, fix the bug identified by Amit, and
leave plenty of time for later patches to use FullTransactionId in
more places as appropriate.
Then it's probably a good time to do so. Any opinions or more reviews here?
I'll move it to the next CF for now.
If there are no objections, I'm planning to do a round of testing and
commit this shortly.
--
Thomas Munro
http://www.enterprisedb.com
On Sun, Feb 03, 2019 at 10:58:02PM +1100, Thomas Munro wrote:
If there are no objections, I'm planning to do a round of testing and
commit this shortly.
Hm. That looks sane to me at a quick glance. I am a bit on the edge
regarding the naming "FullTransactionId", which is actually a 64-bit
value with a 32-bit XID and a 32-bit epoch. Something like
TransactionIdWithEpoch or EpochTransactionId sounds a bit better to
me. My point is that "Full" is too generic for that.
Moved to next CF for now.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
Hm. That looks sane to me at a quick glance. I am a bit on the edge
regarding the naming "FullTransactionId", which is actually a 64-bit
value with a 32-bit XID and a 32-bit epoch. Something like
TransactionIdWithEpoch or EpochTransactionId sounds a bit better to
me. My point is that "Full" is too generic for that.
WideTransactionId, maybe? I agree that "Full" seems like a poor
adjective here.
regards, tom lane
On February 4, 2019 6:43:44 AM GMT+01:00, Michael Paquier <michael@paquier.xyz> wrote:
On Sun, Feb 03, 2019 at 10:58:02PM +1100, Thomas Munro wrote:
If there are no objections, I'm planning to do a round of testing and
commit this shortly.
Hm. That looks sane to me at a quick glance. I am a bit on the edge
regarding the naming "FullTransactionId", which is actually a 64-bit
value with a 32-bit XID and a 32-bit epoch. Something like
TransactionIdWithEpoch or EpochTransactionId sounds a bit better to
me. My point is that "Full" is too generic for that.
I'm not a fan of names with epoch in it - these are the real transaction IDs now. Conflating them with the until-now inferred epochs sounds like a bad idea to me. We IMO should just treat the new type as a 64bit uint, and the 32bit as a truncated version. Like, we could just add 64 as a prefix.
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Hi,
On 2018-09-19 13:58:36 +1200, Thomas Munro wrote:
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
Is this really a good idea? Seems like it's going to perpetuate the
problematic epoch logic we had before?
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -431,11 +431,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ MakeFullTransactionId(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -705,8 +709,7 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid = MakeFullTransactionId(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +789,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +882,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +892,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
I still think it's a mistake to not display the full xid here, and
rather perpetuate the epoch stuff. I'm ok with splitting that into a
separate commit, but this ought to be fixed in the same release imo.
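To make the two display styles concrete, here is a minimal sketch, assuming a stand-alone copy of the struct from the patch. The helper names format_epoch_xid and format_full_xid are hypothetical and exist only for this illustration; they are not part of pg_controldata or pg_resetwal.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the patch's FullTransactionId, redeclared so this compiles alone. */
typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

/* Current style: "epoch:xid", i.e. the two 32-bit halves printed separately. */
static int
format_epoch_xid(char *buf, size_t len, FullTransactionId fxid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu32 ":%" PRIu32,
					(uint32_t) (fxid.value >> 32),
					(uint32_t) fxid.value);
}

/* Alternative style (hypothetical): the whole 64-bit value as one number. */
static int
format_full_xid(char *buf, size_t len, FullTransactionId fxid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, fxid.value);
}
```

For a value with epoch 1 and xid 5, the first helper produces "1:5" and the second produces "4294967301", which is the same quantity that txid_current() would report.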
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..814becf96d7 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
Probably good to note here that not all values are valid normal xids.
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
Make sounds a bit like it's allocating...
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,16 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId variable, stepping over special XIDs */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
+ FirstNormalTransactionId);
+}
Hm, this seems pretty odd coding to me. Why not do something like
dest->value++;
while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
dest->value++;
That'd a) be a bit more readable, b) possible to do in a lockfree way at
a later point.
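Spelled out as a self-contained sketch of the loop suggested above (the typedefs and macros are redeclared here only so the fragment compiles on its own; FirstNormalTransactionId is 3 in PostgreSQL, since 0, 1 and 2 are the invalid, bootstrap and frozen XIDs):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId	((TransactionId) 3)

typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

#define EpochFromFullTransactionId(x)	((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)		((uint32_t) (x).value)

/*
 * Advance by one, then keep stepping while the low 32 bits land on one of
 * the special XIDs. Only the 64-bit counter is ever incremented, so the
 * epoch half carries over naturally when the low half wraps.
 */
static inline void
FullTransactionIdAdvance(FullTransactionId *dest)
{
	dest->value++;
	while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
		dest->value++;

	assert(XidFromFullTransactionId(*dest) >= FirstNormalTransactionId);
}
```

Starting from epoch 0 with xid 0xFFFFFFFF, one call lands on epoch 1 with xid 3: the increment carries into the upper half and the loop then steps over xids 0, 1 and 2.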
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 1fcd8cf1b59..d1116454095 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
Probably worthwhile to pgindent partially.
Greetings,
Andres Freund
On Mon, Feb 4, 2019 at 8:41 PM Andres Freund <andres@anarazel.de> wrote:
On 2018-09-19 13:58:36 +1200, Thomas Munro wrote:
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * If lock_free_check is true, then the caller must be sure that it's safe to
+ * read nextFullXid without holding XidGenLock (ie during recovery).
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check)
+{
+ TransactionId current_xid;
+ uint32 epoch;
+
+ if (lock_free_check &&
+ !TransactionIdFollowsOrEquals(xid,
+ XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
+ return;
+
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (TransactionIdFollowsOrEquals(xid, current_xid))
+ {
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (xid < current_xid)
+ ++epoch; /* epoch wrapped */
+ ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
+ }
+ LWLockRelease(XidGenLock);
}
Is this really a good idea? Seems like it's going to perpetuate the
problematic epoch logic we had before?
We could get rid of this in future, if certain WAL records and 2PC
state start carrying full xids. That would be much bigger surgery
than I wanted to do in this patch. This is the only place that
converts 32 bit -> 64 bit with an inferred epoch component. I have
explained why I think that's correct in new comments along with a new
assertion.
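As a rough illustration of that 32 bit -> 64 bit step (this is a simplified sketch, not the function from the attached patch): the helper name infer_full_xid_at_or_after is hypothetical, locking is omitted, and the caller is assumed to guarantee that the xid is at or after the 32-bit part of the next full xid and less than 2^31 ahead of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

/*
 * Modulo-2^32 comparison, the same idea TransactionIdFollowsOrEquals()
 * uses for normal XIDs.
 */
static bool
xid_follows_or_equals(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) >= 0;
}

/*
 * Hypothetical helper: widen "xid" to 64 bits using the epoch of "next".
 * If xid is logically later than next's 32-bit part but numerically
 * smaller, the 32-bit counter must have wrapped, so it belongs to the
 * following epoch.
 */
static FullTransactionId
infer_full_xid_at_or_after(FullTransactionId next, TransactionId xid)
{
	TransactionId next_xid = (TransactionId) next.value;
	uint32_t	epoch = (uint32_t) (next.value >> 32);
	FullTransactionId result;

	/* The inference is only sound under this precondition. */
	assert(xid_follows_or_equals(xid, next_xid));

	if (xid < next_xid)
		epoch++;

	result.value = ((uint64_t) epoch << 32) | xid;
	return result;
}
```

The assertion carries the argument: without a bound on how far xid can be from the next full xid, the epoch adjustment would be a guess rather than an inference.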
The theory is that the xids encountered in recovery and 2PC startup
can never be too far out of phase with the current nextFullXid. In
contrast, the original epoch tracking from commit 35af5422 wasn't
bounded in the same sort of way. Certainly no other code should be
allowed to do this sort of thing without very careful consideration of
how the epoch is bounded. The patch deliberately provides no general
purpose make-me-a-FullTransactionId-by-guessing-the-epoch facility.
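For contrast, the unbounded case in the old scheme can be reproduced with a small stand-alone model of the removed GetNextXidAndEpoch() logic. The function names old_style_txid and demo_missed_wrap are hypothetical, shared memory and locking are left out, and special XIDs are ignored for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the old scheme: start from the epoch and xid recorded by the
 * last checkpoint and bump the epoch only if next_xid is numerically
 * below the checkpoint's xid.
 */
static uint64_t
old_style_txid(uint32_t ckpt_epoch, uint32_t ckpt_xid, uint32_t next_xid)
{
	uint32_t	epoch = ckpt_epoch;

	if (next_xid < ckpt_xid)
		epoch++;
	return ((uint64_t) epoch << 32) | next_xid;
}

/*
 * If the counter wraps and passes the checkpoint's xid again before the
 * next checkpoint runs, the wrap goes unnoticed.
 */
static void
demo_missed_wrap(void)
{
	uint64_t	at_checkpoint = 100;	/* epoch 0, xid 100 */
	uint64_t	later = at_checkpoint + (UINT64_C(1) << 32) + 100;	/* epoch 1, xid 200 */
	uint64_t	reported = old_style_txid(0, 100, (uint32_t) later);

	assert(reported == 200);	/* epoch 0 is reported */
	assert(reported != later);	/* but the true value is in epoch 1 */
}
```

Nothing in the old code bounded how far nextXid could move between two checkpoints, which is the difference from the recovery and 2PC case described above.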
While reviewing this, I also realised that the "lock_free_check"
function argument was unnecessary. The only place that was setting
that to false, namely RecordKnownAssignedTransactionIds(), might as
well just use the same behaviour. I've now rewritten
AdvanceNextFullTransactionIdPastXid() completely in light of all that;
please review.
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
I still think it's a mistake to not display the full xid here, and
rather perpetuate the epoch stuff. I'm ok with splitting that into a
separate commit, but this ought to be fixed in the same release imo.
Ok.
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Allowing such conversions seems likely to be error-prone.
+ */
+typedef struct FullTransactionId
+{
+    uint64 value;
+} FullTransactionId;
Probably good to note here that not all values are valid normal xids.
Done.
+static inline FullTransactionId
+MakeFullTransactionId(uint32 epoch, TransactionId xid)
+{
+    FullTransactionId result;
+
+    result.value = ((uint64) epoch) << 32 | xid;
+
+    return result;
+}
Make sounds a bit like it's allocating...
Changed to FullTransactionIdFromEpochAndXid().
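For reference, the packing and unpacking can be sketched outside the tree as follows. This is an illustration under assumed names (full_xid_from_epoch_and_xid(), epoch_from_full_xid(), xid_from_full_xid() and check_round_trip() are hypothetical, lower-case stand-ins), not the header as committed.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wrapped in a struct so it cannot silently convert to a 32 bit xid. */
typedef struct FullTransactionId
{
    uint64_t    value;
} FullTransactionId;

/* Pack: epoch goes in the high 32 bits, xid in the low 32 bits. */
static inline FullTransactionId
full_xid_from_epoch_and_xid(uint32_t epoch, TransactionId xid)
{
    FullTransactionId result;

    result.value = ((uint64_t) epoch << 32) | xid;
    return result;
}

/* Unpack the epoch (high half). */
static inline uint32_t
epoch_from_full_xid(FullTransactionId f)
{
    return (uint32_t) (f.value >> 32);
}

/* Unpack the 32 bit xid (low half). */
static inline TransactionId
xid_from_full_xid(FullTransactionId f)
{
    return (TransactionId) f.value;
}

/* Packing followed by unpacking must give back the original parts. */
static inline void
check_round_trip(uint32_t epoch, TransactionId xid)
{
    FullTransactionId f = full_xid_from_epoch_and_xid(epoch, xid);

    assert(epoch_from_full_xid(f) == epoch);
    assert(xid_from_full_xid(f) == xid);
}
```

Because the struct has a single member, an accidental assignment between a FullTransactionId and a plain TransactionId fails to compile, which is the point of the wrapper.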
+    dest->value++;
+    if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+        *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),
Hm, this seems pretty odd coding to me. Why not do something like
    dest->value++;
    while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
        dest->value++;
That'd a) be a bit more readable, b) possible to do in a lockfree way at
a later point.
Done.
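As a standalone illustration of the loop form, the following sketch uses stand-in types and a hypothetical full_xid_advance() name. It shows that when the low 32 bits wrap to zero, the increment carries into the epoch by itself and the loop then steps over the special xids 0, 1 and 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef struct FullTransactionId
{
    uint64_t    value;
} FullTransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Advance a 64 bit transaction id to the next normal value.  A plain
 * increment handles the epoch carry; the loop only runs right after a
 * wraparound, when the low half lands on one of the special xids.
 */
static inline void
full_xid_advance(FullTransactionId *dest)
{
    assert(dest != NULL);

    dest->value++;
    while ((TransactionId) dest->value < FirstNormalTransactionId)
        dest->value++;
}
```

Since the function only ever increments a single 64 bit word, it could in principle be expressed with atomic operations later, which matches the "lockfree" remark above.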
-    TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+    TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
Probably worthwhile to pgindent partially.
Done.
I also added FullTransactionId to typedefs.list, bumped
PG_CONTROL_VERSION and did some performance testing of tight loops of
GetNewTransactionId(), which showed no measurable change (~10 million
single-threaded calls per second either way on the machine I tested).
New version attached. I'd like to commit this for PG12.
--
Thomas Munro
https://enterprisedb.com
Attachments:
0001-Track-the-next-transaction-ID-using-64-bits-v7.patch (application/octet-stream)
From 403ea57794b2719d260d591a4ccefc42b02fbd9f Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 25 Mar 2019 12:29:23 +1300
Subject: [PATCH] Track the next transaction ID using 64 bits.
Instead of inferring epoch advancement from xids and checkpoints,
introduce a 64 bit FullTransactionId type and use it to track xid
generation. This fixes an unlikely bug where epoch information is
corrupted if you somehow manage to make TransactionId wrap around
more than once between checkpoints.
This commit creates some very basic infrastructure allowing for later
patches to adopt 64 bit transaction IDs in more places, potentially
including table access methods that don't need to freeze tables.
Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 8 +-
src/backend/access/transam/commit_ts.c | 4 +-
src/backend/access/transam/multixact.c | 20 +----
src/backend/access/transam/subtrans.c | 8 +-
src/backend/access/transam/twophase.c | 40 +++------
src/backend/access/transam/varsup.c | 76 ++++++++++++----
src/backend/access/transam/xact.c | 35 +-------
src/backend/access/transam/xlog.c | 113 ++++++------------------
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 5 +-
src/backend/storage/ipc/procarray.c | 34 ++-----
src/backend/storage/ipc/standby.c | 2 +-
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 20 +++--
src/include/access/transam.h | 52 ++++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 6 +-
src/include/storage/standby.h | 2 +-
src/include/storage/standbydefs.h | 2 +-
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 225 insertions(+), 238 deletions(-)
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index bfad284be08..33060f30429 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromFullTransactionId(checkpoint->nextFullXid),
+ XidFromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index aa089d83fa8..3bd55fbdd33 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -749,12 +749,12 @@ ZeroCLOGPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -792,7 +792,7 @@ TrimCLOG(void)
* but makes no WAL entry). Let's just be safe. (We need not worry about
* pages beyond the current one, since those will be zeroed when first
* used. For the same reason, there is no need to do anything when
- * nextXid is exactly at a page boundary; and it's likely that the
+ * nextFullXid is exactly at a page boundary; and it's likely that the
* "current" page doesn't exist yet in that case.)
*/
if (TransactionIdToPgIndex(xid) != 0)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 9d7f15935dc..8162f884bd1 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -553,7 +553,7 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCommitTs(void)
@@ -643,7 +643,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index c3998719405..763b9997071 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3267,9 +3267,9 @@ multixact_redo(XLogReaderState *record)
xlrec->moff + xlrec->nmembers);
/*
- * Make sure nextXid is beyond any XID mentioned in the record. This
- * should be unnecessary, since any XID found here ought to have other
- * evidence in the XLOG, but let's be safe.
+ * Make sure nextFullXid is beyond any XID mentioned in the record.
+ * This should be unnecessary, since any XID found here ought to have
+ * other evidence in the XLOG, but let's be safe.
*/
max_xid = XLogRecGetXid(record);
for (i = 0; i < xlrec->nmembers; i++)
@@ -3278,19 +3278,7 @@ multixact_redo(XLogReaderState *record)
max_xid = xlrec->members[i].xid;
}
- /*
- * We don't expect anyone else to modify nextXid, hence startup
- * process doesn't need to hold a lock while checking this. We still
- * acquire the lock to modify it, though.
- */
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index cbc61294eb9..e667fd02385 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -241,14 +241,15 @@ ZeroSUBTRANSPage(int pageno)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*
- * oldestActiveXID is the oldest XID of any prepared transaction, or nextXid
+ * oldestActiveXID is the oldest XID of any prepared transaction, or nextFullXid
* if there are none.
*/
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 21986e48fe2..11992f7447d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1878,16 +1878,16 @@ restoreTwoPhaseData(void)
*
* Scan the shared memory entries of TwoPhaseState and determine the range
* of valid XIDs present. This is run during database startup, after we
- * have completed reading WAL. ShmemVariableCache->nextXid has been set to
+ * have completed reading WAL. ShmemVariableCache->nextFullXid has been set to
* one more than the highest XID for which evidence exists in WAL.
*
- * We throw away any prepared xacts with main XID beyond nextXid --- if any
+ * We throw away any prepared xacts with main XID beyond nextFullXid --- if any
* are present, it suggests that the DBA has done a PITR recovery to an
* earlier point in time without cleaning out pg_twophase. We dare not
* try to recover such prepared xacts since they likely depend on database
* state that doesn't exist now.
*
- * However, we will advance nextXid beyond any subxact XIDs belonging to
+ * However, we will advance nextFullXid beyond any subxact XIDs belonging to
* valid prepared xacts. We need to do this since subxact commit doesn't
* write a WAL entry, and so there might be no evidence in WAL of those
* subxact XIDs.
@@ -1897,7 +1897,7 @@ restoreTwoPhaseData(void)
* backup should be rolled in.
*
* Our other responsibility is to determine and return the oldest valid XID
- * among the prepared xacts (if none, return ShmemVariableCache->nextXid).
+ * among the prepared xacts (if none, return ShmemVariableCache->nextFullXid).
* This is needed to synchronize pg_subtrans startup properly.
*
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
@@ -1907,7 +1907,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2123,7 +2124,7 @@ RecoverPreparedTransactions(void)
*
* If setParent is true, set up subtransaction parent linkages.
*
- * If setNextXid is true, set ShmemVariableCache->nextXid to the newest
+ * If setNextXid is true, set ShmemVariableCache->nextFullXid to the newest
* value scanned.
*/
static char *
@@ -2132,7 +2133,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2212,7 +2214,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
/*
* Examine subtransaction XIDs ... they should all follow main XID, and
- * they may force us to advance nextXid.
+ * they may force us to advance nextFullXid.
*/
subxids = (TransactionId *) (buf +
MAXALIGN(sizeof(TwoPhaseFileHeader)) +
@@ -2223,25 +2225,9 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
- /* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- /*
- * We don't expect anyone else to modify nextXid, hence we don't
- * need to hold a lock while examining it. We still acquire the
- * lock to modify it, though, so we recheck.
- */
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
- }
+ /* update nextFullXid if needed */
+ if (setNextXid)
+ AdvanceNextFullTransactionIdPastXid(subxid);
if (setParent)
SubTransSetParent(subxid, xid);
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index fe94fdaf049..db141d286dc 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
- * have successfully completed ExtendCLOG() --- if that routine fails, we
- * want the next incoming transaction to try it again. We cannot assign
- * more XIDs until there is CLOG space for them.
+ * Now advance the nextFullXid counter. This must not happen until after
+ * we have successfully completed ExtendCLOG() --- if that routine fails,
+ * we want the next incoming transaction to try it again. We cannot
+ * assign more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -236,18 +236,64 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * This must only be called during recovery or from two-phase start-up code.
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid)
+{
+ FullTransactionId newNextFullXid;
+ TransactionId next_xid;
+ uint32 epoch;
+
+ /*
+ * It is safe to read nextFullXid without a lock, because this is only
+ * called from the startup process or single process mode, meaning that no
+ * other process can modify it.
+ */
+ Assert(AmStartupProcess() || !IsUnderPostmaster);
+
+ /* Fast return if this isn't an xid high enough to move the needle. */
+ next_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (!TransactionIdFollowsOrEquals(xid, next_xid))
+ return;
+
+ /*
+ * Compute the FullTransactionId that comes after the given xid. To do
+ * this, we preserve the existing epoch, but detect when we've wrapped
+ * into a new epoch. This is necessary because WAL records and 2PC state
+ * currently contain 32 bit xids. The wrap logic is safe in those cases
+ * because the span of active xids cannot exceed one epoch at any given
+ * point in the WAL stream.
+ */
+ TransactionIdAdvance(xid);
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (unlikely(xid < next_xid))
+ ++epoch;
+ newNextFullXid = FullTransactionIdFromEpochAndXid(epoch, xid);
+
+ /*
+ * We still need to take a lock to modify the value against concurrent
+ * readers.
+ */
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextFullXid = newNextFullXid;
+ LWLockRelease(XidGenLock);
}
/*
@@ -351,7 +397,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -427,7 +473,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c3214d4f4d8..9b100050597 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5636,21 +5636,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
max_xid = TransactionIdLatest(xid, parsed->nsubxacts, parsed->subxacts);
- /*
- * Make sure nextXid is beyond any XID mentioned in the record.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while checking this. We still acquire the lock to modify
- * it, though.
- */
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ /* Make sure nextFullXid is beyond any XID mentioned in the record. */
+ AdvanceNextFullTransactionIdPastXid(max_xid);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5792,25 +5779,11 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
Assert(TransactionIdIsValid(xid));
- /*
- * Make sure nextXid is beyond any XID mentioned in the record.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while checking this. We still acquire the lock to modify
- * it, though.
- */
+ /* Make sure nextFullXid is beyond any XID mentioned in the record. */
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ad12ebc4269..19d7911ec50 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -590,8 +590,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextFullXid of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5115,8 +5114,8 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid =
+ FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5129,7 +5128,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6557,8 +6556,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6573,12 +6572,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6587,8 +6586,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -6859,7 +6857,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -6889,9 +6887,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7061,20 +7059,10 @@ StartupXLOG(void)
error_context_stack = &errcallback;
/*
- * ShmemVariableCache->nextXid must be beyond record's xid.
- *
- * We don't expect anyone else to modify nextXid, hence we
- * don't need to hold a lock while examining it. We still
- * acquire the lock to modify it, though.
+ * ShmemVariableCache->nextFullXid must be beyond record's
+ * xid.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(record->xl_xid);
/*
* Before replaying this record, check if this record causes
@@ -7654,7 +7642,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8247,41 +8235,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8701,7 +8654,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8711,11 +8664,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8859,8 +8807,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9622,7 +9569,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9676,9 +9623,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9690,13 +9637,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9717,9 +9662,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9749,13 +9694,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index d9959e568a8..f32cf91ffb3 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1160,6 +1160,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1238,7 +1239,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 4bb98ef352a..21f5c868f18 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1912,10 +1912,13 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 nextEpoch;
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ nextEpoch = EpochFromFullTransactionId(nextFullXid);
if (xid <= nextXid)
{
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cf93357997c..76d6833b017 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -664,7 +664,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
{
TransactionId *xids;
int nxids;
- TransactionId nextXid;
int i;
Assert(standbyState >= STANDBY_INITIALIZED);
@@ -881,23 +880,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
LWLockRelease(ProcArrayLock);
- /*
- * ShmemVariableCache->nextXid must be beyond any observed xid.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while examining it. We still acquire the lock to modify
- * it, though.
- */
- nextXid = latestObservedXid;
- TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid. */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid);
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -2001,7 +1987,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2093,7 +2079,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2138,7 +2124,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2206,7 +2192,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3266,12 +3252,10 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
*/
latestObservedXid = xid;
- /* ShmemVariableCache->nextXid must be beyond any observed xid */
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid);
next_expected_xid = latestObservedXid;
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 4d10e57a803..cd56dca3aef 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -867,7 +867,7 @@ standby_redo(XLogReaderState *record)
* up from a checkpoint and are immediately at our starting point, we
* unconditionally move to STANDBY_INITIALIZED. After this point we
* must do 4 things:
- * * move shared nextXid forwards as we see new xids
+ * * move shared nextFullXid forwards as we see new xids
* * extend the clog and subtrans with each new xid
* * keep track of uncommitted known assigned xids
* * keep track of uncommitted AccessExclusiveLocks
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 92beaab5663..4e4d04bae37 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3410,7 +3410,7 @@ ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's either a rollback or a read-only transaction
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 791bbf84024..daea21e84dd 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index e6742dc24b8..e675c33c547 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 1aa1db218ac..9a17d0f9c0f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index a7b25ffe1cd..67fc646befb 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -430,11 +430,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -704,8 +708,8 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +790,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +883,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +893,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 78997e533e7..5715f0c5217 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Not all values represent valid normal XIDs.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+FullTransactionIdFromEpochAndXid(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,15 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId variable, stepping over special XIDs */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ dest->value++;
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -125,12 +160,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next full XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -156,6 +191,7 @@ typedef struct VariableCacheData
typedef VariableCacheData *VariableCache;
+
/* ----------------
* extern declarations
* ----------------
@@ -187,11 +223,21 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+/*
+ * For callers that just need the XID part of the next transaction ID.
+ */
+static inline TransactionId
+ReadNewTransactionId(void)
+{
+ return XidFromFullTransactionId(ReadNextFullTransactionId());
+}
+
#endif /* TRAMSAM_H */
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index bd74e7aaa03..eb6c44649dc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -310,7 +310,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index a3910a5f997..ff98d9e91a8 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,13 +15,14 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
/* Version identifier for this pg_control format */
-#define PG_CONTROL_VERSION 1200
+#define PG_CONTROL_VERSION 1201
/* Nonce key length, see below */
#define MOCK_AUTH_NONCE_LEN 32
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 346a3108bc1..23612435148 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
diff --git a/src/include/storage/standbydefs.h b/src/include/storage/standbydefs.h
index cc8ccd5d369..01d2db6ac6e 100644
--- a/src/include/storage/standbydefs.h
+++ b/src/include/storage/standbydefs.h
@@ -49,7 +49,7 @@ typedef struct xl_running_xacts
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 88fb396910c..d4022acd91e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -796,6 +796,7 @@ FreePageManager
FreePageSpanLeader
FromCharDateMode
FromExpr
+FullTransactionId
FuncCall
FuncCallContext
FuncCandidateList
--
2.21.0
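For anyone who wants to try the transam.h helpers from the patch above outside the tree, here is a minimal standalone sketch. It assumes stdint types as stand-ins for PostgreSQL's uint32/uint64 and hard-codes FirstNormalTransactionId as 3; the helper bodies are copied from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's own typedefs, so this compiles without the tree. */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/* Epoch in the high 32 bits, xid in the low 32 bits, as in the patch. */
typedef struct FullTransactionId
{
    uint64_t value;
} FullTransactionId;

#define EpochFromFullTransactionId(x) ((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)   ((uint32_t) (x).value)
#define U64FromFullTransactionId(x)   ((x).value)

static inline FullTransactionId
FullTransactionIdFromEpochAndXid(uint32_t epoch, TransactionId xid)
{
    FullTransactionId result;

    result.value = ((uint64_t) epoch) << 32 | xid;
    return result;
}

/* Advance by one, stepping over the special xids 0..2 after a wraparound. */
static inline void
FullTransactionIdAdvance(FullTransactionId *dest)
{
    dest->value++;
    while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
        dest->value++;
}
```

Advancing from epoch 0, xid 0xFFFFFFFF carries into the epoch and skips the special xids, so the result is epoch 1, xid 3. Nothing depends on a checkpoint having happened in between.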
On Mon, Mar 25, 2019 at 5:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
New version attached. I'd like to commit this for PG12.
Here is a follow-up sketch patch that shows FullTransactionId being
used in the transaction stack, so you can call e.g.
GetCurrentFullTransactionId(). A table access method (e.g. zheap) could
use this to avoid the need to freeze tuples later.
I suppose it's not strictly necessary, since you could use
GetCurrentTransactionId() and infer the epoch by comparing with
ReadNextFullTransactionId() (now that the epoch counting is reliable,
thanks to patch 0001, which I'm repeating here just for cfbot). But
I suppose we want to get away from that sort of thing. Thoughts?
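To make the "infer the epoch by comparing" idea concrete, here is a rough standalone sketch. The function name infer_full_xid is hypothetical and not part of either patch, and the stdint typedefs are stand-ins for the PostgreSQL types. It assumes the 32-bit xid is a normal xid that has already been assigned (so it logically precedes the next xid) and is less than one epoch old.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's own definitions, so this compiles alone. */
typedef uint32_t TransactionId;

typedef struct FullTransactionId
{
    uint64_t value;
} FullTransactionId;

#define EpochFromFullTransactionId(x) ((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)   ((uint32_t) (x).value)
#define U64FromFullTransactionId(x)   ((x).value)

static inline FullTransactionId
FullTransactionIdFromEpochAndXid(uint32_t epoch, TransactionId xid)
{
    FullTransactionId result;

    result.value = ((uint64_t) epoch) << 32 | xid;
    return result;
}

/*
 * Hypothetical helper: widen an already-assigned 32-bit xid to 64 bits,
 * given the next full xid (what ReadNextFullTransactionId() would return).
 */
static inline FullTransactionId
infer_full_xid(FullTransactionId next_full_xid, TransactionId xid)
{
    uint32_t      epoch = EpochFromFullTransactionId(next_full_xid);
    TransactionId next_xid = XidFromFullTransactionId(next_full_xid);

    /*
     * An assigned xid cannot be in the future, so if it is numerically
     * greater than the next xid it must belong to the previous epoch.
     */
    if (xid > next_xid)
    {
        assert(epoch > 0);
        epoch--;
    }
    return FullTransactionIdFromEpochAndXid(epoch, xid);
}
```

This is the same adjustment the walreceiver hunk makes with "if (nextXid < xmin) xmin_epoch--;". It is only as good as the assumption that the xid is recent, which is the sort of thing a GetCurrentFullTransactionId() would let callers stop worrying about.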
--
Thomas Munro
https://enterprisedb.com
Attachments:
0001-Track-the-next-transaction-ID-using-64-bits-v7.patch (application/x-patch)
From 1f513f48a603c663753ddb5fa1d8e62abf500db9 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 25 Mar 2019 12:29:23 +1300
Subject: [PATCH 1/2] Track the next transaction ID using 64 bits.
Instead of inferring epoch advancement from xids and checkpoints,
introduce a 64 bit FullTransactionId type and use it to track xid
generation. This fixes an unlikely bug where epoch information is
corrupted if you somehow manage to make TransactionId wrap around
more than once between checkpoints.
This commit creates some very basic infrastructure allowing for later
patches to adopt 64 bit transaction IDs in more places, potentially
including table access methods that don't need to freeze tables.
Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 8 +-
src/backend/access/transam/commit_ts.c | 4 +-
src/backend/access/transam/multixact.c | 20 +----
src/backend/access/transam/subtrans.c | 8 +-
src/backend/access/transam/twophase.c | 40 +++------
src/backend/access/transam/varsup.c | 76 ++++++++++++----
src/backend/access/transam/xact.c | 35 +-------
src/backend/access/transam/xlog.c | 113 ++++++------------------
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 5 +-
src/backend/storage/ipc/procarray.c | 34 ++-----
src/backend/storage/ipc/standby.c | 2 +-
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 20 +++--
src/include/access/transam.h | 52 ++++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 6 +-
src/include/storage/standby.h | 2 +-
src/include/storage/standbydefs.h | 2 +-
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 225 insertions(+), 238 deletions(-)
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index bfad284be08..33060f30429 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromFullTransactionId(checkpoint->nextFullXid),
+ XidFromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index aa089d83fa8..3bd55fbdd33 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -749,12 +749,12 @@ ZeroCLOGPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -792,7 +792,7 @@ TrimCLOG(void)
* but makes no WAL entry). Let's just be safe. (We need not worry about
* pages beyond the current one, since those will be zeroed when first
* used. For the same reason, there is no need to do anything when
- * nextXid is exactly at a page boundary; and it's likely that the
+ * nextFullXid is exactly at a page boundary; and it's likely that the
* "current" page doesn't exist yet in that case.)
*/
if (TransactionIdToPgIndex(xid) != 0)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 9d7f15935dc..8162f884bd1 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -553,7 +553,7 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCommitTs(void)
@@ -643,7 +643,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index c3998719405..763b9997071 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3267,9 +3267,9 @@ multixact_redo(XLogReaderState *record)
xlrec->moff + xlrec->nmembers);
/*
- * Make sure nextXid is beyond any XID mentioned in the record. This
- * should be unnecessary, since any XID found here ought to have other
- * evidence in the XLOG, but let's be safe.
+ * Make sure nextFullXid is beyond any XID mentioned in the record.
+ * This should be unnecessary, since any XID found here ought to have
+ * other evidence in the XLOG, but let's be safe.
*/
max_xid = XLogRecGetXid(record);
for (i = 0; i < xlrec->nmembers; i++)
@@ -3278,19 +3278,7 @@ multixact_redo(XLogReaderState *record)
max_xid = xlrec->members[i].xid;
}
- /*
- * We don't expect anyone else to modify nextXid, hence startup
- * process doesn't need to hold a lock while checking this. We still
- * acquire the lock to modify it, though.
- */
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index cbc61294eb9..e667fd02385 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -241,14 +241,15 @@ ZeroSUBTRANSPage(int pageno)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*
- * oldestActiveXID is the oldest XID of any prepared transaction, or nextXid
+ * oldestActiveXID is the oldest XID of any prepared transaction, or nextFullXid
* if there are none.
*/
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 21986e48fe2..11992f7447d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1878,16 +1878,16 @@ restoreTwoPhaseData(void)
*
* Scan the shared memory entries of TwoPhaseState and determine the range
* of valid XIDs present. This is run during database startup, after we
- * have completed reading WAL. ShmemVariableCache->nextXid has been set to
+ * have completed reading WAL. ShmemVariableCache->nextFullXid has been set to
* one more than the highest XID for which evidence exists in WAL.
*
- * We throw away any prepared xacts with main XID beyond nextXid --- if any
+ * We throw away any prepared xacts with main XID beyond nextFullXid --- if any
* are present, it suggests that the DBA has done a PITR recovery to an
* earlier point in time without cleaning out pg_twophase. We dare not
* try to recover such prepared xacts since they likely depend on database
* state that doesn't exist now.
*
- * However, we will advance nextXid beyond any subxact XIDs belonging to
+ * However, we will advance nextFullXid beyond any subxact XIDs belonging to
* valid prepared xacts. We need to do this since subxact commit doesn't
* write a WAL entry, and so there might be no evidence in WAL of those
* subxact XIDs.
@@ -1897,7 +1897,7 @@ restoreTwoPhaseData(void)
* backup should be rolled in.
*
* Our other responsibility is to determine and return the oldest valid XID
- * among the prepared xacts (if none, return ShmemVariableCache->nextXid).
+ * among the prepared xacts (if none, return ShmemVariableCache->nextFullXid).
* This is needed to synchronize pg_subtrans startup properly.
*
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
@@ -1907,7 +1907,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2123,7 +2124,7 @@ RecoverPreparedTransactions(void)
*
* If setParent is true, set up subtransaction parent linkages.
*
- * If setNextXid is true, set ShmemVariableCache->nextXid to the newest
+ * If setNextXid is true, set ShmemVariableCache->nextFullXid to the newest
* value scanned.
*/
static char *
@@ -2132,7 +2133,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2212,7 +2214,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
/*
* Examine subtransaction XIDs ... they should all follow main XID, and
- * they may force us to advance nextXid.
+ * they may force us to advance nextFullXid.
*/
subxids = (TransactionId *) (buf +
MAXALIGN(sizeof(TwoPhaseFileHeader)) +
@@ -2223,25 +2225,9 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
- /* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- /*
- * We don't expect anyone else to modify nextXid, hence we don't
- * need to hold a lock while examining it. We still acquire the
- * lock to modify it, though, so we recheck.
- */
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
- }
+ /* update nextFullXid if needed */
+ if (setNextXid)
+ AdvanceNextFullTransactionIdPastXid(subxid);
if (setParent)
SubTransSetParent(subxid, xid);
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index fe94fdaf049..db141d286dc 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
- * have successfully completed ExtendCLOG() --- if that routine fails, we
- * want the next incoming transaction to try it again. We cannot assign
- * more XIDs until there is CLOG space for them.
+ * Now advance the nextFullXid counter. This must not happen until after
+ * we have successfully completed ExtendCLOG() --- if that routine fails,
+ * we want the next incoming transaction to try it again. We cannot
+ * assign more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -236,18 +236,64 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * This must only be called during recovery or from two-phase start-up code.
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid)
+{
+ FullTransactionId newNextFullXid;
+ TransactionId next_xid;
+ uint32 epoch;
+
+ /*
+ * It is safe to read nextFullXid without a lock, because this is only
+ * called from the startup process or single process mode, meaning that no
+ * other process can modify it.
+ */
+ Assert(AmStartupProcess() || !IsUnderPostmaster);
+
+ /* Fast return if this isn't an xid high enough to move the needle. */
+ next_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (!TransactionIdFollowsOrEquals(xid, next_xid))
+ return;
+
+ /*
+ * Compute the FullTransactionId that comes after the given xid. To do
+ * this, we preserve the existing epoch, but detect when we've wrapped
+ * into a new epoch. This is necessary because WAL records and 2PC state
+ * currently contain 32 bit xids. The wrap logic is safe in those cases
+ * because the span of active xids cannot exceed one epoch at any given
+ * point in the WAL stream.
+ */
+ TransactionIdAdvance(xid);
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (unlikely(xid < next_xid))
+ ++epoch;
+ newNextFullXid = FullTransactionIdFromEpochAndXid(epoch, xid);
+
+ /*
+ * We still need to take a lock to modify the value against concurrent
+ * readers.
+ */
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextFullXid = newNextFullXid;
+ LWLockRelease(XidGenLock);
}
/*
@@ -351,7 +397,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -427,7 +473,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c3214d4f4d8..9b100050597 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5636,21 +5636,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
max_xid = TransactionIdLatest(xid, parsed->nsubxacts, parsed->subxacts);
- /*
- * Make sure nextXid is beyond any XID mentioned in the record.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while checking this. We still acquire the lock to modify
- * it, though.
- */
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ /* Make sure nextFullXid is beyond any XID mentioned in the record. */
+ AdvanceNextFullTransactionIdPastXid(max_xid);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5792,25 +5779,11 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
Assert(TransactionIdIsValid(xid));
- /*
- * Make sure nextXid is beyond any XID mentioned in the record.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while checking this. We still acquire the lock to modify
- * it, though.
- */
+ /* Make sure nextFullXid is beyond any XID mentioned in the record. */
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ad12ebc4269..19d7911ec50 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -590,8 +590,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextFullXid of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5115,8 +5114,8 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid =
+ FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5129,7 +5128,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6557,8 +6556,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6573,12 +6572,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6587,8 +6586,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -6859,7 +6857,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -6889,9 +6887,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7061,20 +7059,10 @@ StartupXLOG(void)
error_context_stack = &errcallback;
/*
- * ShmemVariableCache->nextXid must be beyond record's xid.
- *
- * We don't expect anyone else to modify nextXid, hence we
- * don't need to hold a lock while examining it. We still
- * acquire the lock to modify it, though.
+ * ShmemVariableCache->nextFullXid must be beyond record's
+ * xid.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(record->xl_xid);
/*
* Before replaying this record, check if this record causes
@@ -7654,7 +7642,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8247,41 +8235,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8701,7 +8654,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8711,11 +8664,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8859,8 +8807,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9622,7 +9569,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9676,9 +9623,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9690,13 +9637,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9717,9 +9662,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9749,13 +9694,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index d9959e568a8..f32cf91ffb3 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1160,6 +1160,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1238,7 +1239,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 4bb98ef352a..21f5c868f18 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1912,10 +1912,13 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 nextEpoch;
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ nextEpoch = EpochFromFullTransactionId(nextFullXid);
if (xid <= nextXid)
{
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cf93357997c..76d6833b017 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -664,7 +664,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
{
TransactionId *xids;
int nxids;
- TransactionId nextXid;
int i;
Assert(standbyState >= STANDBY_INITIALIZED);
@@ -881,23 +880,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
LWLockRelease(ProcArrayLock);
- /*
- * ShmemVariableCache->nextXid must be beyond any observed xid.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while examining it. We still acquire the lock to modify
- * it, though.
- */
- nextXid = latestObservedXid;
- TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid. */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid);
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(TransactionIdIsValid(XidFromFullTransactionId(ShmemVariableCache->nextFullXid)));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -2001,7 +1987,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2093,7 +2079,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2138,7 +2124,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2206,7 +2192,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3266,12 +3252,10 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
*/
latestObservedXid = xid;
- /* ShmemVariableCache->nextXid must be beyond any observed xid */
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid);
next_expected_xid = latestObservedXid;
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 4d10e57a803..cd56dca3aef 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -867,7 +867,7 @@ standby_redo(XLogReaderState *record)
* up from a checkpoint and are immediately at our starting point, we
* unconditionally move to STANDBY_INITIALIZED. After this point we
* must do 4 things:
- * * move shared nextXid forwards as we see new xids
+ * * move shared nextFullXid forwards as we see new xids
* * extend the clog and subtrans with each new xid
* * keep track of uncommitted known assigned xids
* * keep track of uncommitted AccessExclusiveLocks
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 92beaab5663..4e4d04bae37 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3410,7 +3410,7 @@ ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's either a rollback or a read-only transaction
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 791bbf84024..daea21e84dd 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_last_xid;
+ FullTransactionId now_xid;
- GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+ now_xid = ReadNextFullTransactionId();
+ now_epoch_last_xid = XidFromFullTransactionId(now_xid);
+ now_epoch = EpochFromFullTransactionId(now_xid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+ if (xid_with_epoch > U64FromFullTransactionId(now_xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index e6742dc24b8..e675c33c547 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 1aa1db218ac..9a17d0f9c0f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index a7b25ffe1cd..67fc646befb 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -430,11 +430,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -704,8 +708,8 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +790,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +883,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +893,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 78997e533e7..5715f0c5217 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Not all values represent valid normal XIDs.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+FullTransactionIdFromEpochAndXid(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,15 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId variable, stepping over special XIDs */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ dest->value++;
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -125,12 +160,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next full XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -156,6 +191,7 @@ typedef struct VariableCacheData
typedef VariableCacheData *VariableCache;
+
/* ----------------
* extern declarations
* ----------------
@@ -187,11 +223,21 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+/*
+ * For callers that just need the XID part of the next transaction ID.
+ */
+static inline TransactionId
+ReadNewTransactionId(void)
+{
+ return XidFromFullTransactionId(ReadNextFullTransactionId());
+}
+
#endif /* TRAMSAM_H */
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index bd74e7aaa03..eb6c44649dc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -310,7 +310,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index a3910a5f997..ff98d9e91a8 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,13 +15,14 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
/* Version identifier for this pg_control format */
-#define PG_CONTROL_VERSION 1200
+#define PG_CONTROL_VERSION 1201
/* Nonce key length, see below */
#define MOCK_AUTH_NONCE_LEN 32
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 346a3108bc1..23612435148 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
diff --git a/src/include/storage/standbydefs.h b/src/include/storage/standbydefs.h
index cc8ccd5d369..01d2db6ac6e 100644
--- a/src/include/storage/standbydefs.h
+++ b/src/include/storage/standbydefs.h
@@ -49,7 +49,7 @@ typedef struct xl_running_xacts
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 88fb396910c..d4022acd91e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -796,6 +796,7 @@ FreePageManager
FreePageSpanLeader
FromCharDateMode
FromExpr
+FullTransactionId
FuncCall
FuncCallContext
FuncCandidateList
--
2.21.0
0002-Use-64-bit-transaction-IDs-for-the-transaction-st-v7.patch (application/x-patch)
From f9845bb8bf9a501c7c0d19e8faf394e8a42cc6d8 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 25 Mar 2019 22:45:22 +1300
Subject: [PATCH 2/2] Use 64 bit transaction IDs for the transaction stack.
Provide GetTopFullTransactionId() and GetCurrentFullTransactionId().
The intended users of these interfaces are table access methods that
use xids for visibility checks but don't want to have to "freeze"
tuples.
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/transam/varsup.c | 24 +++--
src/backend/access/transam/xact.c | 160 +++++++++++++++++++---------
src/include/access/transam.h | 4 +
src/include/access/xact.h | 5 +
4 files changed, 136 insertions(+), 57 deletions(-)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index db141d286dc..0286eb99b60 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -35,7 +35,8 @@ VariableCache ShmemVariableCache = NULL;
/*
- * Allocate the next XID for a new transaction or subtransaction.
+ * Allocate the next FullTransactionId for a new transaction or
+ * subtransaction.
*
* The new XID is also stored into MyPgXact before returning.
*
@@ -44,9 +45,10 @@ VariableCache ShmemVariableCache = NULL;
* does something. So it is safe to do a database lookup if we want to
* issue a warning about XID wrap.
*/
-TransactionId
-GetNewTransactionId(bool isSubXact)
+FullTransactionId
+GetNewFullTransactionId(bool isSubXact)
{
+ FullTransactionId full_xid;
TransactionId xid;
/*
@@ -64,7 +66,7 @@ GetNewTransactionId(bool isSubXact)
{
Assert(!isSubXact);
MyPgXact->xid = BootstrapTransactionId;
- return BootstrapTransactionId;
+ return FullTransactionIdFromEpochAndXid(0, BootstrapTransactionId);
}
/* safety check, we should never get this far in a HS standby */
@@ -73,7 +75,8 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ full_xid = ShmemVariableCache->nextFullXid;
+ xid = XidFromFullTransactionId(full_xid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -232,7 +235,16 @@ GetNewTransactionId(bool isSubXact)
LWLockRelease(XidGenLock);
- return xid;
+ return full_xid;
+}
+
+/*
+ * Allocate the next FullTransactionId, but return only the XID part.
+ */
+TransactionId
+GetNewTransactionId(bool isSubXact)
+{
+ return XidFromFullTransactionId(GetNewFullTransactionId(isSubXact));
}
/*
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 9b100050597..fa5787c67f0 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -105,7 +105,7 @@ int synchronous_commit = SYNCHRONOUS_COMMIT_ON;
* The XIDs are stored sorted in numerical order (not logical order) to make
* lookups as fast as possible.
*/
-TransactionId XactTopTransactionId = InvalidTransactionId;
+FullTransactionId XactTopFullTransactionId = {InvalidTransactionId};
int nParallelCurrentXids = 0;
TransactionId *ParallelCurrentXids;
@@ -171,7 +171,7 @@ typedef enum TBlockState
*/
typedef struct TransactionStateData
{
- TransactionId transactionId; /* my XID, or Invalid if none */
+ FullTransactionId fullTransactionId; /* my FullTransactionId */
SubTransactionId subTransactionId; /* my subxact ID */
char *name; /* savepoint name, if any */
int savepointLevel; /* savepoint level */
@@ -372,9 +372,9 @@ IsAbortedTransactionBlockState(void)
TransactionId
GetTopTransactionId(void)
{
- if (!TransactionIdIsValid(XactTopTransactionId))
+ if (!FullTransactionIdIsValid(XactTopFullTransactionId))
AssignTransactionId(&TopTransactionStateData);
- return XactTopTransactionId;
+ return XidFromFullTransactionId(XactTopFullTransactionId);
}
/*
@@ -387,7 +387,7 @@ GetTopTransactionId(void)
TransactionId
GetTopTransactionIdIfAny(void)
{
- return XactTopTransactionId;
+ return XidFromFullTransactionId(XactTopFullTransactionId);
}
/*
@@ -400,11 +400,7 @@ GetTopTransactionIdIfAny(void)
TransactionId
GetCurrentTransactionId(void)
{
- TransactionState s = CurrentTransactionState;
-
- if (!TransactionIdIsValid(s->transactionId))
- AssignTransactionId(s);
- return s->transactionId;
+ return XidFromFullTransactionId(GetCurrentFullTransactionId());
}
/*
@@ -417,7 +413,66 @@ GetCurrentTransactionId(void)
TransactionId
GetCurrentTransactionIdIfAny(void)
{
- return CurrentTransactionState->transactionId;
+ return XidFromFullTransactionId(GetCurrentFullTransactionIdIfAny());
+}
+
+/*
+ * GetTopFullTransactionId
+ *
+ * This will return the FullTransactionId of the main transaction, assigning
+ * one if it's not yet set. Be careful to call this only inside a valid xact.
+ */
+FullTransactionId
+GetTopFullTransactionId(void)
+{
+ if (!FullTransactionIdIsValid(XactTopFullTransactionId))
+ AssignTransactionId(&TopTransactionStateData);
+ return XactTopFullTransactionId;
+}
+
+/*
+ * GetTopFullTransactionIdIfAny
+ *
+ * This will return the FullTransactionId of the main transaction, if one is
+ * assigned. It will return a value for which FullTransactionIdIsValid()
+ * returns false if we are not currently inside a transaction, or inside a
+ * transaction that hasn't yet been assigned an XID.
+ */
+FullTransactionId
+GetTopFullTransactionIdIfAny(void)
+{
+ return XactTopFullTransactionId;
+}
+
+/*
+ * GetCurrentFullTransactionId
+ *
+ * This will return the FullTransactionId of the current transaction (main or
+ * sub transaction), assigning one if it's not yet set. Be careful to call
+ * this only inside a valid xact.
+ */
+FullTransactionId
+GetCurrentFullTransactionId(void)
+{
+ TransactionState s = CurrentTransactionState;
+
+ if (!FullTransactionIdIsValid(s->fullTransactionId))
+ AssignTransactionId(s);
+ return s->fullTransactionId;
+}
+
+/*
+ * GetCurrentFullTransactionIdIfAny
+ *
+ * This will return the FullTransactionId of the current sub xact, if one is
+ * assigned. It will return a value for which FullTransactionIdIsValid()
+ * returns false if we are not currently inside a transaction, or inside a
+ * transaction that hasn't been assigned an XID yet.
+ */
+FullTransactionId
+GetCurrentFullTransactionIdIfAny(void)
+{
+ return CurrentTransactionState->fullTransactionId;
}
/*
@@ -428,7 +483,7 @@ GetCurrentTransactionIdIfAny(void)
void
MarkCurrentTransactionIdLoggedIfAny(void)
{
- if (TransactionIdIsValid(CurrentTransactionState->transactionId))
+ if (FullTransactionIdIsValid(CurrentTransactionState->fullTransactionId))
CurrentTransactionState->didLogXid = true;
}
@@ -477,7 +532,7 @@ AssignTransactionId(TransactionState s)
bool log_unknown_top = false;
/* Assert that caller didn't screw up */
- Assert(!TransactionIdIsValid(s->transactionId));
+ Assert(!FullTransactionIdIsValid(s->fullTransactionId));
Assert(s->state == TRANS_INPROGRESS);
/*
@@ -493,14 +548,14 @@ AssignTransactionId(TransactionState s)
* if we're at the bottom of a huge stack of subtransactions none of which
* have XIDs yet.
*/
- if (isSubXact && !TransactionIdIsValid(s->parent->transactionId))
+ if (isSubXact && !FullTransactionIdIsValid(s->parent->fullTransactionId))
{
TransactionState p = s->parent;
TransactionState *parents;
size_t parentOffset = 0;
parents = palloc(sizeof(TransactionState) * s->nestingLevel);
- while (p != NULL && !TransactionIdIsValid(p->transactionId))
+ while (p != NULL && !FullTransactionIdIsValid(p->fullTransactionId))
{
parents[parentOffset++] = p;
p = p->parent;
@@ -538,19 +593,20 @@ AssignTransactionId(TransactionState s)
* PG_PROC, the subtrans entry is needed to ensure that other backends see
* the Xid as "running". See GetNewTransactionId.
*/
- s->transactionId = GetNewTransactionId(isSubXact);
+ s->fullTransactionId = GetNewFullTransactionId(isSubXact);
if (!isSubXact)
- XactTopTransactionId = s->transactionId;
+ XactTopFullTransactionId = s->fullTransactionId;
if (isSubXact)
- SubTransSetParent(s->transactionId, s->parent->transactionId);
+ SubTransSetParent(XidFromFullTransactionId(s->fullTransactionId),
+ XidFromFullTransactionId(s->parent->fullTransactionId));
/*
* If it's a top-level transaction, the predicate locking system needs to
* be told about it too.
*/
if (!isSubXact)
- RegisterPredicateLockingXid(s->transactionId);
+ RegisterPredicateLockingXid(XidFromFullTransactionId(s->fullTransactionId));
/*
* Acquire lock on the transaction XID. (We assume this cannot block.) We
@@ -560,7 +616,7 @@ AssignTransactionId(TransactionState s)
currentOwner = CurrentResourceOwner;
CurrentResourceOwner = s->curTransactionOwner;
- XactLockTableInsert(s->transactionId);
+ XactLockTableInsert(XidFromFullTransactionId(s->fullTransactionId));
CurrentResourceOwner = currentOwner;
@@ -584,7 +640,7 @@ AssignTransactionId(TransactionState s)
*/
if (isSubXact && XLogStandbyInfoActive())
{
- unreportedXids[nUnreportedXids] = s->transactionId;
+ unreportedXids[nUnreportedXids] = XidFromFullTransactionId(s->fullTransactionId);
nUnreportedXids++;
/*
@@ -832,9 +888,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
if (s->state == TRANS_ABORT)
continue;
- if (!TransactionIdIsValid(s->transactionId))
+ if (!FullTransactionIdIsValid(s->fullTransactionId))
continue; /* it can't have any child XIDs either */
- if (TransactionIdEquals(xid, s->transactionId))
+ if (TransactionIdEquals(xid, XidFromFullTransactionId(s->fullTransactionId)))
return true;
/* As the childXids array is ordered, we can use binary search */
low = 0;
@@ -1495,7 +1551,7 @@ AtSubCommit_childXids(void)
* all XIDs already in the array belong to subtransactions started and
* subcommitted before us, so their XIDs must precede ours.
*/
- s->parent->childXids[s->parent->nChildXids] = s->transactionId;
+ s->parent->childXids[s->parent->nChildXids] = XidFromFullTransactionId(s->fullTransactionId);
if (s->nChildXids > 0)
memcpy(&s->parent->childXids[s->parent->nChildXids + 1],
@@ -1809,7 +1865,7 @@ StartTransaction(void)
s = &TopTransactionStateData;
CurrentTransactionState = s;
- Assert(XactTopTransactionId == InvalidTransactionId);
+ Assert(XidFromFullTransactionId(XactTopFullTransactionId) == InvalidTransactionId);
/* check the current transaction state */
Assert(s->state == TRANS_DEFAULT);
@@ -1821,7 +1877,7 @@ StartTransaction(void)
* flags are fetched below.
*/
s->state = TRANS_START;
- s->transactionId = InvalidTransactionId; /* until assigned */
+ s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
/*
* initialize current transaction state fields
@@ -2165,7 +2221,7 @@ CommitTransaction(void)
AtCommit_Memory();
- s->transactionId = InvalidTransactionId;
+ s->fullTransactionId = InvalidFullTransactionId;
s->subTransactionId = InvalidSubTransactionId;
s->nestingLevel = 0;
s->gucNestLevel = 0;
@@ -2173,7 +2229,7 @@ CommitTransaction(void)
s->nChildXids = 0;
s->maxChildXids = 0;
- XactTopTransactionId = InvalidTransactionId;
+ XactTopFullTransactionId = InvalidFullTransactionId;
nParallelCurrentXids = 0;
/*
@@ -2448,7 +2504,7 @@ PrepareTransaction(void)
AtCommit_Memory();
- s->transactionId = InvalidTransactionId;
+ s->fullTransactionId = InvalidFullTransactionId;
s->subTransactionId = InvalidSubTransactionId;
s->nestingLevel = 0;
s->gucNestLevel = 0;
@@ -2456,7 +2512,7 @@ PrepareTransaction(void)
s->nChildXids = 0;
s->maxChildXids = 0;
- XactTopTransactionId = InvalidTransactionId;
+ XactTopFullTransactionId = InvalidFullTransactionId;
nParallelCurrentXids = 0;
/*
@@ -2686,7 +2742,7 @@ CleanupTransaction(void)
AtCleanup_Memory(); /* and transaction memory */
- s->transactionId = InvalidTransactionId;
+ s->fullTransactionId = InvalidFullTransactionId;
s->subTransactionId = InvalidSubTransactionId;
s->nestingLevel = 0;
s->gucNestLevel = 0;
@@ -2695,7 +2751,7 @@ CleanupTransaction(void)
s->maxChildXids = 0;
s->parallelModeLevel = 0;
- XactTopTransactionId = InvalidTransactionId;
+ XactTopFullTransactionId = InvalidFullTransactionId;
nParallelCurrentXids = 0;
/*
@@ -4693,7 +4749,7 @@ CommitSubTransaction(void)
*/
/* Post-commit cleanup */
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
AtSubCommit_childXids();
AfterTriggerEndSubXact(true);
AtSubCommit_Portals(s->subTransactionId,
@@ -4718,8 +4774,8 @@ CommitSubTransaction(void)
* The only lock we actually release here is the subtransaction XID lock.
*/
CurrentResourceOwner = s->curTransactionOwner;
- if (TransactionIdIsValid(s->transactionId))
- XactLockTableDelete(s->transactionId);
+ if (FullTransactionIdIsValid(s->fullTransactionId))
+ XactLockTableDelete(XidFromFullTransactionId(s->fullTransactionId));
/*
* Other locks should get transferred to their parent resource owner.
@@ -4872,7 +4928,7 @@ AbortSubTransaction(void)
(void) RecordTransactionAbort(true);
/* Post-abort cleanup */
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
AtSubAbort_childXids();
CallSubXactCallbacks(SUBXACT_EVENT_ABORT_SUB, s->subTransactionId,
@@ -4985,7 +5041,7 @@ PushTransaction(void)
* We can now stack a minimally valid subtransaction without fear of
* failure.
*/
- s->transactionId = InvalidTransactionId; /* until assigned */
+ s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
s->subTransactionId = currentSubTransactionId;
s->parent = p;
s->nestingLevel = p->nestingLevel + 1;
@@ -5052,12 +5108,12 @@ Size
EstimateTransactionStateSpace(void)
{
TransactionState s;
- Size nxids = 6; /* iso level, deferrable, top & current XID,
+ Size nxids = 8; /* iso level, deferrable, top & current XID,
* command counter, XID count */
for (s = CurrentTransactionState; s != NULL; s = s->parent)
{
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
nxids = add_size(nxids, 1);
nxids = add_size(nxids, s->nChildXids);
}
@@ -5073,7 +5129,7 @@ EstimateTransactionStateSpace(void)
*
* We need to save and restore XactDeferrable, XactIsoLevel, and the XIDs
* associated with this transaction. The first eight bytes of the result
- * contain XactDeferrable and XactIsoLevel; the next twelve bytes contain the
+ * contain XactDeferrable and XactIsoLevel; the next 20 bytes contain the
* XID of the top-level transaction, the XID of the current transaction
* (or, in each case, InvalidTransactionId if none), and the current command
* counter. After that, the next 4 bytes contain a count of how many
@@ -5093,8 +5149,10 @@ SerializeTransactionState(Size maxsize, char *start_address)
result[c++] = (TransactionId) XactIsoLevel;
result[c++] = (TransactionId) XactDeferrable;
- result[c++] = XactTopTransactionId;
- result[c++] = CurrentTransactionState->transactionId;
+ result[c++] = EpochFromFullTransactionId(XactTopFullTransactionId);
+ result[c++] = XidFromFullTransactionId(XactTopFullTransactionId);
+ result[c++] = EpochFromFullTransactionId(CurrentTransactionState->fullTransactionId);
+ result[c++] = XidFromFullTransactionId(CurrentTransactionState->fullTransactionId);
result[c++] = (TransactionId) currentCommandId;
Assert(maxsize >= c * sizeof(TransactionId));
@@ -5118,7 +5176,7 @@ SerializeTransactionState(Size maxsize, char *start_address)
*/
for (s = CurrentTransactionState; s != NULL; s = s->parent)
{
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
nxids = add_size(nxids, 1);
nxids = add_size(nxids, s->nChildXids);
}
@@ -5128,8 +5186,8 @@ SerializeTransactionState(Size maxsize, char *start_address)
workspace = palloc(nxids * sizeof(TransactionId));
for (s = CurrentTransactionState; s != NULL; s = s->parent)
{
- if (TransactionIdIsValid(s->transactionId))
- workspace[i++] = s->transactionId;
+ if (FullTransactionIdIsValid(s->fullTransactionId))
+ workspace[i++] = XidFromFullTransactionId(s->fullTransactionId);
memcpy(&workspace[i], s->childXids,
s->nChildXids * sizeof(TransactionId));
i += s->nChildXids;
@@ -5159,11 +5217,11 @@ StartParallelWorkerTransaction(char *tstatespace)
XactIsoLevel = (int) tstate[0];
XactDeferrable = (bool) tstate[1];
- XactTopTransactionId = tstate[2];
- CurrentTransactionState->transactionId = tstate[3];
- currentCommandId = tstate[4];
- nParallelCurrentXids = (int) tstate[5];
- ParallelCurrentXids = &tstate[6];
+ XactTopFullTransactionId = FullTransactionIdFromEpochAndXid(tstate[2], tstate[3]);
+ CurrentTransactionState->fullTransactionId = FullTransactionIdFromEpochAndXid(tstate[4], tstate[5]);
+ currentCommandId = tstate[6];
+ nParallelCurrentXids = (int) tstate[7];
+ ParallelCurrentXids = &tstate[8];
CurrentTransactionState->blockState = TBLOCK_PARALLEL_INPROGRESS;
}
@@ -5222,7 +5280,7 @@ ShowTransactionStateRec(const char *str, TransactionState s)
PointerIsValid(s->name) ? s->name : "unnamed",
BlockStateAsString(s->blockState),
TransStateAsString(s->state),
- (unsigned int) s->transactionId,
+ (unsigned int) XidFromFullTransactionId(s->fullTransactionId),
(unsigned int) s->subTransactionId,
(unsigned int) currentCommandId,
currentCommandIdUsed ? " (used)" : "",
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 5715f0c5217..614565a47b8 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -49,6 +49,9 @@
#define U64FromFullTransactionId(x) ((x).value)
#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
#define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
+#define FullTransactionIdIsValid(x) TransactionIdIsValid(XidFromFullTransactionId(x))
+#define InvalidFullTransactionId \
+ FullTransactionIdFromEpochAndXid(0, InvalidTransactionId)
/*
* A 64 bit value that contains an epoch and a TransactionId. This is
@@ -223,6 +226,7 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
+extern FullTransactionId GetNewFullTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index e8579dcd478..b550343c4db 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -14,6 +14,7 @@
#ifndef XACT_H
#define XACT_H
+#include "access/transam.h"
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
@@ -355,6 +356,10 @@ extern TransactionId GetCurrentTransactionId(void);
extern TransactionId GetCurrentTransactionIdIfAny(void);
extern TransactionId GetStableLatestTransactionId(void);
extern SubTransactionId GetCurrentSubTransactionId(void);
+extern FullTransactionId GetTopFullTransactionId(void);
+extern FullTransactionId GetTopFullTransactionIdIfAny(void);
+extern FullTransactionId GetCurrentFullTransactionId(void);
+extern FullTransactionId GetCurrentFullTransactionIdIfAny(void);
extern void MarkCurrentTransactionIdLoggedIfAny(void);
extern bool SubTransactionIsActive(SubTransactionId subxid);
extern CommandId GetCurrentCommandId(bool used);
--
2.21.0
On 25/03/2019 12:49, Thomas Munro wrote:
On Mon, Mar 25, 2019 at 5:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
New version attached. I'd like to commit this for PG12.
Here is a follow-up sketch patch that shows FullTransactionId being
used in the transaction stack, so you can call eg
GetCurrentFullTransactionId(). A table access method could use this
to avoid the need to freeze stuff later (eg zheap).
I suppose it's not strictly necessary, since you could use
GetCurrentTransactionId() and infer the epoch by comparing with
ReadNextFullTransactionId() (now that the epoch counting is reliable,
due to patch 0001 which I'm repeating again here just for cfbot). But
I suppose we want to get away from that sort of thing. Thoughts?
Looks good.
I started to write a patch to use XID & epoch in dealing with GiST page
deletions [1], and I really could've used an epoch to go with
RecentGlobalXmin. I presume that would be pretty straightforward to have
with this, too.
[1]: /messages/by-id/5f7ed675-d1fc-66ef-f316-645080ff9625@iki.fi
- Heikki
On Tue, Mar 26, 2019 at 3:23 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Looks good.
I started to write a patch to use XID & epoch in dealing with GiST page
deletions [1], and I really could've used an epoch to go with
RecentGlobalXmin. I presume that would be pretty straightforward to have
with this, too.
Yeah. A simple way to compute that would be to use FullTransactionId
in PGXACT, so that GetSnapshotData() could find the minimum value and
put that into RecentGlobalFullXmin, and set RecentGlobalXmin with the
truncated version (or perhaps #define it as
(XidFromFullTransactionId(RecentGlobalFullXmin))). Increasing the
number of cachelines that GetSnapshotData() touches may not be
popular, though. I think you could probably reclaim that space by
using a more compact representation of vacuumFlags, overflowed,
delayChkpt, nxids (it's funny, the comment says "as tightly as
possible", which clearly isn't the case). You'd probably also need
atomic 64 bit reads for the FullTransactionIds, which would be ugly on
ancient systems but otherwise no problem.
Instead of all that you might want to just infer the epoch instead.
I'm not sure of the exact logic required for that off-hand. You'd
probably want nextFullXid as a reference to compute the epoch, but
you'd either need to acquire a lock to read it while you already hold
ProcArrayLock (!), or read it atomically after releasing ProcArrayLock
... but then a Deathstation 9000 might allocate 8 billion more xids
with all the necessary auto-vacuuming to allow that before scheduling
you back onto the CPU. Admittedly, I haven't thought about this very
deeply at all, there may be a simple and correct way to do it.
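To make the "infer the epoch" idea concrete, here is a minimal sketch; it is not code from either patch. It assumes the caller already has a consistent copy of nextFullXid and that the narrow xid is no newer than it and fewer than 2^32 xids behind it, which is exactly the assumption the Deathstation 9000 scenario would break. The helper name widen_xid is hypothetical, the type and accessors only mirror the shape of those in the patch, and special xids (below FirstNormalTransactionId) are ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Struct wrapper mirroring the patch's FullTransactionId: epoch in the high 32 bits. */
typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

#define EpochFromFullTransactionId(x)	((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)		((uint32_t) (x).value)

static inline FullTransactionId
FullTransactionIdFromEpochAndXid(uint32_t epoch, TransactionId xid)
{
	FullTransactionId result;

	result.value = (((uint64_t) epoch) << 32) | xid;
	return result;
}

/*
 * Hypothetical helper: widen a 32-bit xid using next_full_xid as the
 * reference point.  Only valid if xid is not newer than next_full_xid and
 * is less than one full wraparound behind it.
 */
static FullTransactionId
widen_xid(TransactionId xid, FullTransactionId next_full_xid)
{
	uint32_t	epoch = EpochFromFullTransactionId(next_full_xid);
	TransactionId next_xid = XidFromFullTransactionId(next_full_xid);

	/*
	 * An xid that is numerically greater than the next xid cannot have been
	 * allocated in the current epoch, so it must belong to the previous one.
	 */
	if (xid > next_xid)
	{
		assert(epoch > 0);
		epoch--;
	}

	return FullTransactionIdFromEpochAndXid(epoch, xid);
}
```

For example, with nextFullXid at epoch 3, xid 100, a narrow xid of 50 widens to epoch 3, while 4000000000 must have been allocated before the last wrap and widens to epoch 2. The result is only as trustworthy as the reference value, which is why the locking question above matters.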
--
Thomas Munro
https://enterprisedb.com
On Tue, Mar 26, 2019 at 12:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
... I think you could probably reclaim that space by
using a more compact representation of vacuumFlags, overflowed,
delayChkpt, nxids (it's funny, the comment says "as tightly as
possible", which clearly isn't the case).
Woops, I take that back. I was thinking of sizeof(bool) == 4, but
it's usually 1, so you could probably only squeeze a couple of bytes
out by moving those flags. allPgXact elements are currently 12 bytes
apart on this system. It's possible that expanding it to 16 bytes
would be OK, not sure, and I see there was a whole thread
investigating that a couple of years back:
/messages/by-id/CAPpHfdtJY4zOEDsjad6J5AyZMqZcv6gSY9AkKpA7qN3jyQ2+1Q@mail.gmail.com
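The size arithmetic can be checked with a small sketch. The two structs below are illustrative stand-ins, not the real PGXACT definition: the field names come from the discussion above, the choice to widen only the xid field is an assumption, and the helper entries_per_cache_line is hypothetical. On common ABIs where sizeof(bool) is 1, the narrow layout typically comes out at 12 bytes and the widened one at 16.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Roughly the current layout: two 32-bit xids plus four 1-byte fields. */
typedef struct NarrowXactSketch
{
	TransactionId xid;
	TransactionId xmin;
	uint8_t		vacuumFlags;
	bool		overflowed;
	bool		delayChkpt;
	uint8_t		nxids;
} NarrowXactSketch;

/* Assumed variant: the xid widened to 64 bits, everything else unchanged. */
typedef struct WideXactSketch
{
	uint64_t	fullXid;
	TransactionId xmin;
	uint8_t		vacuumFlags;
	bool		overflowed;
	bool		delayChkpt;
	uint8_t		nxids;
} WideXactSketch;

/* Number of whole entries that fit in one cache line of the given size. */
static size_t
entries_per_cache_line(size_t entry_size, size_t cache_line_size)
{
	return cache_line_size / entry_size;
}
```

With a 64-byte cache line that is five whole 12-byte entries (with some entries straddling lines in an array) versus exactly four 16-byte entries, which is the trade-off the earlier thread was investigating.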
--
Thomas Munro
https://enterprisedb.com
On Tue, Mar 26, 2019 at 12:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Mar 26, 2019 at 3:23 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Looks good.
I did some testing and proof-reading and made a few minor changes:
* I tidied up the code that serialises transaction state. It was
already hammering round pegs into square holes, and the previous patch
made that even worse, so I added a new struct
SerializedTransactionState to do this properly.
* I open-coded Get{Current,Top}TransactionId[IfAny](), rather than
having them call the "Full" variants, so that nobody could accuse me
of adding an extra function call that might not be inlined. It's just
a couple of lines anyway.
* I kept the name GetNewTransactionId(), since it's referred to in
many places in comments etc. Previously I had called it
GetNewFullTransactionId() and had GetNewTransactionId() just call that
and truncate to 32 bits, but there wasn't much point without an
in-tree caller for the narrow version. If there is any out-of-tree
code calling this, it will now fail to compile thanks to our
non-convertible return type.
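The "non-convertible return type" in the last point comes from wrapping the 64 bit value in a struct. A minimal standalone sketch of that idea follows; it is an illustration under assumptions, not the patch itself. The accessor names match the ones used in the patch, while advance_full_xid_sketch is a hypothetical name for a helper that steps to the next xid and skips the special xids below FirstNormalTransactionId.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId	((TransactionId) 3)

/* A struct wrapper acts as a strong typedef: no implicit conversion to or from integers. */
typedef struct FullTransactionId
{
	uint64_t	value;
} FullTransactionId;

#define EpochFromFullTransactionId(x)	((uint32_t) ((x).value >> 32))
#define XidFromFullTransactionId(x)		((uint32_t) (x).value)

static inline FullTransactionId
FullTransactionIdFromEpochAndXid(uint32_t epoch, TransactionId xid)
{
	FullTransactionId result;

	result.value = (((uint64_t) epoch) << 32) | xid;
	return result;
}

/*
 * Hypothetical helper: advance to the next full xid.  Because the counter is
 * a single 64-bit value, the epoch increments by itself when the low 32 bits
 * wrap; the loop then skips the special xids 0, 1 and 2.
 */
static inline void
advance_full_xid_sketch(FullTransactionId *dest)
{
	dest->value++;
	while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
		dest->value++;
}

/*
 * Neither of these would compile, which is the point of the wrapper:
 *
 *     TransactionId     xid  = some_full_xid;
 *     FullTransactionId fxid = some_xid;
 *
 * Callers must use XidFromFullTransactionId() or
 * FullTransactionIdFromEpochAndXid() explicitly.
 */
```

Starting from epoch 0 with xid 0xFFFFFFFF, one call to advance_full_xid_sketch() lands on epoch 1, xid 3, with no dependence on when a checkpoint happens.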
These are the patches I'm planning to push tomorrow.
I still need to look into Andres's suggestion about getting rid of
epoch from various user interfaces and showing 64 bit numbers. I
should probably also find a place in the relevant README to explain
this new scheme. I will post follow-up patches for those.
--
Thomas Munro
https://enterprisedb.com
Attachments:
0001-Add-basic-infrastructure-for-64-bit-transaction-I-v8.patch (application/octet-stream)
From 3cba2a1270ac581274ab8a275765fb1ffea4b325 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Wed, 27 Mar 2019 21:58:59 +1300
Subject: [PATCH 1/2] Add basic infrastructure for 64 bit transaction IDs.
Instead of inferring epoch progress from xids and checkpoints,
introduce a 64 bit FullTransactionId type and use it to track xid
generation. This fixes an unlikely bug where the epoch is reported
incorrectly if the range of active xids wraps around more than once
between checkpoints.
The type is wrapped in a struct that we pass by value, as a form of
strong typedef preventing accidental use where TransactionId is
expected or vice versa.
The only user-visible effect of this commit is to correct the epoch
used by txid_current() and txid_status(), and visible with
pg_controldata, in those rare circumstances. It also creates some
basic infrastructure so that later patches can use 64 bit
transaction IDs in more places.
Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/rmgrdesc/xlogdesc.c | 4 +-
src/backend/access/transam/clog.c | 8 +-
src/backend/access/transam/commit_ts.c | 4 +-
src/backend/access/transam/multixact.c | 20 +----
src/backend/access/transam/subtrans.c | 8 +-
src/backend/access/transam/twophase.c | 40 +++------
src/backend/access/transam/varsup.c | 76 ++++++++++++----
src/backend/access/transam/xact.c | 35 +-------
src/backend/access/transam/xlog.c | 113 ++++++------------------
src/backend/replication/walreceiver.c | 5 +-
src/backend/replication/walsender.c | 5 +-
src/backend/storage/ipc/procarray.c | 34 ++-----
src/backend/storage/ipc/standby.c | 2 +-
src/backend/storage/lmgr/predicate.c | 2 +-
src/backend/utils/adt/txid.c | 13 ++-
src/backend/utils/misc/pg_controldata.c | 5 +-
src/bin/pg_controldata/pg_controldata.c | 5 +-
src/bin/pg_resetwal/pg_resetwal.c | 20 +++--
src/include/access/transam.h | 51 ++++++++++-
src/include/access/xlog.h | 1 -
src/include/catalog/pg_control.h | 6 +-
src/include/storage/standby.h | 2 +-
src/include/storage/standbydefs.h | 2 +-
src/tools/pgindent/typedefs.list | 1 +
24 files changed, 224 insertions(+), 238 deletions(-)
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index bfad284be08..33060f30429 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
checkpoint->ThisTimeLineID,
checkpoint->PrevTimeLineID,
checkpoint->fullPageWrites ? "true" : "false",
- checkpoint->nextXidEpoch, checkpoint->nextXid,
+ EpochFromFullTransactionId(checkpoint->nextFullXid),
+ XidFromFullTransactionId(checkpoint->nextFullXid),
checkpoint->nextOid,
checkpoint->nextMulti,
checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index aa089d83fa8..3bd55fbdd33 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -749,12 +749,12 @@ ZeroCLOGPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
void
TrimCLOG(void)
{
- TransactionId xid = ShmemVariableCache->nextXid;
+ TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
int pageno = TransactionIdToPage(xid);
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -792,7 +792,7 @@ TrimCLOG(void)
* but makes no WAL entry). Let's just be safe. (We need not worry about
* pages beyond the current one, since those will be zeroed when first
* used. For the same reason, there is no need to do anything when
- * nextXid is exactly at a page boundary; and it's likely that the
+ * nextFullXid is exactly at a page boundary; and it's likely that the
* "current" page doesn't exist yet in that case.)
*/
if (TransactionIdToPgIndex(xid) != 0)
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 9d7f15935dc..8162f884bd1 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -553,7 +553,7 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*/
void
StartupCommitTs(void)
@@ -643,7 +643,7 @@ ActivateCommitTs(void)
}
LWLockRelease(CommitTsLock);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
pageno = TransactionIdToCTsPage(xid);
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index c3998719405..763b9997071 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3267,9 +3267,9 @@ multixact_redo(XLogReaderState *record)
xlrec->moff + xlrec->nmembers);
/*
- * Make sure nextXid is beyond any XID mentioned in the record. This
- * should be unnecessary, since any XID found here ought to have other
- * evidence in the XLOG, but let's be safe.
+ * Make sure nextFullXid is beyond any XID mentioned in the record.
+ * This should be unnecessary, since any XID found here ought to have
+ * other evidence in the XLOG, but let's be safe.
*/
max_xid = XLogRecGetXid(record);
for (i = 0; i < xlrec->nmembers; i++)
@@ -3278,19 +3278,7 @@ multixact_redo(XLogReaderState *record)
max_xid = xlrec->members[i].xid;
}
- /*
- * We don't expect anyone else to modify nextXid, hence startup
- * process doesn't need to hold a lock while checking this. We still
- * acquire the lock to modify it, though.
- */
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid);
}
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index cbc61294eb9..e667fd02385 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -241,14 +241,15 @@ ZeroSUBTRANSPage(int pageno)
/*
* This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
*
- * oldestActiveXID is the oldest XID of any prepared transaction, or nextXid
+ * oldestActiveXID is the oldest XID of any prepared transaction, or nextFullXid
* if there are none.
*/
void
StartupSUBTRANS(TransactionId oldestActiveXID)
{
+ FullTransactionId nextFullXid;
int startPage;
int endPage;
@@ -261,7 +262,8 @@ StartupSUBTRANS(TransactionId oldestActiveXID)
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
startPage = TransactionIdToPage(oldestActiveXID);
- endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
+ nextFullXid = ShmemVariableCache->nextFullXid;
+ endPage = TransactionIdToPage(XidFromFullTransactionId(nextFullXid));
while (startPage != endPage)
{
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 21986e48fe2..11992f7447d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1878,16 +1878,16 @@ restoreTwoPhaseData(void)
*
* Scan the shared memory entries of TwoPhaseState and determine the range
* of valid XIDs present. This is run during database startup, after we
- * have completed reading WAL. ShmemVariableCache->nextXid has been set to
+ * have completed reading WAL. ShmemVariableCache->nextFullXid has been set to
* one more than the highest XID for which evidence exists in WAL.
*
- * We throw away any prepared xacts with main XID beyond nextXid --- if any
+ * We throw away any prepared xacts with main XID beyond nextFullXid --- if any
* are present, it suggests that the DBA has done a PITR recovery to an
* earlier point in time without cleaning out pg_twophase. We dare not
* try to recover such prepared xacts since they likely depend on database
* state that doesn't exist now.
*
- * However, we will advance nextXid beyond any subxact XIDs belonging to
+ * However, we will advance nextFullXid beyond any subxact XIDs belonging to
* valid prepared xacts. We need to do this since subxact commit doesn't
* write a WAL entry, and so there might be no evidence in WAL of those
* subxact XIDs.
@@ -1897,7 +1897,7 @@ restoreTwoPhaseData(void)
* backup should be rolled in.
*
* Our other responsibility is to determine and return the oldest valid XID
- * among the prepared xacts (if none, return ShmemVariableCache->nextXid).
+ * among the prepared xacts (if none, return ShmemVariableCache->nextFullXid).
* This is needed to synchronize pg_subtrans startup properly.
*
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
@@ -1907,7 +1907,8 @@ restoreTwoPhaseData(void)
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId result = origNextXid;
TransactionId *xids = NULL;
int nxids = 0;
@@ -2123,7 +2124,7 @@ RecoverPreparedTransactions(void)
*
* If setParent is true, set up subtransaction parent linkages.
*
- * If setNextXid is true, set ShmemVariableCache->nextXid to the newest
+ * If setNextXid is true, set ShmemVariableCache->nextFullXid to the newest
* value scanned.
*/
static char *
@@ -2132,7 +2133,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
bool fromdisk,
bool setParent, bool setNextXid)
{
- TransactionId origNextXid = ShmemVariableCache->nextXid;
+ FullTransactionId nextFullXid = ShmemVariableCache->nextFullXid;
+ TransactionId origNextXid = XidFromFullTransactionId(nextFullXid);
TransactionId *subxids;
char *buf;
TwoPhaseFileHeader *hdr;
@@ -2212,7 +2214,7 @@ ProcessTwoPhaseBuffer(TransactionId xid,
/*
* Examine subtransaction XIDs ... they should all follow main XID, and
- * they may force us to advance nextXid.
+ * they may force us to advance nextFullXid.
*/
subxids = (TransactionId *) (buf +
MAXALIGN(sizeof(TwoPhaseFileHeader)) +
@@ -2223,25 +2225,9 @@ ProcessTwoPhaseBuffer(TransactionId xid,
Assert(TransactionIdFollows(subxid, xid));
- /* update nextXid if needed */
- if (setNextXid &&
- TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- /*
- * We don't expect anyone else to modify nextXid, hence we don't
- * need to hold a lock while examining it. We still acquire the
- * lock to modify it, though, so we recheck.
- */
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdFollowsOrEquals(subxid,
- ShmemVariableCache->nextXid))
- {
- ShmemVariableCache->nextXid = subxid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- }
- LWLockRelease(XidGenLock);
- }
+ /* update nextFullXid if needed */
+ if (setNextXid)
+ AdvanceNextFullTransactionIdPastXid(subxid);
if (setParent)
SubTransSetParent(subxid, xid);
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index fe94fdaf049..b5c361c224f 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -73,7 +73,7 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -156,7 +156,7 @@ GetNewTransactionId(bool isSubXact)
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = ShmemVariableCache->nextXid;
+ xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
}
/*
@@ -173,12 +173,12 @@ GetNewTransactionId(bool isSubXact)
ExtendSUBTRANS(xid);
/*
- * Now advance the nextXid counter. This must not happen until after we
- * have successfully completed ExtendCLOG() --- if that routine fails, we
- * want the next incoming transaction to try it again. We cannot assign
- * more XIDs until there is CLOG space for them.
+ * Now advance the nextFullXid counter. This must not happen until after
+ * we have successfully completed ExtendCLOG() --- if that routine fails,
+ * we want the next incoming transaction to try it again. We cannot
+ * assign more XIDs until there is CLOG space for them.
*/
- TransactionIdAdvance(ShmemVariableCache->nextXid);
+ FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
/*
* We must store the new XID into the shared ProcArray before releasing
@@ -236,18 +236,64 @@ GetNewTransactionId(bool isSubXact)
}
/*
- * Read nextXid but don't allocate it.
+ * Read nextFullXid but don't allocate it.
*/
-TransactionId
-ReadNewTransactionId(void)
+FullTransactionId
+ReadNextFullTransactionId(void)
{
- TransactionId xid;
+ FullTransactionId fullXid;
LWLockAcquire(XidGenLock, LW_SHARED);
- xid = ShmemVariableCache->nextXid;
+ fullXid = ShmemVariableCache->nextFullXid;
LWLockRelease(XidGenLock);
- return xid;
+ return fullXid;
+}
+
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * This must only be called during recovery or from two-phase start-up code.
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid)
+{
+ FullTransactionId newNextFullXid;
+ TransactionId next_xid;
+ uint32 epoch;
+
+ /*
+ * It is safe to read nextFullXid without a lock, because this is only
+ * called from the startup process, meaning that no other process can
+ * modify it.
+ */
+ Assert(AmStartupProcess());
+
+ /* Fast return if this isn't an xid high enough to move the needle. */
+ next_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (!TransactionIdFollowsOrEquals(xid, next_xid))
+ return;
+
+ /*
+ * Compute the FullTransactionId that comes after the given xid. To do
+ * this, we preserve the existing epoch, but detect when we've wrapped
+ * into a new epoch. This is necessary because WAL records and 2PC state
+ * currently contain 32 bit xids. The wrap logic is safe in those cases
+ * because the span of active xids cannot exceed one epoch at any given
+ * point in the WAL stream.
+ */
+ TransactionIdAdvance(xid);
+ epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ if (unlikely(xid < next_xid))
+ ++epoch;
+ newNextFullXid = FullTransactionIdFromEpochAndXid(epoch, xid);
+
+ /*
+ * We still need to take a lock to modify the value against concurrent
+ * readers.
+ */
+ LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+ ShmemVariableCache->nextFullXid = newNextFullXid;
+ LWLockRelease(XidGenLock);
}
/*
@@ -351,7 +397,7 @@ SetTransactionIdLimit(TransactionId oldest_datfrozenxid, Oid oldest_datoid)
ShmemVariableCache->xidStopLimit = xidStopLimit;
ShmemVariableCache->xidWrapLimit = xidWrapLimit;
ShmemVariableCache->oldestXidDB = oldest_datoid;
- curXid = ShmemVariableCache->nextXid;
+ curXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/* Log the info */
@@ -427,7 +473,7 @@ ForceTransactionIdLimitUpdate(void)
/* Locking is probably not really necessary, but let's be careful */
LWLockAcquire(XidGenLock, LW_SHARED);
- nextXid = ShmemVariableCache->nextXid;
+ nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
xidVacLimit = ShmemVariableCache->xidVacLimit;
oldestXid = ShmemVariableCache->oldestXid;
oldestXidDB = ShmemVariableCache->oldestXidDB;
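Reviewer note on the varsup.c changes above: the epoch inference in AdvanceNextFullTransactionIdPastXid() can be tried in isolation. The following is only a minimal standalone sketch, not the patch's code. The type and helper names (full_xid_from_epoch_and_xid, xid_follows_or_equals, advance_past_xid) are hypothetical stand-ins, locking is left out, and the modulo comparison assumes both xids are normal (>= 3).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PostgreSQL types, for illustration only. */
typedef uint32_t TransactionId;
typedef struct
{
	uint64_t	value;
} FullTransactionId;

#define FIRST_NORMAL_XID ((TransactionId) 3)

static FullTransactionId
full_xid_from_epoch_and_xid(uint32_t epoch, TransactionId xid)
{
	FullTransactionId result;

	result.value = ((uint64_t) epoch << 32) | xid;
	return result;
}

/* Modulo-2^32 comparison; assumes both arguments are normal xids. */
static int
xid_follows_or_equals(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) >= 0;
}

/* Return the value the next-xid counter should have after seeing xid. */
static FullTransactionId
advance_past_xid(FullTransactionId next_full, TransactionId xid)
{
	TransactionId next_xid = (TransactionId) next_full.value;
	uint32_t	epoch = (uint32_t) (next_full.value >> 32);

	assert(xid >= FIRST_NORMAL_XID);

	/* Nothing to do if xid is logically older than the counter. */
	if (!xid_follows_or_equals(xid, next_xid))
		return next_full;

	/* Step past xid, skipping the special xids below 3 on wraparound. */
	xid++;
	if (xid < FIRST_NORMAL_XID)
		xid = FIRST_NORMAL_XID;

	/* A numerically smaller result means we wrapped into a new epoch. */
	if (xid < next_xid)
		epoch++;

	return full_xid_from_epoch_and_xid(epoch, xid);
}
```

Starting from epoch 0 and xid 0xFFFFFFF0, observing xid 0xFFFFFFFF steps the 32-bit part to 3 and bumps the epoch to 1. This relies on the assumption stated in the patch comment: the span of active xids never exceeds one epoch.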
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c3214d4f4d8..9b100050597 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5636,21 +5636,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
max_xid = TransactionIdLatest(xid, parsed->nsubxacts, parsed->subxacts);
- /*
- * Make sure nextXid is beyond any XID mentioned in the record.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while checking this. We still acquire the lock to modify
- * it, though.
- */
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ /* Make sure nextFullXid is beyond any XID mentioned in the record. */
+ AdvanceNextFullTransactionIdPastXid(max_xid);
Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == 0) ==
(origin_id == InvalidRepOriginId));
@@ -5792,25 +5779,11 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
Assert(TransactionIdIsValid(xid));
- /*
- * Make sure nextXid is beyond any XID mentioned in the record.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while checking this. We still acquire the lock to modify
- * it, though.
- */
+ /* Make sure nextFullXid is beyond any XID mentioned in the record. */
max_xid = TransactionIdLatest(xid,
parsed->nsubxacts,
parsed->subxacts);
-
- if (TransactionIdFollowsOrEquals(max_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = max_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(max_xid);
if (standbyState == STANDBY_DISABLED)
{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ad12ebc4269..19d7911ec50 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -590,8 +590,7 @@ typedef struct XLogCtlData
/* Protected by info_lck: */
XLogwrtRqst LogwrtRqst;
XLogRecPtr RedoRecPtr; /* a recent copy of Insert->RedoRecPtr */
- uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
- TransactionId ckptXid;
+ FullTransactionId ckptFullXid; /* nextFullXid of latest checkpoint */
XLogRecPtr asyncXactLSN; /* LSN of newest async commit/abort */
XLogRecPtr replicationSlotMinLSN; /* oldest LSN needed by any slot */
@@ -5115,8 +5114,8 @@ BootStrapXLOG(void)
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.PrevTimeLineID = ThisTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
- checkPoint.nextXidEpoch = 0;
- checkPoint.nextXid = FirstNormalTransactionId;
+ checkPoint.nextFullXid =
+ FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
checkPoint.nextOid = FirstBootstrapObjectId;
checkPoint.nextMulti = FirstMultiXactId;
checkPoint.nextMultiOffset = 0;
@@ -5129,7 +5128,7 @@ BootStrapXLOG(void)
checkPoint.time = (pg_time_t) time(NULL);
checkPoint.oldestActiveXid = InvalidTransactionId;
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6557,8 +6556,8 @@ StartupXLOG(void)
(uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
wasShutdown ? "true" : "false")));
ereport(DEBUG1,
- (errmsg_internal("next transaction ID: %u:%u; next OID: %u",
- checkPoint.nextXidEpoch, checkPoint.nextXid,
+ (errmsg_internal("next transaction ID: " UINT64_FORMAT "; next OID: %u",
+ U64FromFullTransactionId(checkPoint.nextFullXid),
checkPoint.nextOid)));
ereport(DEBUG1,
(errmsg_internal("next MultiXactId: %u; next MultiXactOffset: %u",
@@ -6573,12 +6572,12 @@ StartupXLOG(void)
(errmsg_internal("commit timestamp Xid oldest/newest: %u/%u",
checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid)));
- if (!TransactionIdIsNormal(checkPoint.nextXid))
+ if (!TransactionIdIsNormal(XidFromFullTransactionId(checkPoint.nextFullXid)))
ereport(PANIC,
(errmsg("invalid next transaction ID")));
/* initialize shared memory variables from the checkpoint record */
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;
MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
@@ -6587,8 +6586,7 @@ StartupXLOG(void)
SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
SetCommitTsLimit(checkPoint.oldestCommitTsXid,
checkPoint.newestCommitTsXid);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
/*
* Initialize replication slots, before there's a chance to remove
@@ -6859,7 +6857,7 @@ StartupXLOG(void)
Assert(TransactionIdIsValid(oldestActiveXID));
/* Tell procarray about the range of xids it has to deal with */
- ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+ ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextFullXid));
/*
* Startup commit log and subtrans only. MultiXact and commit
@@ -6889,9 +6887,9 @@ StartupXLOG(void)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -7061,20 +7059,10 @@ StartupXLOG(void)
error_context_stack = &errcallback;
/*
- * ShmemVariableCache->nextXid must be beyond record's xid.
- *
- * We don't expect anyone else to modify nextXid, hence we
- * don't need to hold a lock while examining it. We still
- * acquire the lock to modify it, though.
+ * ShmemVariableCache->nextFullXid must be beyond record's
+ * xid.
*/
- if (TransactionIdFollowsOrEquals(record->xl_xid,
- ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = record->xl_xid;
- TransactionIdAdvance(ShmemVariableCache->nextXid);
- LWLockRelease(XidGenLock);
- }
+ AdvanceNextFullTransactionIdPastXid(record->xl_xid);
/*
* Before replaying this record, check if this record causes
@@ -7654,7 +7642,7 @@ StartupXLOG(void)
/* also initialize latestCompletedXid, to nextXid - 1 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- ShmemVariableCache->latestCompletedXid = ShmemVariableCache->nextXid;
+ ShmemVariableCache->latestCompletedXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
TransactionIdRetreat(ShmemVariableCache->latestCompletedXid);
LWLockRelease(ProcArrayLock);
@@ -8247,41 +8235,6 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
return result;
}
-/*
- * GetNextXidAndEpoch - get the current nextXid value and associated epoch
- *
- * This is exported for use by code that would like to have 64-bit XIDs.
- * We don't really support such things, but all XIDs within the system
- * can be presumed "close to" the result, and thus the epoch associated
- * with them can be determined.
- */
-void
-GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
-{
- uint32 ckptXidEpoch;
- TransactionId ckptXid;
- TransactionId nextXid;
-
- /* Must read checkpoint info first, else have race condition */
- SpinLockAcquire(&XLogCtl->info_lck);
- ckptXidEpoch = XLogCtl->ckptXidEpoch;
- ckptXid = XLogCtl->ckptXid;
- SpinLockRelease(&XLogCtl->info_lck);
-
- /* Now fetch current nextXid */
- nextXid = ReadNewTransactionId();
-
- /*
- * nextXid is certainly logically later than ckptXid. So if it's
- * numerically less, it must have wrapped into the next epoch.
- */
- if (nextXid < ckptXid)
- ckptXidEpoch++;
-
- *xid = nextXid;
- *epoch = ckptXidEpoch;
-}
-
/*
* This must be called ONCE during postmaster or standalone-backend shutdown
*/
@@ -8701,7 +8654,7 @@ CreateCheckPoint(int flags)
* there.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- checkPoint.nextXid = ShmemVariableCache->nextXid;
+ checkPoint.nextFullXid = ShmemVariableCache->nextFullXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
@@ -8711,11 +8664,6 @@ CreateCheckPoint(int flags)
checkPoint.newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
LWLockRelease(CommitTsLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -8859,8 +8807,7 @@ CreateCheckPoint(int flags)
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9622,7 +9569,7 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In a SHUTDOWN checkpoint, believe the counters exactly */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
ShmemVariableCache->nextOid = checkPoint.nextOid;
@@ -9676,9 +9623,9 @@ xlog_redo(XLogReaderState *record)
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
- running.nextXid = checkPoint.nextXid;
+ running.nextXid = XidFromFullTransactionId(checkPoint.nextFullXid);
running.oldestRunningXid = oldestActiveXID;
- latestCompletedXid = checkPoint.nextXid;
+ latestCompletedXid = XidFromFullTransactionId(checkPoint.nextFullXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
@@ -9690,13 +9637,11 @@ xlog_redo(XLogReaderState *record)
}
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/*
@@ -9717,9 +9662,9 @@ xlog_redo(XLogReaderState *record)
memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
/* In an ONLINE checkpoint, treat the XID counter as a minimum */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- if (TransactionIdPrecedes(ShmemVariableCache->nextXid,
- checkPoint.nextXid))
- ShmemVariableCache->nextXid = checkPoint.nextXid;
+ if (FullTransactionIdPrecedes(ShmemVariableCache->nextFullXid,
+ checkPoint.nextFullXid))
+ ShmemVariableCache->nextFullXid = checkPoint.nextFullXid;
LWLockRelease(XidGenLock);
/*
@@ -9749,13 +9694,11 @@ xlog_redo(XLogReaderState *record)
SetTransactionIdLimit(checkPoint.oldestXid,
checkPoint.oldestXidDB);
/* ControlFile->checkPointCopy always tracks the latest ckpt XID */
- ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
- ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;
+ ControlFile->checkPointCopy.nextFullXid = checkPoint.nextFullXid;
/* Update shared-memory copy of checkpoint XID/epoch */
SpinLockAcquire(&XLogCtl->info_lck);
- XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
- XLogCtl->ckptXid = checkPoint.nextXid;
+ XLogCtl->ckptFullXid = checkPoint.nextFullXid;
SpinLockRelease(&XLogCtl->info_lck);
/* TLI should not change in an on-line checkpoint */
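Reviewer note on the removal of GetNextXidAndEpoch() above: the problem raised at the start of this thread can be reproduced with plain integers. This is a simplified model, not PostgreSQL code. The function names epoch_from_checkpoint and epoch_from_full are hypothetical, and the model ignores the three special xids that are skipped at each wraparound.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Old scheme: infer the epoch from the (epoch, xid) pair saved at the last
 * checkpoint. At most one wraparound since that checkpoint can be detected.
 */
static uint32_t
epoch_from_checkpoint(uint32_t ckpt_epoch, TransactionId ckpt_xid,
					  TransactionId next_xid)
{
	if (next_xid < ckpt_xid)
		ckpt_epoch++;
	return ckpt_epoch;
}

/*
 * New scheme: the counter itself is 64 bits wide, so the epoch is simply
 * its high 32 bits and does not depend on checkpoint timing.
 */
static uint32_t
epoch_from_full(uint64_t next_full)
{
	return (uint32_t) (next_full >> 32);
}
```

If the counter moves forward by 2^32 + 10 xids with no checkpoint in between, the low 32 bits end up numerically above the checkpoint's xid again, so epoch_from_checkpoint() returns the unchanged checkpoint epoch while epoch_from_full() returns one more. With a single wraparound the two agree.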
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index d9959e568a8..f32cf91ffb3 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1160,6 +1160,7 @@ static void
XLogWalRcvSendHSFeedback(bool immed)
{
TimestampTz now;
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 xmin_epoch,
catalog_xmin_epoch;
@@ -1238,7 +1239,9 @@ XLogWalRcvSendHSFeedback(bool immed)
* Get epoch and adjust if nextXid and oldestXmin are different sides of
* the epoch boundary.
*/
- GetNextXidAndEpoch(&nextXid, &xmin_epoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ xmin_epoch = EpochFromFullTransactionId(nextFullXid);
catalog_xmin_epoch = xmin_epoch;
if (nextXid < xmin)
xmin_epoch--;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 4bb98ef352a..21f5c868f18 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1912,10 +1912,13 @@ PhysicalReplicationSlotNewXmin(TransactionId feedbackXmin, TransactionId feedbac
static bool
TransactionIdInRecentPast(TransactionId xid, uint32 epoch)
{
+ FullTransactionId nextFullXid;
TransactionId nextXid;
uint32 nextEpoch;
- GetNextXidAndEpoch(&nextXid, &nextEpoch);
+ nextFullXid = ReadNextFullTransactionId();
+ nextXid = XidFromFullTransactionId(nextFullXid);
+ nextEpoch = EpochFromFullTransactionId(nextFullXid);
if (xid <= nextXid)
{
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cf93357997c..010cc061c89 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -664,7 +664,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
{
TransactionId *xids;
int nxids;
- TransactionId nextXid;
int i;
Assert(standbyState >= STANDBY_INITIALIZED);
@@ -881,23 +880,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
LWLockRelease(ProcArrayLock);
- /*
- * ShmemVariableCache->nextXid must be beyond any observed xid.
- *
- * We don't expect anyone else to modify nextXid, hence we don't need to
- * hold a lock while examining it. We still acquire the lock to modify
- * it, though.
- */
- nextXid = latestObservedXid;
- TransactionIdAdvance(nextXid);
- if (TransactionIdFollows(nextXid, ShmemVariableCache->nextXid))
- {
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = nextXid;
- LWLockRelease(XidGenLock);
- }
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid. */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid);
- Assert(TransactionIdIsValid(ShmemVariableCache->nextXid));
+ Assert(FullTransactionIdIsValid(ShmemVariableCache->nextFullXid));
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
@@ -2001,7 +1987,7 @@ GetRunningTransactionData(void)
latestCompletedXid = ShmemVariableCache->latestCompletedXid;
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* Spin over procArray collecting all xids
@@ -2093,7 +2079,7 @@ GetRunningTransactionData(void)
CurrentRunningXacts->xcnt = count - subcount;
CurrentRunningXacts->subxcnt = subcount;
CurrentRunningXacts->subxid_overflow = suboverflowed;
- CurrentRunningXacts->nextXid = ShmemVariableCache->nextXid;
+ CurrentRunningXacts->nextXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
CurrentRunningXacts->oldestRunningXid = oldestRunningXid;
CurrentRunningXacts->latestCompletedXid = latestCompletedXid;
@@ -2138,7 +2124,7 @@ GetOldestActiveTransactionId(void)
* have already completed), when we spin over it.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestRunningXid = ShmemVariableCache->nextXid;
+ oldestRunningXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
LWLockRelease(XidGenLock);
/*
@@ -2206,7 +2192,7 @@ GetOldestSafeDecodingTransactionId(bool catalogOnly)
* a safe, albeit pessimal, value.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
- oldestSafeXid = ShmemVariableCache->nextXid;
+ oldestSafeXid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If there's already a slot pegging the xmin horizon, we can start with
@@ -3266,12 +3252,10 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
*/
latestObservedXid = xid;
- /* ShmemVariableCache->nextXid must be beyond any observed xid */
+ /* ShmemVariableCache->nextFullXid must be beyond any observed xid */
+ AdvanceNextFullTransactionIdPastXid(latestObservedXid);
next_expected_xid = latestObservedXid;
TransactionIdAdvance(next_expected_xid);
- LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- ShmemVariableCache->nextXid = next_expected_xid;
- LWLockRelease(XidGenLock);
}
}
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 4d10e57a803..cd56dca3aef 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -867,7 +867,7 @@ standby_redo(XLogReaderState *record)
* up from a checkpoint and are immediately at our starting point, we
* unconditionally move to STANDBY_INITIALIZED. After this point we
* must do 4 things:
- * * move shared nextXid forwards as we see new xids
+ * * move shared nextFullXid forwards as we see new xids
* * extend the clog and subtrans with each new xid
* * keep track of uncommitted known assigned xids
* * keep track of uncommitted AccessExclusiveLocks
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 92beaab5663..4e4d04bae37 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -3410,7 +3410,7 @@ ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
* transaction to complete before freeing some RAM; correctness of visible
* behavior is not affected.
*/
- MySerializableXact->finishedBefore = ShmemVariableCache->nextXid;
+ MySerializableXact->finishedBefore = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
/*
* If it's not a commit it's either a rollback or a read-only transaction
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 9958b1a55e8..4483db573f3 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -91,7 +91,10 @@ typedef struct
static void
load_xid_epoch(TxidEpoch *state)
{
- GetNextXidAndEpoch(&state->last_xid, &state->epoch);
+ FullTransactionId fullXid = ReadNextFullTransactionId();
+
+ state->last_xid = XidFromFullTransactionId(fullXid);
+ state->epoch = EpochFromFullTransactionId(fullXid);
}
/*
@@ -114,8 +117,11 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
TransactionId xid = (TransactionId) xid_with_epoch;
uint32 now_epoch;
TransactionId now_epoch_next_xid;
+ FullTransactionId now_fullxid;
- GetNextXidAndEpoch(&now_epoch_next_xid, &now_epoch);
+ now_fullxid = ReadNextFullTransactionId();
+ now_epoch_next_xid = XidFromFullTransactionId(now_fullxid);
+ now_epoch = EpochFromFullTransactionId(now_fullxid);
if (extracted_xid != NULL)
*extracted_xid = xid;
@@ -128,8 +134,7 @@ TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
return true;
/* If the transaction ID is in the future, throw an error. */
- if (xid_epoch > now_epoch
- || (xid_epoch == now_epoch && xid >= now_epoch_next_xid))
+ if (xid_with_epoch >= U64FromFullTransactionId(now_fullxid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("transaction ID %s is in the future",
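Reviewer note on the txid.c hunk above: the single 64-bit comparison should be equivalent to the two-part epoch/xid test it replaces, because the epoch occupies the high 32 bits. The sketch below checks that claim on plain integers. The names pack, in_future_split and in_future_u64 are hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Combine an epoch and a 32-bit xid into one 64-bit value. */
static uint64_t
pack(uint32_t epoch, uint32_t xid)
{
	return ((uint64_t) epoch << 32) | xid;
}

/* The removed form: compare the epoch first, then the xid. */
static bool
in_future_split(uint32_t xid_epoch, uint32_t xid,
				uint32_t now_epoch, uint32_t now_xid)
{
	return xid_epoch > now_epoch ||
		(xid_epoch == now_epoch && xid >= now_xid);
}

/* The new form: one unsigned 64-bit comparison. */
static bool
in_future_u64(uint64_t xid_with_epoch, uint64_t now_full)
{
	return xid_with_epoch >= now_full;
}
```

For any pair of inputs, in_future_split(e, x, ne, nx) and in_future_u64(pack(e, x), pack(ne, nx)) return the same answer, since ordering unsigned 64-bit values is the same as ordering (high half, low half) pairs lexicographically.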
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index e6742dc24b8..e675c33c547 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlog.h"
#include "catalog/pg_control.h"
@@ -164,8 +165,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
nulls[5] = false;
values[6] = CStringGetTextDatum(psprintf("%u:%u",
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid));
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid)));
nulls[6] = false;
values[7] = ObjectIdGetDatum(ControlFile->checkPointCopy.nextOid);
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 1aa1db218ac..9a17d0f9c0f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -20,6 +20,7 @@
#include <time.h>
+#include "access/transam.h"
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "catalog/pg_control.h"
@@ -256,8 +257,8 @@ main(int argc, char *argv[])
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile->checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile->checkPointCopy.nextXidEpoch,
- ControlFile->checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile->checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile->checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index a7b25ffe1cd..67fc646befb 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -430,11 +430,15 @@ main(int argc, char *argv[])
* if any, includes these values.)
*/
if (set_xid_epoch != -1)
- ControlFile.checkPointCopy.nextXidEpoch = set_xid_epoch;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(set_xid_epoch,
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
if (set_xid != 0)
{
- ControlFile.checkPointCopy.nextXid = set_xid;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ set_xid);
/*
* For the moment, just set oldestXid to a value that will force
@@ -704,8 +708,8 @@ GuessControlValues(void)
ControlFile.checkPointCopy.ThisTimeLineID = 1;
ControlFile.checkPointCopy.PrevTimeLineID = 1;
ControlFile.checkPointCopy.fullPageWrites = false;
- ControlFile.checkPointCopy.nextXidEpoch = 0;
- ControlFile.checkPointCopy.nextXid = FirstNormalTransactionId;
+ ControlFile.checkPointCopy.nextFullXid =
+ FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId);
ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
ControlFile.checkPointCopy.nextMultiOffset = 0;
@@ -786,8 +790,8 @@ PrintControlValues(bool guessed)
printf(_("Latest checkpoint's full_page_writes: %s\n"),
ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
printf(_("Latest checkpoint's NextXID: %u:%u\n"),
- ControlFile.checkPointCopy.nextXidEpoch,
- ControlFile.checkPointCopy.nextXid);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("Latest checkpoint's NextOID: %u\n"),
ControlFile.checkPointCopy.nextOid);
printf(_("Latest checkpoint's NextMultiXactId: %u\n"),
@@ -879,7 +883,7 @@ PrintNewControlValues(void)
if (set_xid != 0)
{
printf(_("NextXID: %u\n"),
- ControlFile.checkPointCopy.nextXid);
+ XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
printf(_("OldestXID: %u\n"),
ControlFile.checkPointCopy.oldestXid);
printf(_("OldestXID's DB: %u\n"),
@@ -889,7 +893,7 @@ PrintNewControlValues(void)
if (set_xid_epoch != -1)
{
printf(_("NextXID epoch: %u\n"),
- ControlFile.checkPointCopy.nextXidEpoch);
+ EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));
}
if (set_oldest_commit_ts_xid != 0)
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 78997e533e7..6a919084c8f 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -44,6 +44,32 @@
#define TransactionIdStore(xid, dest) (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
+#define EpochFromFullTransactionId(x) ((uint32) ((x).value >> 32))
+#define XidFromFullTransactionId(x) ((uint32) (x).value)
+#define U64FromFullTransactionId(x) ((x).value)
+#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
+#define FullTransactionIdIsValid(x) TransactionIdIsValid(XidFromFullTransactionId(x))
+
+/*
+ * A 64 bit value that contains an epoch and a TransactionId. This is
+ * wrapped in a struct to prevent implicit conversion to/from TransactionId.
+ * Not all values represent valid normal XIDs.
+ */
+typedef struct FullTransactionId
+{
+ uint64 value;
+} FullTransactionId;
+
+static inline FullTransactionId
+FullTransactionIdFromEpochAndXid(uint32 epoch, TransactionId xid)
+{
+ FullTransactionId result;
+
+ result.value = ((uint64) epoch) << 32 | xid;
+
+ return result;
+}
+
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest) \
do { \
@@ -52,6 +78,15 @@
(dest) = FirstNormalTransactionId; \
} while(0)
+/* advance a FullTransactionId variable, stepping over special XIDs */
+static inline void
+FullTransactionIdAdvance(FullTransactionId *dest)
+{
+ dest->value++;
+ while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
+ dest->value++;
+}
+
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest) \
do { \
@@ -125,12 +160,12 @@ typedef struct VariableCacheData
/*
* These fields are protected by XidGenLock.
*/
- TransactionId nextXid; /* next XID to assign */
+ FullTransactionId nextFullXid; /* next full XID to assign */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
TransactionId xidVacLimit; /* start forcing autovacuums here */
TransactionId xidWarnLimit; /* start complaining here */
- TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
+ TransactionId xidStopLimit; /* refuse to advance nextFullXid beyond here */
TransactionId xidWrapLimit; /* where the world ends */
Oid oldestXidDB; /* database with minimum datfrozenxid */
@@ -187,11 +222,21 @@ extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
extern TransactionId GetNewTransactionId(bool isSubXact);
-extern TransactionId ReadNewTransactionId(void);
+extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
+extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
Oid oldest_datoid);
extern void AdvanceOldestClogXid(TransactionId oldest_datfrozenxid);
extern bool ForceTransactionIdLimitUpdate(void);
extern Oid GetNewObjectId(void);
+/*
+ * For callers that just need the XID part of the next transaction ID.
+ */
+static inline TransactionId
+ReadNewTransactionId(void)
+{
+ return XidFromFullTransactionId(ReadNextFullTransactionId());
+}
+
#endif /* TRAMSAM_H */
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index bd74e7aaa03..eb6c44649dc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -310,7 +310,6 @@ extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetFlushRecPtr(void);
extern XLogRecPtr GetLastImportantRecPtr(void);
-extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern void RemovePromoteSignalFiles(void);
extern bool CheckPromoteSignal(void);
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index a3910a5f997..ff98d9e91a8 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -15,13 +15,14 @@
#ifndef PG_CONTROL_H
#define PG_CONTROL_H
+#include "access/transam.h"
#include "access/xlogdefs.h"
#include "pgtime.h" /* for pg_time_t */
#include "port/pg_crc32c.h"
/* Version identifier for this pg_control format */
-#define PG_CONTROL_VERSION 1200
+#define PG_CONTROL_VERSION 1201
/* Nonce key length, see below */
#define MOCK_AUTH_NONCE_LEN 32
@@ -39,8 +40,7 @@ typedef struct CheckPoint
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
- uint32 nextXidEpoch; /* higher-order bits of nextXid */
- TransactionId nextXid; /* next free XID */
+ FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 346a3108bc1..23612435148 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -72,7 +72,7 @@ typedef struct RunningTransactionsData
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
diff --git a/src/include/storage/standbydefs.h b/src/include/storage/standbydefs.h
index cc8ccd5d369..01d2db6ac6e 100644
--- a/src/include/storage/standbydefs.h
+++ b/src/include/storage/standbydefs.h
@@ -49,7 +49,7 @@ typedef struct xl_running_xacts
int xcnt; /* # of xact ids in xids[] */
int subxcnt; /* # of subxact ids in xids[] */
bool subxid_overflow; /* snapshot overflowed, subxids missing */
- TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
+ TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
TransactionId oldestRunningXid; /* *not* oldestXmin */
TransactionId latestCompletedXid; /* so we can set xmax */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index fc3cb6a40dd..a4dd0f46429 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -796,6 +796,7 @@ FreePageManager
FreePageSpanLeader
FromCharDateMode
FromExpr
+FullTransactionId
FuncCall
FuncCallContext
FuncCandidateList
--
2.21.0
Attachment: 0002-Use-FullTransactionId-for-the-transaction-stack-v8.patch (application/octet-stream)
From 737a8629640ba3bb7d24e032283e315c891c9ce2 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Thu, 28 Mar 2019 00:03:47 +1300
Subject: [PATCH 2/2] Use FullTransactionId for the transaction stack.
Provide GetTopFullTransactionId() and GetCurrentFullTransactionId().
The intended users of these interfaces are access methods that use
xids for visibility checks but don't want to have to go back and
"freeze" existing references some time later before the 32 bit xid
counter wraps around.
Author: Thomas Munro
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
src/backend/access/transam/varsup.c | 13 +-
src/backend/access/transam/xact.c | 229 +++++++++++++++++++---------
src/include/access/transam.h | 3 +-
src/include/access/xact.h | 5 +
src/tools/pgindent/typedefs.list | 1 +
5 files changed, 170 insertions(+), 81 deletions(-)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index b5c361c224f..dfb4a62a0cb 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -35,7 +35,8 @@ VariableCache ShmemVariableCache = NULL;
/*
- * Allocate the next XID for a new transaction or subtransaction.
+ * Allocate the next FullTransactionId for a new transaction or
+ * subtransaction.
*
* The new XID is also stored into MyPgXact before returning.
*
@@ -44,9 +45,10 @@ VariableCache ShmemVariableCache = NULL;
* does something. So it is safe to do a database lookup if we want to
* issue a warning about XID wrap.
*/
-TransactionId
+FullTransactionId
GetNewTransactionId(bool isSubXact)
{
+ FullTransactionId full_xid;
TransactionId xid;
/*
@@ -64,7 +66,7 @@ GetNewTransactionId(bool isSubXact)
{
Assert(!isSubXact);
MyPgXact->xid = BootstrapTransactionId;
- return BootstrapTransactionId;
+ return FullTransactionIdFromEpochAndXid(0, BootstrapTransactionId);
}
/* safety check, we should never get this far in a HS standby */
@@ -73,7 +75,8 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ full_xid = ShmemVariableCache->nextFullXid;
+ xid = XidFromFullTransactionId(full_xid);
/*----------
* Check to see if it's safe to assign another XID. This protects against
@@ -232,7 +235,7 @@ GetNewTransactionId(bool isSubXact)
LWLockRelease(XidGenLock);
- return xid;
+ return full_xid;
}
/*
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 9b100050597..18755c12cef 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -105,7 +105,7 @@ int synchronous_commit = SYNCHRONOUS_COMMIT_ON;
* The XIDs are stored sorted in numerical order (not logical order) to make
* lookups as fast as possible.
*/
-TransactionId XactTopTransactionId = InvalidTransactionId;
+FullTransactionId XactTopFullTransactionId = {InvalidTransactionId};
int nParallelCurrentXids = 0;
TransactionId *ParallelCurrentXids;
@@ -171,7 +171,7 @@ typedef enum TBlockState
*/
typedef struct TransactionStateData
{
- TransactionId transactionId; /* my XID, or Invalid if none */
+ FullTransactionId fullTransactionId; /* my FullTransactionId */
SubTransactionId subTransactionId; /* my subxact ID */
char *name; /* savepoint name, if any */
int savepointLevel; /* savepoint level */
@@ -196,6 +196,25 @@ typedef struct TransactionStateData
typedef TransactionStateData *TransactionState;
+/*
+ * Serialized representation used to transmit transaction state to parallel
+ * workers through shared memory.
+ */
+typedef struct SerializedTransactionState
+{
+ int xactIsoLevel;
+ bool xactDeferrable;
+ FullTransactionId topFullTransactionId;
+ FullTransactionId currentFullTransactionId;
+ CommandId currentCommandId;
+ int nParallelCurrentXids;
+ TransactionId parallelCurrentXids[FLEXIBLE_ARRAY_MEMBER];
+} SerializedTransactionState;
+
+/* The size of SerializedTransactionState, not including the final array. */
+#define SerializedTransactionStateHeaderSize \
+ offsetof(SerializedTransactionState, parallelCurrentXids)
+
/*
* CurrentTransactionState always points to the current transaction state
* block. It will point to TopTransactionStateData when not in a
@@ -372,9 +391,9 @@ IsAbortedTransactionBlockState(void)
TransactionId
GetTopTransactionId(void)
{
- if (!TransactionIdIsValid(XactTopTransactionId))
+ if (!FullTransactionIdIsValid(XactTopFullTransactionId))
AssignTransactionId(&TopTransactionStateData);
- return XactTopTransactionId;
+ return XidFromFullTransactionId(XactTopFullTransactionId);
}
/*
@@ -387,7 +406,7 @@ GetTopTransactionId(void)
TransactionId
GetTopTransactionIdIfAny(void)
{
- return XactTopTransactionId;
+ return XidFromFullTransactionId(XactTopFullTransactionId);
}
/*
@@ -402,9 +421,9 @@ GetCurrentTransactionId(void)
{
TransactionState s = CurrentTransactionState;
- if (!TransactionIdIsValid(s->transactionId))
+ if (!FullTransactionIdIsValid(s->fullTransactionId))
AssignTransactionId(s);
- return s->transactionId;
+ return XidFromFullTransactionId(s->fullTransactionId);
}
/*
@@ -417,7 +436,66 @@ GetCurrentTransactionId(void)
TransactionId
GetCurrentTransactionIdIfAny(void)
{
- return CurrentTransactionState->transactionId;
+ return XidFromFullTransactionId(CurrentTransactionState->fullTransactionId);
+}
+
+/*
+ * GetTopFullTransactionId
+ *
+ * This will return the FullTransactionId of the main transaction, assigning
+ * one if it's not yet set. Be careful to call this only inside a valid xact.
+ */
+FullTransactionId
+GetTopFullTransactionId(void)
+{
+ if (!FullTransactionIdIsValid(XactTopFullTransactionId))
+ AssignTransactionId(&TopTransactionStateData);
+ return XactTopFullTransactionId;
+}
+
+/*
+ * GetTopFullTransactionIdIfAny
+ *
+ * This will return the FullTransactionId of the main transaction, if one is
+ * assigned. It will return InvalidFullTransactionId if we are not currently
+ * inside a transaction, or inside a transaction that hasn't yet been assigned
+ * one.
+ */
+FullTransactionId
+GetTopFullTransactionIdIfAny(void)
+{
+ return XactTopFullTransactionId;
+}
+
+/*
+ * GetCurrentFullTransactionId
+ *
+ * This will return the FullTransactionId of the current transaction (main or
+ * sub transaction), assigning one if it's not yet set. Be careful to call
+ * this only inside a valid xact.
+ */
+FullTransactionId
+GetCurrentFullTransactionId(void)
+{
+ TransactionState s = CurrentTransactionState;
+
+ if (!FullTransactionIdIsValid(s->fullTransactionId))
+ AssignTransactionId(s);
+ return s->fullTransactionId;
+}
+
+/*
+ * GetCurrentFullTransactionIdIfAny
+ *
+ * This will return the FullTransactionId of the current sub xact, if one is
+ * assigned. It will return InvalidFullTransactionId if we are not currently
+ * inside a transaction, or inside a transaction that hasn't been assigned one
+ * yet.
+ */
+FullTransactionId
+GetCurrentFullTransactionIdIfAny(void)
+{
+ return CurrentTransactionState->fullTransactionId;
}
/*
@@ -428,7 +506,7 @@ GetCurrentTransactionIdIfAny(void)
void
MarkCurrentTransactionIdLoggedIfAny(void)
{
- if (TransactionIdIsValid(CurrentTransactionState->transactionId))
+ if (FullTransactionIdIsValid(CurrentTransactionState->fullTransactionId))
CurrentTransactionState->didLogXid = true;
}
@@ -463,7 +541,7 @@ GetStableLatestTransactionId(void)
/*
* AssignTransactionId
*
- * Assigns a new permanent XID to the given TransactionState.
+ * Assigns a new permanent FullTransactionId to the given TransactionState.
* We do not assign XIDs to transactions until/unless this is called.
* Also, any parent TransactionStates that don't yet have XIDs are assigned
* one; this maintains the invariant that a child transaction has an XID
@@ -477,7 +555,7 @@ AssignTransactionId(TransactionState s)
bool log_unknown_top = false;
/* Assert that caller didn't screw up */
- Assert(!TransactionIdIsValid(s->transactionId));
+ Assert(!FullTransactionIdIsValid(s->fullTransactionId));
Assert(s->state == TRANS_INPROGRESS);
/*
@@ -493,14 +571,14 @@ AssignTransactionId(TransactionState s)
* if we're at the bottom of a huge stack of subtransactions none of which
* have XIDs yet.
*/
- if (isSubXact && !TransactionIdIsValid(s->parent->transactionId))
+ if (isSubXact && !FullTransactionIdIsValid(s->parent->fullTransactionId))
{
TransactionState p = s->parent;
TransactionState *parents;
size_t parentOffset = 0;
parents = palloc(sizeof(TransactionState) * s->nestingLevel);
- while (p != NULL && !TransactionIdIsValid(p->transactionId))
+ while (p != NULL && !FullTransactionIdIsValid(p->fullTransactionId))
{
parents[parentOffset++] = p;
p = p->parent;
@@ -531,26 +609,28 @@ AssignTransactionId(TransactionState s)
log_unknown_top = true;
/*
- * Generate a new Xid and record it in PG_PROC and pg_subtrans.
+ * Generate a new FullTransactionId and record its xid in PG_PROC and
+ * pg_subtrans.
*
* NB: we must make the subtrans entry BEFORE the Xid appears anywhere in
* shared storage other than PG_PROC; because if there's no room for it in
* PG_PROC, the subtrans entry is needed to ensure that other backends see
* the Xid as "running". See GetNewTransactionId.
*/
- s->transactionId = GetNewTransactionId(isSubXact);
+ s->fullTransactionId = GetNewTransactionId(isSubXact);
if (!isSubXact)
- XactTopTransactionId = s->transactionId;
+ XactTopFullTransactionId = s->fullTransactionId;
if (isSubXact)
- SubTransSetParent(s->transactionId, s->parent->transactionId);
+ SubTransSetParent(XidFromFullTransactionId(s->fullTransactionId),
+ XidFromFullTransactionId(s->parent->fullTransactionId));
/*
* If it's a top-level transaction, the predicate locking system needs to
* be told about it too.
*/
if (!isSubXact)
- RegisterPredicateLockingXid(s->transactionId);
+ RegisterPredicateLockingXid(XidFromFullTransactionId(s->fullTransactionId));
/*
* Acquire lock on the transaction XID. (We assume this cannot block.) We
@@ -560,7 +640,7 @@ AssignTransactionId(TransactionState s)
currentOwner = CurrentResourceOwner;
CurrentResourceOwner = s->curTransactionOwner;
- XactLockTableInsert(s->transactionId);
+ XactLockTableInsert(XidFromFullTransactionId(s->fullTransactionId));
CurrentResourceOwner = currentOwner;
@@ -584,7 +664,7 @@ AssignTransactionId(TransactionState s)
*/
if (isSubXact && XLogStandbyInfoActive())
{
- unreportedXids[nUnreportedXids] = s->transactionId;
+ unreportedXids[nUnreportedXids] = XidFromFullTransactionId(s->fullTransactionId);
nUnreportedXids++;
/*
@@ -832,9 +912,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
if (s->state == TRANS_ABORT)
continue;
- if (!TransactionIdIsValid(s->transactionId))
+ if (!FullTransactionIdIsValid(s->fullTransactionId))
continue; /* it can't have any child XIDs either */
- if (TransactionIdEquals(xid, s->transactionId))
+ if (TransactionIdEquals(xid, XidFromFullTransactionId(s->fullTransactionId)))
return true;
/* As the childXids array is ordered, we can use binary search */
low = 0;
@@ -1495,7 +1575,7 @@ AtSubCommit_childXids(void)
* all XIDs already in the array belong to subtransactions started and
* subcommitted before us, so their XIDs must precede ours.
*/
- s->parent->childXids[s->parent->nChildXids] = s->transactionId;
+ s->parent->childXids[s->parent->nChildXids] = XidFromFullTransactionId(s->fullTransactionId);
if (s->nChildXids > 0)
memcpy(&s->parent->childXids[s->parent->nChildXids + 1],
@@ -1809,7 +1889,7 @@ StartTransaction(void)
s = &TopTransactionStateData;
CurrentTransactionState = s;
- Assert(XactTopTransactionId == InvalidTransactionId);
+ Assert(XidFromFullTransactionId(XactTopFullTransactionId) == InvalidTransactionId);
/* check the current transaction state */
Assert(s->state == TRANS_DEFAULT);
@@ -1821,7 +1901,7 @@ StartTransaction(void)
* flags are fetched below.
*/
s->state = TRANS_START;
- s->transactionId = InvalidTransactionId; /* until assigned */
+ s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
/*
* initialize current transaction state fields
@@ -2165,7 +2245,7 @@ CommitTransaction(void)
AtCommit_Memory();
- s->transactionId = InvalidTransactionId;
+ s->fullTransactionId = InvalidFullTransactionId;
s->subTransactionId = InvalidSubTransactionId;
s->nestingLevel = 0;
s->gucNestLevel = 0;
@@ -2173,7 +2253,7 @@ CommitTransaction(void)
s->nChildXids = 0;
s->maxChildXids = 0;
- XactTopTransactionId = InvalidTransactionId;
+ XactTopFullTransactionId = InvalidFullTransactionId;
nParallelCurrentXids = 0;
/*
@@ -2448,7 +2528,7 @@ PrepareTransaction(void)
AtCommit_Memory();
- s->transactionId = InvalidTransactionId;
+ s->fullTransactionId = InvalidFullTransactionId;
s->subTransactionId = InvalidSubTransactionId;
s->nestingLevel = 0;
s->gucNestLevel = 0;
@@ -2456,7 +2536,7 @@ PrepareTransaction(void)
s->nChildXids = 0;
s->maxChildXids = 0;
- XactTopTransactionId = InvalidTransactionId;
+ XactTopFullTransactionId = InvalidFullTransactionId;
nParallelCurrentXids = 0;
/*
@@ -2686,7 +2766,7 @@ CleanupTransaction(void)
AtCleanup_Memory(); /* and transaction memory */
- s->transactionId = InvalidTransactionId;
+ s->fullTransactionId = InvalidFullTransactionId;
s->subTransactionId = InvalidSubTransactionId;
s->nestingLevel = 0;
s->gucNestLevel = 0;
@@ -2695,7 +2775,7 @@ CleanupTransaction(void)
s->maxChildXids = 0;
s->parallelModeLevel = 0;
- XactTopTransactionId = InvalidTransactionId;
+ XactTopFullTransactionId = InvalidFullTransactionId;
nParallelCurrentXids = 0;
/*
@@ -4693,7 +4773,7 @@ CommitSubTransaction(void)
*/
/* Post-commit cleanup */
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
AtSubCommit_childXids();
AfterTriggerEndSubXact(true);
AtSubCommit_Portals(s->subTransactionId,
@@ -4718,8 +4798,8 @@ CommitSubTransaction(void)
* The only lock we actually release here is the subtransaction XID lock.
*/
CurrentResourceOwner = s->curTransactionOwner;
- if (TransactionIdIsValid(s->transactionId))
- XactLockTableDelete(s->transactionId);
+ if (FullTransactionIdIsValid(s->fullTransactionId))
+ XactLockTableDelete(XidFromFullTransactionId(s->fullTransactionId));
/*
* Other locks should get transferred to their parent resource owner.
@@ -4872,7 +4952,7 @@ AbortSubTransaction(void)
(void) RecordTransactionAbort(true);
/* Post-abort cleanup */
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
AtSubAbort_childXids();
CallSubXactCallbacks(SUBXACT_EVENT_ABORT_SUB, s->subTransactionId,
@@ -4985,7 +5065,7 @@ PushTransaction(void)
* We can now stack a minimally valid subtransaction without fear of
* failure.
*/
- s->transactionId = InvalidTransactionId; /* until assigned */
+ s->fullTransactionId = InvalidFullTransactionId; /* until assigned */
s->subTransactionId = currentSubTransactionId;
s->parent = p;
s->nestingLevel = p->nestingLevel + 1;
@@ -5052,18 +5132,17 @@ Size
EstimateTransactionStateSpace(void)
{
TransactionState s;
- Size nxids = 6; /* iso level, deferrable, top & current XID,
- * command counter, XID count */
+ Size nxids = 0;
+ Size size = SerializedTransactionStateHeaderSize;
for (s = CurrentTransactionState; s != NULL; s = s->parent)
{
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
nxids = add_size(nxids, 1);
nxids = add_size(nxids, s->nChildXids);
}
- nxids = add_size(nxids, nParallelCurrentXids);
- return mul_size(nxids, sizeof(TransactionId));
+ return add_size(size, sizeof(SerializedTransactionState) * nxids);
}
/*
@@ -5072,14 +5151,10 @@ EstimateTransactionStateSpace(void)
* needed by a parallel worker.
*
* We need to save and restore XactDeferrable, XactIsoLevel, and the XIDs
- * associated with this transaction. The first eight bytes of the result
- * contain XactDeferrable and XactIsoLevel; the next twelve bytes contain the
- * XID of the top-level transaction, the XID of the current transaction
- * (or, in each case, InvalidTransactionId if none), and the current command
- * counter. After that, the next 4 bytes contain a count of how many
- * additional XIDs follow; this is followed by all of those XIDs one after
- * another. We emit the XIDs in sorted order for the convenience of the
- * receiving process.
+ * associated with this transaction. These are serialized into a
+ * caller-supplied buffer big enough to hold the number of bytes reported by
+ * EstimateTransactionStateSpace(). We emit the XIDs in sorted order for the
+ * convenience of the receiving process.
*/
void
SerializeTransactionState(Size maxsize, char *start_address)
@@ -5087,16 +5162,17 @@ SerializeTransactionState(Size maxsize, char *start_address)
TransactionState s;
Size nxids = 0;
Size i = 0;
- Size c = 0;
TransactionId *workspace;
- TransactionId *result = (TransactionId *) start_address;
+ SerializedTransactionState *result;
+
+ result = (SerializedTransactionState *) start_address;
- result[c++] = (TransactionId) XactIsoLevel;
- result[c++] = (TransactionId) XactDeferrable;
- result[c++] = XactTopTransactionId;
- result[c++] = CurrentTransactionState->transactionId;
- result[c++] = (TransactionId) currentCommandId;
- Assert(maxsize >= c * sizeof(TransactionId));
+ result->xactIsoLevel = XactIsoLevel;
+ result->xactDeferrable = XactDeferrable;
+ result->topFullTransactionId = XactTopFullTransactionId;
+ result->currentFullTransactionId =
+ CurrentTransactionState->fullTransactionId;
+ result->currentCommandId = currentCommandId;
/*
* If we're running in a parallel worker and launching a parallel worker
@@ -5105,9 +5181,8 @@ SerializeTransactionState(Size maxsize, char *start_address)
*/
if (nParallelCurrentXids > 0)
{
- result[c++] = nParallelCurrentXids;
- Assert(maxsize >= (nParallelCurrentXids + c) * sizeof(TransactionId));
- memcpy(&result[c], ParallelCurrentXids,
+ result->nParallelCurrentXids = nParallelCurrentXids;
+ memcpy(&result->parallelCurrentXids[0], ParallelCurrentXids,
nParallelCurrentXids * sizeof(TransactionId));
return;
}
@@ -5118,18 +5193,19 @@ SerializeTransactionState(Size maxsize, char *start_address)
*/
for (s = CurrentTransactionState; s != NULL; s = s->parent)
{
- if (TransactionIdIsValid(s->transactionId))
+ if (FullTransactionIdIsValid(s->fullTransactionId))
nxids = add_size(nxids, 1);
nxids = add_size(nxids, s->nChildXids);
}
- Assert((c + 1 + nxids) * sizeof(TransactionId) <= maxsize);
+ Assert(SerializedTransactionStateHeaderSize + nxids * sizeof(TransactionId)
+ <= maxsize);
/* Copy them to our scratch space. */
workspace = palloc(nxids * sizeof(TransactionId));
for (s = CurrentTransactionState; s != NULL; s = s->parent)
{
- if (TransactionIdIsValid(s->transactionId))
- workspace[i++] = s->transactionId;
+ if (FullTransactionIdIsValid(s->fullTransactionId))
+ workspace[i++] = XidFromFullTransactionId(s->fullTransactionId);
memcpy(&workspace[i], s->childXids,
s->nChildXids * sizeof(TransactionId));
i += s->nChildXids;
@@ -5140,8 +5216,9 @@ SerializeTransactionState(Size maxsize, char *start_address)
qsort(workspace, nxids, sizeof(TransactionId), xidComparator);
/* Copy data into output area. */
- result[c++] = (TransactionId) nxids;
- memcpy(&result[c], workspace, nxids * sizeof(TransactionId));
+ result->nParallelCurrentXids = nxids;
+ memcpy(&result->parallelCurrentXids[0], workspace,
+ nxids * sizeof(TransactionId));
}
/*
@@ -5152,18 +5229,20 @@ SerializeTransactionState(Size maxsize, char *start_address)
void
StartParallelWorkerTransaction(char *tstatespace)
{
- TransactionId *tstate = (TransactionId *) tstatespace;
+ SerializedTransactionState *tstate;
Assert(CurrentTransactionState->blockState == TBLOCK_DEFAULT);
StartTransaction();
- XactIsoLevel = (int) tstate[0];
- XactDeferrable = (bool) tstate[1];
- XactTopTransactionId = tstate[2];
- CurrentTransactionState->transactionId = tstate[3];
- currentCommandId = tstate[4];
- nParallelCurrentXids = (int) tstate[5];
- ParallelCurrentXids = &tstate[6];
+ tstate = (SerializedTransactionState *) tstatespace;
+ XactIsoLevel = tstate->xactIsoLevel;
+ XactDeferrable = tstate->xactDeferrable;
+ XactTopFullTransactionId = tstate->topFullTransactionId;
+ CurrentTransactionState->fullTransactionId =
+ tstate->currentFullTransactionId;
+ currentCommandId = tstate->currentCommandId;
+ nParallelCurrentXids = tstate->nParallelCurrentXids;
+ ParallelCurrentXids = &tstate->parallelCurrentXids[0];
CurrentTransactionState->blockState = TBLOCK_PARALLEL_INPROGRESS;
}
@@ -5222,7 +5301,7 @@ ShowTransactionStateRec(const char *str, TransactionState s)
PointerIsValid(s->name) ? s->name : "unnamed",
BlockStateAsString(s->blockState),
TransStateAsString(s->state),
- (unsigned int) s->transactionId,
+ (unsigned int) XidFromFullTransactionId(s->fullTransactionId),
(unsigned int) s->subTransactionId,
(unsigned int) currentCommandId,
currentCommandIdUsed ? " (used)" : "",
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6a919084c8f..7966a9e90ba 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -49,6 +49,7 @@
#define U64FromFullTransactionId(x) ((x).value)
#define FullTransactionIdPrecedes(a, b) ((a).value < (b).value)
#define FullTransactionIdIsValid(x) TransactionIdIsValid(XidFromFullTransactionId(x))
+#define InvalidFullTransactionId FullTransactionIdFromEpochAndXid(0, InvalidTransactionId)
/*
* A 64 bit value that contains an epoch and a TransactionId. This is
@@ -221,7 +222,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern TransactionId GetNewTransactionId(bool isSubXact);
+extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
extern void SetTransactionIdLimit(TransactionId oldest_datfrozenxid,
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index e8579dcd478..b550343c4db 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -14,6 +14,7 @@
#ifndef XACT_H
#define XACT_H
+#include "access/transam.h"
#include "access/xlogreader.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
@@ -355,6 +356,10 @@ extern TransactionId GetCurrentTransactionId(void);
extern TransactionId GetCurrentTransactionIdIfAny(void);
extern TransactionId GetStableLatestTransactionId(void);
extern SubTransactionId GetCurrentSubTransactionId(void);
+extern FullTransactionId GetTopFullTransactionId(void);
+extern FullTransactionId GetTopFullTransactionIdIfAny(void);
+extern FullTransactionId GetCurrentFullTransactionId(void);
+extern FullTransactionId GetCurrentFullTransactionIdIfAny(void);
extern void MarkCurrentTransactionIdLoggedIfAny(void);
extern bool SubTransactionIsActive(SubTransactionId subxid);
extern CommandId GetCurrentCommandId(bool used);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a4dd0f46429..ac43aa3d5b4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2107,6 +2107,7 @@ SeqTableData
SerCommitSeqNo
SerializedReindexState
SerializedSnapshotData
+SerializedTransactionState
Session
SessionBackupState
SetConstraintState
--
2.21.0
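The commit message for patch 0002 above motivates 64-bit transaction IDs by access methods that would rather not "freeze" old references before the 32-bit counter wraps. A minimal standalone sketch of why that follows is given here. It uses plain integer stand-ins, not the PostgreSQL headers, and the names xid_precedes, full_xid_precedes, make_full_xid and check_blind_spot are hypothetical. The 32-bit comparison is circular and only orders values that are less than 2^31 apart, while the epoch-extended value is ordered by an ordinary integer compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's 32-bit TransactionId (assumption: not the real header). */
typedef uint32_t TransactionId;

/* 32-bit circular comparison: only meaningful for XIDs within 2^31 of each other. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Pack an epoch and a 32-bit XID into one 64-bit value, epoch in the high bits. */
static uint64_t
make_full_xid(uint32_t epoch, TransactionId xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/* 64-bit comparison: a plain integer compare, valid across wraparounds. */
static bool
full_xid_precedes(uint64_t a, uint64_t b)
{
    return a < b;
}

/* Usage check: an XID more than 2^31 behind is misordered in 32 bits only. */
static void
check_blind_spot(void)
{
    TransactionId old_xid = 100;
    TransactionId new_xid = 100 + (UINT32_C(1) << 31) + 1;

    /* The circular compare no longer sees old_xid as older. */
    assert(!xid_precedes(old_xid, new_xid));
    /* The 64-bit compare still orders them correctly. */
    assert(full_xid_precedes(make_full_xid(0, old_xid),
                             make_full_xid(0, new_xid)));
}
```

The 32-bit form stays correct as long as old references are frozen before they fall 2^31 behind; the 64-bit form does not depend on that.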
On 27/03/2019 13:44, Thomas Munro wrote:
* I tidied up the code that serialises transaction state. It was
already hammering round pegs into square holes, and the previous patch
made that even worse, so I added a new struct
SerializedTransactionState to do this properly.
Ah, good, it used to be confusing.
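For readers unfamiliar with the pattern, here is a minimal standalone sketch of serializing a fixed header plus a variable-length XID array through a struct with a flexible array member. It is not the patch's code: the struct SerializedState, its fields, and the helpers serialized_state_size and serialize_state are hypothetical, and the caller is assumed to supply a suitably aligned buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for PostgreSQL's 32-bit TransactionId (assumption). */
typedef uint32_t TransactionId;

/* Hypothetical serialized form: fixed fields followed by nxids XIDs. */
typedef struct SerializedState
{
    int           iso_level;
    uint64_t      top_full_xid;
    int           nxids;
    TransactionId xids[];       /* flexible array member */
} SerializedState;

/* Size of the fixed part, not including the trailing array. */
#define SERIALIZED_STATE_HEADER_SIZE offsetof(SerializedState, xids)

/* Bytes needed for a header plus nxids array elements. */
static size_t
serialized_state_size(size_t nxids)
{
    return SERIALIZED_STATE_HEADER_SIZE + nxids * sizeof(TransactionId);
}

/* Fill a caller-supplied, suitably aligned buffer of at least bufsize bytes. */
static void
serialize_state(char *buf, size_t bufsize, int iso_level,
                uint64_t top_full_xid, const TransactionId *xids, int nxids)
{
    SerializedState *out = (SerializedState *) buf;

    assert(serialized_state_size((size_t) nxids) <= bufsize);
    out->iso_level = iso_level;
    out->top_full_xid = top_full_xid;
    out->nxids = nxids;
    memcpy(out->xids, xids, (size_t) nxids * sizeof(TransactionId));
}
```

Note that the space estimate multiplies the element count by the size of one array element, not by the size of the whole struct.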
I still need to look into Andres's suggestion about getting rid of
epoch from various user interfaces and showing 64 bit numbers. I
should probably also find a place in the relevant README to explain
this new scheme. I will post follow-up patches for those.
Once we have the FullTransactionId type and basic macros in place, I'm
sure we could tidy up a bunch of code by using them. For example,
TransactionIdInRecentPast() in walsender.c would be simpler, if the
caller dealt with FullTransactionIds rather than xid+epoch. But it makes
sense to do that separately.
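For readers following along, here is a minimal standalone sketch of what such a type and its basic helpers could look like. It is modelled on the names used in this thread, but the definitions below are illustrative stand-ins written with plain stdint types, not a copy of the PostgreSQL headers. Wrapping the 64-bit value in a struct is what keeps it from converting silently to or from a 32-bit xid.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins; the real definitions live in access/transam.h. */
typedef uint32_t TransactionId;

typedef struct FullTransactionId
{
    uint64_t    value;          /* epoch in the high 32 bits, xid in the low 32 */
} FullTransactionId;

/* Extract the epoch (number of 32-bit wraparounds so far). */
static inline uint32_t
EpochFromFullTransactionId(FullTransactionId fxid)
{
    return (uint32_t) (fxid.value >> 32);
}

/* Extract the classic 32-bit xid. */
static inline TransactionId
XidFromFullTransactionId(FullTransactionId fxid)
{
    return (TransactionId) fxid.value;
}

/* Combine an epoch and a 32-bit xid into one 64-bit value. */
static inline FullTransactionId
FullTransactionIdFromEpochAndXid(uint32_t epoch, TransactionId xid)
{
    FullTransactionId result;

    result.value = ((uint64_t) epoch << 32) | xid;
    return result;
}
```

Code that takes a FullTransactionId then needs no separate epoch argument, which is the kind of simplification suggested above for callers that currently pass xid+epoch.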
+/*
+ * Advance nextFullXid to the value after a given xid. The epoch is inferred.
+ * This must only be called during recovery or from two-phase start-up code.
+ */
+void
+AdvanceNextFullTransactionIdPastXid(TransactionId xid)
+{
+    FullTransactionId newNextFullXid;
+    TransactionId next_xid;
+    uint32 epoch;
+
+    /*
+     * It is safe to read nextFullXid without a lock, because this is only
+     * called from the startup process, meaning that no other process can
+     * modify it.
+     */
+    Assert(AmStartupProcess());
+
This assertion fails on WAL replay in single-user mode:
$ bin/postgres --single -D data postgres
2019-03-27 14:32:35.058 EET [32359] LOG: database system was
interrupted; last known up at 2019-03-27 14:32:18 EET
2019-03-27 14:32:35.144 EET [32359] LOG: database system was not
properly shut down; automatic recovery in progress
2019-03-27 14:32:35.148 EET [32359] LOG: redo starts at 0/15BB7B0
TRAP: FailedAssertion("!((MyAuxProcType == StartupProcess))", File:
"varsup.c", Line: 269)
Aborted
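As an aside on the "epoch is inferred" comment in the quoted hunk, the following is a rough sketch of how such an inference can work. It is not the code from the patch: the function name, the pure-function shape (taking and returning the counter instead of touching shared memory) and the constant are assumptions made for illustration, and the handling of special xids is simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: xids 0..2 are reserved, so the first normal xid is 3. */
#define FIRST_NORMAL_XID ((uint32_t) 3)

/* Modulo-2^32 comparison: true if a is at or ahead of b on the xid circle. */
static bool
xid_follows_or_equals(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) >= 0;
}

/*
 * Return the new 64-bit "next xid" after observing a 32-bit xid.
 * If the xid is behind the counter, nothing changes.
 */
static uint64_t
advance_next_full_xid_past(uint64_t next_full_xid, uint32_t xid)
{
    uint32_t    next_xid = (uint32_t) next_full_xid;
    uint32_t    epoch = (uint32_t) (next_full_xid >> 32);

    if (!xid_follows_or_equals(xid, next_xid))
        return next_full_xid;

    /* Step to the value after xid, skipping the reserved xids on wrap. */
    xid++;
    if (xid < FIRST_NORMAL_XID)
        xid = FIRST_NORMAL_XID;

    /* Logically ahead but numerically smaller: the low 32 bits wrapped. */
    if (xid < next_xid)
        epoch++;

    return ((uint64_t) epoch << 32) | xid;
}
```

The point is that the epoch is bumped exactly when the observed xid is ahead of the counter in circular terms yet numerically below it, so no separate epoch bookkeeping at checkpoint time is needed.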
- Heikki
On Thu, Mar 28, 2019 at 1:48 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Once we have the FullTransactionId type and basic macros in place, I'm
sure we could tidy up a bunch of code by using them. For example,
TransactionIdInRecentPast() in walsender.c would be simpler, if the
caller dealt with FullTransactionIds rather than xid+epoch. But it makes
sense to do that separately.
+1
+    /*
+     * It is safe to read nextFullXid without a lock, because this is only
+     * called from the startup process, meaning that no other process can
+     * modify it.
+     */
+    Assert(AmStartupProcess());
+
This assertion fails on WAL replay in single-user mode:
Fixed. (Embarrassingly I had that working in v7 but broke it in v8).
I decided to do some testing on a 32 bit system, and ran into a weird
new problem in heap_compute_xid_horizon_for_tuples() which I assumed
to be somehow my fault due to the mention of xid horizons, but I
eventually realised that master was broken on that machine and
followed that up elsewhere. Phew.
Thanks for the reviews! Pushed.
--
Thomas Munro
https://enterprisedb.com
On Thu, Mar 28, 2019 at 1:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Mar 28, 2019 at 1:48 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Once we have the FullTransactionId type and basic macros in place, I'm
sure we could tidy up a bunch of code by using them.
Thanks for the reviews! Pushed.
I think that this might be broken.
We have this change:
@@ -73,7 +75,8 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ full_xid = ShmemVariableCache->nextFullXid;
+ xid = XidFromFullTransactionId(full_xid);
But then later on in a little-used code path around line 164:
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
full_xid does not get updated, but then later on full_xid gets returned in
lieu of xid.
Is there a reason that this is OK?
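To make the concern concrete, here is a small self-contained sketch of the pattern being described. Everything in it (XidCounter, AssignedXid, lock_released_window, assign_xid, and the refresh_full_xid switch) is hypothetical scaffolding for illustration, not the varsup.c code; it only shows what happens when the narrow value is re-read after the lock is re-acquired but the wide value is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared nextFullXid counter. */
typedef struct XidCounter
{
    uint64_t    next_full_xid;
} XidCounter;

typedef struct AssignedXid
{
    uint32_t    xid;            /* narrow value used for bookkeeping */
    uint64_t    full_xid;       /* wide value handed back to the caller */
} AssignedXid;

/* Pretend other backends consumed ten xids while the lock was released. */
static void
lock_released_window(XidCounter *counter)
{
    counter->next_full_xid += 10;
}

static AssignedXid
assign_xid(XidCounter *counter, bool must_retry, bool refresh_full_xid)
{
    AssignedXid result;

    result.full_xid = counter->next_full_xid;
    result.xid = (uint32_t) result.full_xid;

    if (must_retry)
    {
        lock_released_window(counter);

        /* Re-acquire lock and start over: both copies must be re-read. */
        if (refresh_full_xid)
            result.full_xid = counter->next_full_xid;
        result.xid = (uint32_t) counter->next_full_xid;
    }

    counter->next_full_xid++;   /* consume the xid */
    return result;
}
```

With refresh_full_xid set to false the two fields disagree after a retry, which is the symptom described above; re-reading both values after re-acquiring the lock keeps them consistent.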
Cheers,
Jeff
On Sat, May 4, 2019 at 1:34 PM Jeff Janes <jeff.janes@gmail.com> wrote:
On Thu, Mar 28, 2019 at 1:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Mar 28, 2019 at 1:48 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Once we have the FullTransactionId type and basic macros in place, I'm
sure we could tidy up a bunch of code by using them.
Thanks for the reviews! Pushed.
I think that this might be broken.
We have this change:
@@ -73,7 +75,8 @@ GetNewTransactionId(bool isSubXact)
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
- xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
+ full_xid = ShmemVariableCache->nextFullXid;
+ xid = XidFromFullTransactionId(full_xid);
But then later on in a little-used code path around line 164:
/* Re-acquire lock and start over */
LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
full_xid does not get updated, but then later on full_xid gets returned in
lieu of xid.
Is there a reason that this is OK?
Ah, sorry for the noise. I see that this has been fixed already. I wasn't
looking at HEAD, or at the other email thread, when I "discovered" this.
Sorry for the noise.
Cheers,
Jeff
On Thu, Mar 28, 2019 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks for the reviews! Pushed.
Any ideas on whether we should move towards 64-bit xids in more places?
That has been discussed several times already. I think the last time it
was discussed in person was at the FOSDEM PGDay 2018 Developer Meeting [1].
There we discussed that it probably isn't worth changing the
32-bit on-disk xids in heap. It's better to leave the existing heap "as
is", but allow other pluggable table access methods to support 64-bit
xids. Given that we now have pluggable table access methods, we may build
a plan for full support of 64-bit xids in core.
As I see it, a sketchy plan may look like this.
1. Change all non-heap types of xids from TransactionId to
FullTransactionId. But in shared memory change TransactionId to
pg_atomic_uint64 in order to guarantee atomicity of access, which we
require in multiple places.
2. Also introduce FullMultixactId, and apply to MultixactId the
similar change as #1.
3. Change SLRU on-disk format to handle 64-bit xids and multixacts.
In particular make SLRU page numbers 64-bit too. Add SLRU upgrade
procedure to pg_upgrade.
4. Change relfrozenxid and relminmxid to something like rellimitxid
and rellimitmxid. So, we don't imply there is a restriction to 32-bit
xids. Instead, we let the table AM place (or not place) a limit on
advancing nextXid, nextMultixact.
5. Table AM API would be switched to 64-bit xids/multixacts, while
heap will remain using 32-bit. So, heap should convert a 32-bit xid
value to 64-bit based on nextXid/nextMultixact. Assuming we set
rellimitxid and rellimitmxid for a relation to oldestxid + 2^32 and
oldestmxid + 2^32, we may safely assume the 32-bit values of
xids/multixacts are in the 2^32 range before nextXid/nextMultixact.
Thanks to this, even in heap we would be able to operate on 2^32
xids/multixacts simultaneously instead of the 2^31 we have now.
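The conversion in step 5 can be sketched as follows. This is only an illustration of the arithmetic under the stated assumption that every stored 32-bit xid lies in the 2^32 range immediately before the 64-bit nextXid; the function name widen_xid is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen a 32-bit xid read from a heap page to 64 bits, assuming it lies
 * in the half-open range [next_full_xid - 2^32, next_full_xid).
 */
static uint64_t
widen_xid(uint32_t xid, uint64_t next_full_xid)
{
    uint64_t    epoch = next_full_xid >> 32;
    uint32_t    next_lo = (uint32_t) next_full_xid;

    /*
     * A value at or above the low half of the counter cannot belong to the
     * current epoch (it would not be "before" nextXid), so it must come
     * from the previous one.
     */
    if (xid >= next_lo)
    {
        assert(epoch > 0);      /* otherwise the assumption is violated */
        epoch--;
    }

    return (epoch << 32) | xid;
}
```

Because the window is a full 2^32 wide rather than 2^31, each 32-bit value maps to exactly one 64-bit value inside it, which is where the doubling of usable xid space in the last sentence of step 5 comes from.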
So, this is how I see the possible plan. I would be glad to get feedback.
Unfortunately, I don't have enough time to implement all of this. But
I think there are many interested parties in finally having 64-bit
xids in core. Especially given that we now have pluggable table AMs,
it would be ridiculous to spread the limitation of 32-bit xids to new
table AMs.
Links
1. https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2018_Developer_Meeting
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Jun 21, 2019 at 7:01 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Thu, Mar 28, 2019 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks for the reviews! Pushed.
Any ideas on whether we should move towards 64-bit xids in more places?
That has been discussed several times already. I think the last time it
was discussed in person was at the FOSDEM PGDay 2018 Developer Meeting [1].
There we discussed that it probably isn't worth changing the
32-bit on-disk xids in heap. It's better to leave the existing heap "as
is", but allow other pluggable table access methods to support 64-bit
xids. Given that we now have pluggable table access methods, we may build
a plan for full support of 64-bit xids in core.
As I see it, a sketchy plan may look like this.
1. Change all non-heap types of xids from TransactionId to
FullTransactionId.
I think it's fine to replace TransactionId with FullTransactionId
without stressing about it too much in code that's not that heavily
trafficked. However, I'm not sure we can do that across the board. For
example, converting snapshots to use 64-bit XIDs would mean that in
the worst case a snapshot will use twice as many cache lines, and that
might have performance implications on some workloads.
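As a back-of-the-envelope illustration of the cache line point, the helper below counts how many lines an array of in-progress xids would span at 4 versus 8 bytes per entry. The function name is made up, and the 64-byte line size used in the comments is an assumption about typical hardware, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cache lines touched by an aligned array of nxids entries. */
static size_t
cache_lines_needed(size_t nxids, size_t bytes_per_xid, size_t line_size)
{
    assert(line_size > 0);
    return (nxids * bytes_per_xid + line_size - 1) / line_size;
}

/*
 * Example with an assumed 64-byte line:
 *   cache_lines_needed(100, 4, 64)  -> 400 bytes, 7 lines
 *   cache_lines_needed(100, 8, 64)  -> 800 bytes, 13 lines
 */
```

For arrays much larger than one line the count approaches twice as many, which matches the worst case described above.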
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On June 24, 2019 8:19:13 AM PDT, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jun 21, 2019 at 7:01 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Thu, Mar 28, 2019 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks for the reviews! Pushed.
Any ideas on whether we should move towards 64-bit xids in more places?
That has been discussed several times already. I think the last time it
was discussed in person was at the FOSDEM PGDay 2018 Developer Meeting [1].
There we discussed that it probably isn't worth changing the
32-bit on-disk xids in heap. It's better to leave the existing heap "as
is", but allow other pluggable table access methods to support 64-bit
xids. Given that we now have pluggable table access methods, we may build
a plan for full support of 64-bit xids in core.
As I see it, a sketchy plan may look like this.
1. Change all non-heap types of xids from TransactionId to
FullTransactionId.
I think it's fine to replace TransactionId with FullTransactionId
without stressing about it too much in code that's not that heavily
trafficked. However, I'm not sure we can do that across the board. For
example, converting snapshots to use 64-bit XIDs would mean that in
the worst case a snapshot will use twice as many cache lines, and that
might have performance implications on some workloads.
We could probably expand the transaction IDs on access or when computing
them for most of these, as usually they'll largely be about currently
running transactions. E.g. it seems sensible to keep the procarray at
32-bit xids, expand xmin/xmax to 64 bits when computing snapshots, and
leave the snapshot's transaction ID array at 32-bit xids. That ought to
be a negligible overhead.
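A rough sketch of the layout suggested here, with 64-bit bounds and a 32-bit array, might look like this. SnapshotSketch and xid_in_progress are hypothetical names for illustration only, and the lookup relies on the assumption that xmin and xmax are less than 2^32 apart, so the low 32 bits identify an xid uniquely inside that window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot layout: wide bounds, narrow in-progress array. */
typedef struct SnapshotSketch
{
    uint64_t        xmin;       /* every xid below this has finished */
    uint64_t        xmax;       /* every xid at or above this is still unseen */
    const uint32_t *xip;        /* in-progress xids, low 32 bits only */
    size_t          xcnt;       /* number of entries in xip */
} SnapshotSketch;

/* Is the given 64-bit xid considered in progress by this snapshot? */
static bool
xid_in_progress(const SnapshotSketch *snap, uint64_t fxid)
{
    size_t      i;

    if (fxid < snap->xmin)
        return false;           /* finished before the snapshot was taken */
    if (fxid >= snap->xmax)
        return true;            /* started after the snapshot was taken */

    /* Inside the window the low 32 bits are unambiguous (assumption above). */
    for (i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == (uint32_t) fxid)
            return true;
    }
    return false;
}
```

The two range checks are plain 64-bit comparisons with no wraparound logic, while the array keeps its current memory footprint.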
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On 2019-Jun-22, Alexander Korotkov wrote:
2. Also introduce FullMultixactId, and apply to MultixactId the
similar change as #1.
3. Change SLRU on-disk format to handle 64-bit xids and multixacts.
In particular make SLRU page numbers 64-bit too. Add SLRU upgrade
procedure to pg_upgrade.
I think enlarging multixacts to 64 bits is a terrible idea. I would
rather get rid of multixacts completely; zheap is proposing not to use
multixacts at all, for example. The amount of bloat caused by
pg_multixact data is already pretty bad ... because of which requiring
pg_multixact to be rewritten by pg_upgrade would cause a severe slowdown
for upgrades. (It worked for FSM because the volume is tiny, but that's
not the case for multixact.)
I think the pg_upgrade problem can be worked around by creating a new
dir pg_multixact64 (an example) which is populated from the upgrade
point onwards; so you keep the old data unchanged, and new multixacts
use the new location, but the system knows to read the old one when
reading old tuples. But, as I said above, I would rather not have
multixacts at all.
Another idea: create a new table AM that mimics heapam (I think ß-heap
"eszett-heap" is a good name), except that it reuses zheap's idea of
keeping "transaction locks" separately for tuple locks rather than
multixacts; heapam continues to use 32bits multixact. Tables can be
migrated from heapam to ß-heap (alter table ... set access method) to
incrementally reduce reliance on multixacts going forward. No need for
pg_upgrade-time disruption.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi!
On Mon, Jun 24, 2019 at 6:27 PM Andres Freund <andres@anarazel.de> wrote:
On June 24, 2019 8:19:13 AM PDT, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jun 21, 2019 at 7:01 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
On Thu, Mar 28, 2019 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks for the reviews! Pushed.
Any ideas on whether we should move towards 64-bit xids in more places?
That has been discussed several times already. I think the last time it
was discussed in person was at the FOSDEM PGDay 2018 Developer Meeting [1].
There we discussed that it probably isn't worth changing the
32-bit on-disk xids in heap. It's better to leave the existing heap "as
is", but allow other pluggable table access methods to support 64-bit
xids. Given that we now have pluggable table access methods, we may build
a plan for full support of 64-bit xids in core.
As I see it, a sketchy plan may look like this.
1. Change all non-heap types of xids from TransactionId to
FullTransactionId.
I think it's fine to replace TransactionId with FullTransactionId
without stressing about it too much in code that's not that heavily
trafficked. However, I'm not sure we can do that across the board. For
example, converting snapshots to use 64-bit XIDs would mean that in
the worst case a snapshot will use twice as many cache lines, and that
might have performance implications on some workloads.
We could probably expand the transaction IDs on access or when computing
them for most of these, as usually they'll largely be about currently
running transactions. E.g. it seems sensible to keep the procarray at
32-bit xids, expand xmin/xmax to 64 bits when computing snapshots, and
leave the snapshot's transaction ID array at 32-bit xids. That ought to
be a negligible overhead.
I see; replacing TransactionId with FullTransactionId just everywhere
doesn't look like a good idea.
Given that we now have pluggable table AMs, new table AMs may not have the
data wraparound problem. For instance, a table AM could store xids as
64-bit internally, and convert them to 32-bit on the fly. If an xid is too
old for conversion, just replace it with FrozenTransactionId. So, the
restriction we really have now is that running xacts and active
snapshots should fit in a 2^31 range. Turning Snapshot xmin/xmax into
64-bit will soften this restriction; then just running xacts should
fit in a 2^31 range while snapshots could be older.
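The on-the-fly narrowing described above could be sketched like this. The function name narrow_xid and the way the cutoff is expressed are assumptions for illustration; the value 2 for the frozen xid matches PostgreSQL's FrozenTransactionId, and the sketch ignores the fact that low 32-bit values 0..2 are reserved.

```c
#include <assert.h>
#include <stdint.h>

/* PostgreSQL reserves xid 2 as FrozenTransactionId. */
#define FROZEN_XID ((uint32_t) 2)

/*
 * Convert a 64-bit xid stored by a hypothetical table AM into a 32-bit
 * xid that the rest of the system can compare modulo 2^32. Anything
 * 2^31 or more behind the next xid cannot be represented safely and is
 * reported as frozen instead.
 */
static uint32_t
narrow_xid(uint64_t fxid, uint64_t next_full_xid)
{
    assert(fxid < next_full_xid);   /* stored xids are always in the past */

    if (next_full_xid - fxid >= (UINT64_C(1) << 31))
        return FROZEN_XID;

    return (uint32_t) fxid;
}
```

The design point is that the table itself never needs a freeze pass: old xids stay on disk in 64-bit form and only look frozen to code that still works with 32-bit values.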
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, Jun 24, 2019 at 8:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-Jun-22, Alexander Korotkov wrote:
2. Also introduce FullMultixactId, and apply to MultixactId the
similar change as #1.
3. Change SLRU on-disk format to handle 64-bit xids and multixacts.
In particular make SLRU page numbers 64-bit too. Add SLRU upgrade
procedure to pg_upgrade.
I think enlarging multixacts to 64 bits is a terrible idea. I would
rather get rid of multixacts completely; zheap is proposing not to use
multixacts at all, for example. The amount of bloat caused by
pg_multixact data is already pretty bad ... because of which requiring
pg_multixact to be rewritten by pg_upgrade would cause a severe slowdown
for upgrades. (It worked for FSM because the volume is tiny, but that's
not the case for multixact.)
I think the pg_upgrade problem can be worked around by creating a new
dir pg_multixact64 (an example) which is populated from the upgrade
point onwards; so you keep the old data unchanged, and new multixacts
use the new location, but the system knows to read the old one when
reading old tuples. But, as I said above, I would rather not have
multixacts at all.
Another idea: create a new table AM that mimics heapam (I think ß-heap
"eszett-heap" is a good name), except that it reuses zheap's idea of
keeping "transaction locks" separately for tuple locks rather than
multixacts; heapam continues to use 32bits multixact. Tables can be
migrated from heapam to ß-heap (alter table ... set access method) to
incrementally reduce reliance on multixacts going forward. No need for
pg_upgrade-time disruption.
We need multixacts to store row-level lock information. I remember
they weren't crash-safe some time ago, because we basically don't need
lock information from the previous server run: all those locks are
surely released. Due to some difficulties we finally made them
crash-safe (I didn't follow that in much detail). But what about
discarding multixacts on pg_upgrade? Is it a feasible goal?
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, Jun 24, 2019 at 1:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I think enlarging multixacts to 64 bits is a terrible idea. I would
rather get rid of multixacts completely; zheap is proposing not to use
multixacts at all, for example.
But zedstore, at least as of the Saturday after PGCon, is proposing to
keep using them, after first widening them to 64 bits:
/messages/by-id/CA+TgmoYeTeQSmALox0PmSm5Gh03oe=UNjhmL+K+btofY_U2jFQ@mail.gmail.com
I think we all have a visceral reaction against MultiXacts at this
point; they've just caused us too much pain, and the SLRU
infrastructure is a bunch of crap.[1] However, the case where N
sessions all do "SELECT * FROM sometab FOR SHARE" is pretty wretched
without multixacts. You'll just have to keep reducing the tuple
density per page to fit all the locks, whereas the current heap is
totally fine and neither the heap nor the multixact space bloats all
that terribly much.
I currently think it's likely that zheap will use something
multixact-like rather than actually using multixacts per se. I am
having trouble seeing how we can have some sort of fixed-bit-width ID
number that represents a set of XIDs for purposes of lock-tracking
without creating some nasty degenerate cases that don't exist
today.[2]
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[1]: At least in comparison to other parts of the PostgreSQL
infrastructure which are more awesome.
[2]: I'd like to credit Andres Freund for helping me understand this
issue better.
On Sat, Jun 22, 2019 at 11:01 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
3. Change SLRU on-disk format to handle 64-bit xids and multixacts.
In particular make SLRU page numbers 64-bit too. Add SLRU upgrade
procedure to pg_upgrade.
+1 for doing this for the xid-indexed SLRUs so the segment file names
are never recycled. Having yet another level of wraparound code to
feed and water and debug is not nice:
/messages/by-id/20190202083822.GC32531@gust.leadboat.com
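To illustrate why 64-bit indexing removes the recycling, here is a small sketch that maps an xid to a segment number under an assumed clog-like layout: 8192-byte pages, 2 status bits per transaction (32768 transactions per page) and 32 pages per segment. The constants and function names are assumptions for the example, not a statement about what the eventual on-disk format would be.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 8 kB pages, 2 bits per xact, 32 pages per segment file. */
#define XACTS_PER_PAGE      UINT64_C(32768)
#define PAGES_PER_SEGMENT   UINT64_C(32)

/* Segment number when the SLRU is indexed by the 32-bit xid. */
static uint32_t
segment_for_xid32(uint32_t xid)
{
    return (uint32_t) (xid / XACTS_PER_PAGE / PAGES_PER_SEGMENT);
}

/* Segment number when the SLRU is indexed by the full 64-bit xid. */
static uint64_t
segment_for_full_xid(uint64_t fxid)
{
    return fxid / XACTS_PER_PAGE / PAGES_PER_SEGMENT;
}
```

With 32-bit indexing, xid 5 in epoch 0 and xid 5 in epoch 1 both land in segment 0, so the same file name comes around again after a wraparound; with 64-bit indexing the second one lands in segment 4096 and no name is ever reused.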
--
Thomas Munro
https://enterprisedb.com
On Sun, Jun 30, 2019 at 9:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 24, 2019 at 1:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I think enlarging multixacts to 64 bits is a terrible idea. I would
rather get rid of multixacts completely; zheap is proposing not to use
multixacts at all, for example.
But zedstore, at least as of the Saturday after PGCon, is proposing to
keep using them, after first widening them to 64 bits:
/messages/by-id/CA+TgmoYeTeQSmALox0PmSm5Gh03oe=UNjhmL+K+btofY_U2jFQ@mail.gmail.com
I think we all have a visceral reaction against MultiXacts at this
point; they've just caused us too much pain, and the SLRU
infrastructure is a bunch of crap.[1] However, the case where N
sessions all do "SELECT * FROM sometab FOR SHARE" is pretty wretched
without multixacts. You'll just have to keep reducing the tuple
density per page to fit all the locks, whereas the current heap is
totally fine and neither the heap nor the multixact space bloats all
that terribly much.
I currently think it's likely that zheap will use something
multixact-like rather than actually using multixacts per se. I am
having trouble seeing how we can have some sort of fixed-bit-width ID
number that represents a set of XIDs for purposes of lock-tracking
without creating some nasty degenerate cases that don't exist
today.[2]
The new thing described over here is intended to support something a
bit like multixacts:
/messages/by-id/CA+hUKGKni7EEU4FT71vZCCwPeaGb2PQOeKOFjQJavKnD577UMQ@mail.gmail.com
For example, here is some throw-away demo code that puts arbitrary
data into an undo log, in this case a list of xids given with SELECT
test_multixact(ARRAY[1234, 2345, 3456]), and provides a callback to
say when the data can be discarded, in this case when all of those
xids are no longer running. You can see the record with SELECT * FROM
undoinspect('shared'), until it gets eaten by a background worker.
The reason it has to be an in-core demo is just because we don't have
a way to do extensions that have an RMGR ID and callback functions
yet. Although this demo throws the return value away, the function
PrepareUndoInsert() returns a 64 bit UndoRecPtr which could be stored
in any page and can be used to retrieve the records (via the buffer
pool) including the binary payload which can be whatever struct you
like (though there is a size limit so you might need more than one
record for a huge list of multi-lockers). When you try to load the
record, you might be told that it's gone, which means that the lockers
are gone, which means that your callback must have said that was OK.
This probably makes most sense for a system that is already planning
to use UndoRecPtr for other things, like MVCC, since it probably
already has per page or per tuple space to store UndoRecPtr, and
updates and explicit locks are so closely related.
https://github.com/EnterpriseDB/zheap/commit/c92fdfd1f1178cbb44557a7c630b366d1539c332
--
Thomas Munro
https://enterprisedb.com