Hot Standby startup with overflowed snapshots
Chris Redekop's recent report of slow startup for Hot Standby has made
me revisit the code there.
Although there isn't a bug, there is a missed opportunity for starting
up faster which could be the source of Chris' annoyance.
The following patch allows a faster startup in some circumstances.
The patch also alters the log levels for messages and gives a single
simple message for this situation. The log will now say
LOG: recovery snapshot waiting for non-overflowed snapshot or until
oldest active xid on standby is at least %u (now %u)
...multiple times until snapshot non-overflowed or xid reached...
whereas before the first LOG message shown was
LOG: consistent state delayed because recovery snapshot incomplete
and only later, at DEBUG2 do you see
LOG: recovery snapshot waiting for %u oldest active xid on standby is %u
...multiple times until xid reached...
Comments please.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
faster_hot_standby_startup_withsubxacts.v1.patchapplication/octet-stream; name=faster_hot_standby_startup_withsubxacts.v1.patchDownload
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bbc5f44..fa86e7c 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -475,8 +475,9 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
/*
* If our initial RunningTransactionsData had an overflowed snapshot then
- * we knew we were missing some subxids from our snapshot. We can use this
- * data as an initial snapshot, but we cannot yet mark it valid. We know
+ * we knew we were missing some subxids from our snapshot. If we continue
+ * to see overflowed snapshots then we might never be able to start up,
+ * so we make another test to see if our snapshot is now valid. We know
* that the missing subxids are equal to or earlier than nextXid. After we
* initialise we continue to apply changes during recovery, so once the
* oldestRunningXid is later than the nextXid from the initial snapshot we
@@ -485,21 +486,31 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
*/
if (standbyState == STANDBY_SNAPSHOT_PENDING)
{
- if (TransactionIdPrecedes(standbySnapshotPendingXmin,
- running->oldestRunningXid))
+ /*
+ * If the snapshot isn't overflowed, reset our pending state and
+ * use this snapshot instead.
+ */
+ if (!running->subxid_overflow)
+ standbyState = STANDBY_INITIALIZED;
+ else
{
- standbyState = STANDBY_SNAPSHOT_READY;
- elog(trace_recovery(DEBUG2),
- "running xact data now proven complete");
- elog(trace_recovery(DEBUG2),
- "recovery snapshots are now enabled");
+ if (TransactionIdPrecedes(standbySnapshotPendingXmin,
+ running->oldestRunningXid))
+ {
+ standbyState = STANDBY_SNAPSHOT_READY;
+ elog(trace_recovery(DEBUG2),
+ "running xact data now proven complete");
+ elog(LOG,
+ "recovery snapshots are now enabled");
+ }
+ else
+ elog(LOG,
+ "recovery snapshot waiting for non-overflowed snapshot or "
+ "until oldest active xid on standby is at least %u (now %u)",
+ standbySnapshotPendingXmin,
+ running->oldestRunningXid);
+ return;
}
- else
- elog(trace_recovery(DEBUG2),
- "recovery snapshot waiting for %u oldest active xid on standby is %u",
- standbySnapshotPendingXmin,
- running->oldestRunningXid);
- return;
}
Assert(standbyState == STANDBY_INITIALIZED);
@@ -604,7 +615,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
standbyState = STANDBY_SNAPSHOT_READY;
standbySnapshotPendingXmin = InvalidTransactionId;
- procArray->lastOverflowedXid = InvalidTransactionId;
}
/*
@@ -627,13 +637,15 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
LWLockRelease(ProcArrayLock);
- elog(trace_recovery(DEBUG2), "running transaction data initialized");
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
- elog(trace_recovery(DEBUG2), "recovery snapshots are now enabled");
+ elog(LOG, "recovery snapshots are now enabled");
else
- ereport(LOG,
- (errmsg("consistent state delayed because recovery snapshot incomplete")));
+ elog(LOG,
+ "recovery snapshot waiting for non-overflowed snapshot or "
+ "until oldest active xid on standby is at least %u (now %u)",
+ standbySnapshotPendingXmin,
+ running->oldestRunningXid);
}
/*
Thanks for the patch Simon, but unfortunately it does not resolve the issue
I am seeing. The standby still refuses to finish starting up until long
after all clients have disconnected from the primary (>10 minutes). I do
see your new log statement on startup, but only once - it does not repeat.
Is there any way for me to see what the oldest xid on the standby is via
controldata or something like that? The standby does stream to keep up with
the primary while the primary has load, and then it becomes idle when the
primary becomes idle (when I kill all the connections)....so it appears to
be current...but it just doesn't finish starting up
I'm not sure if it's relevant, but after it has sat idle for a couple
minutes I start seeing these statements in the log (with the same offset
every time):
DEBUG: skipping restartpoint, already performed at 9/95000020
On Thu, Oct 27, 2011 at 7:26 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
Show quoted text
Chris Redekop's recent report of slow startup for Hot Standby has made
me revisit the code there.Although there isn't a bug, there is a missed opportunity for starting
up faster which could be the source of Chris' annoyance.The following patch allows a faster startup in some circumstances.
The patch also alters the log levels for messages and gives a single
simple message for this situation. The log will now sayLOG: recovery snapshot waiting for non-overflowed snapshot or until
oldest active xid on standby is at least %u (now %u)
...multiple times until snapshot non-overflowed or xid reached...whereas before the first LOG message shown was
LOG: consistent state delayed because recovery snapshot incomplete
and only later, at DEBUG2 do you see
LOG: recovery snapshot waiting for %u oldest active xid on standby is %u
...multiple times until xid reached...Comments please.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Oct 27, 2011 at 5:26 PM, Chris Redekop <chris@replicon.com> wrote:
Thanks for the patch Simon, but unfortunately it does not resolve the issue
I am seeing. The standby still refuses to finish starting up until long
after all clients have disconnected from the primary (>10 minutes). I do
see your new log statement on startup, but only once - it does not repeat.
Is there any way for me to see what the oldest xid on the standby is via
controldata or something like that? The standby does stream to keep up with
the primary while the primary has load, and then it becomes idle when the
primary becomes idle (when I kill all the connections)....so it appears to
be current...but it just doesn't finish starting up
I'm not sure if it's relevant, but after it has sat idle for a couple
minutes I start seeing these statements in the log (with the same offset
every time):
DEBUG: skipping restartpoint, already performed at 9/95000020
OK, so it looks like there are 2 opportunities to improve, not just one.
Try this.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
faster_hot_standby_startup_withsubxacts.v2.patchapplication/octet-stream; name=faster_hot_standby_startup_withsubxacts.v2.patchDownload
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bbc5f44..eb6d133 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -475,8 +475,9 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
/*
* If our initial RunningTransactionsData had an overflowed snapshot then
- * we knew we were missing some subxids from our snapshot. We can use this
- * data as an initial snapshot, but we cannot yet mark it valid. We know
+ * we knew we were missing some subxids from our snapshot. If we continue
+ * to see overflowed snapshots then we might never be able to start up,
+ * so we make another test to see if our snapshot is now valid. We know
* that the missing subxids are equal to or earlier than nextXid. After we
* initialise we continue to apply changes during recovery, so once the
* oldestRunningXid is later than the nextXid from the initial snapshot we
@@ -485,21 +486,33 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
*/
if (standbyState == STANDBY_SNAPSHOT_PENDING)
{
- if (TransactionIdPrecedes(standbySnapshotPendingXmin,
- running->oldestRunningXid))
+ /*
+ * If the snapshot isn't overflowed or if its empty we can
+ * reset our pending state and use this snapshot instead.
+ */
+ if (!running->subxid_overflow || running->xcnt == 0)
{
- standbyState = STANDBY_SNAPSHOT_READY;
- elog(trace_recovery(DEBUG2),
- "running xact data now proven complete");
- elog(trace_recovery(DEBUG2),
- "recovery snapshots are now enabled");
+ standbyState = STANDBY_INITIALIZED;
}
else
- elog(trace_recovery(DEBUG2),
- "recovery snapshot waiting for %u oldest active xid on standby is %u",
- standbySnapshotPendingXmin,
- running->oldestRunningXid);
- return;
+ {
+ if (TransactionIdPrecedes(standbySnapshotPendingXmin,
+ running->oldestRunningXid))
+ {
+ standbyState = STANDBY_SNAPSHOT_READY;
+ elog(trace_recovery(DEBUG2),
+ "running xact data now proven complete");
+ elog(LOG,
+ "recovery snapshots are now enabled");
+ }
+ else
+ elog(LOG,
+ "recovery snapshot waiting for non-overflowed snapshot or "
+ "until oldest active xid on standby is at least %u (now %u)",
+ standbySnapshotPendingXmin,
+ running->oldestRunningXid);
+ return;
+ }
}
Assert(standbyState == STANDBY_INITIALIZED);
@@ -604,7 +617,6 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
standbyState = STANDBY_SNAPSHOT_READY;
standbySnapshotPendingXmin = InvalidTransactionId;
- procArray->lastOverflowedXid = InvalidTransactionId;
}
/*
@@ -627,13 +639,15 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
LWLockRelease(ProcArrayLock);
- elog(trace_recovery(DEBUG2), "running transaction data initialized");
KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
if (standbyState == STANDBY_SNAPSHOT_READY)
- elog(trace_recovery(DEBUG2), "recovery snapshots are now enabled");
+ elog(LOG, "recovery snapshots are now enabled");
else
- ereport(LOG,
- (errmsg("consistent state delayed because recovery snapshot incomplete")));
+ elog(LOG,
+ "recovery snapshot waiting for non-overflowed snapshot or "
+ "until oldest active xid on standby is at least %u (now %u)",
+ standbySnapshotPendingXmin,
+ running->oldestRunningXid);
}
/*
hrmz, still basically the same behaviour. I think it might be a *little*
better with this patch. Before when under load it would start up quickly
maybe 2 or 3 times out of 10 attempts....with this patch it might be up to 4
or 5 times out of 10...ish...or maybe it was just fluke *shrug*. I'm still
only seeing your log statement a single time (I'm running at debug2). I
have discovered something though - when the standby is in this state if I
force a checkpoint on the primary then the standby comes right up. Is there
anything I check or try for you to help figure this out?....or is it
actually as designed that it could take 10-ish minutes to start up even
after all clients have disconnected from the primary?
On Thu, Oct 27, 2011 at 11:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
Show quoted text
On Thu, Oct 27, 2011 at 5:26 PM, Chris Redekop <chris@replicon.com> wrote:
Thanks for the patch Simon, but unfortunately it does not resolve the
issue
I am seeing. The standby still refuses to finish starting up until long
after all clients have disconnected from the primary (>10 minutes). I do
see your new log statement on startup, but only once - it does notrepeat.
Is there any way for me to see what the oldest xid on the standby is
via
controldata or something like that? The standby does stream to keep up
with
the primary while the primary has load, and then it becomes idle when the
primary becomes idle (when I kill all the connections)....so it appearsto
be current...but it just doesn't finish starting up
I'm not sure if it's relevant, but after it has sat idle for a couple
minutes I start seeing these statements in the log (with the same offset
every time):
DEBUG: skipping restartpoint, already performed at 9/95000020OK, so it looks like there are 2 opportunities to improve, not just one.
Try this.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Oct 27, 2011 at 10:09 PM, Chris Redekop <chris@replicon.com> wrote:
hrmz, still basically the same behaviour. I think it might be a *little*
better with this patch. Before when under load it would start up quickly
maybe 2 or 3 times out of 10 attempts....with this patch it might be up to 4
or 5 times out of 10...ish...or maybe it was just fluke *shrug*. I'm still
only seeing your log statement a single time (I'm running at debug2). I
have discovered something though - when the standby is in this state if I
force a checkpoint on the primary then the standby comes right up. Is there
anything I check or try for you to help figure this out?....or is it
actually as designed that it could take 10-ish minutes to start up even
after all clients have disconnected from the primary?
Thanks for testing. The improvements cover specific cases, so its not
subject to chance; its not a performance patch.
It's not "designed" to act the way you describe, but it does.
The reason this occurs is that you have a transaction heavy workload
with occasional periods of complete quiet and a base backup time that
is much less than checkpoint_timeout. If your base backup was slower
the checkpoint would have hit naturally before recovery had reached a
consistent state. Which seems fairly atypical. I guess you're doing
this on a test system.
It seems cheap to add in a call to LogStandbySnapshot() after each
call to pg_stop_backup().
Does anyone think this case is worth adding code for? Seems like one
more thing to break.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Sorry..."designed" was poor choice of words, I meant "not unexpected".
Doing the checkpoint right after pg_stop_backup() looks like it will work
perfectly for me, so thanks for all your help!
On a side note I am sporadically seeing another error on hotstandby startup.
I'm not terribly concerned about it as it is pretty rare and it will work
on a retry so it's not a big deal. The error is "FATAL: out-of-order XID
insertion in KnownAssignedXids". If you think it might be a bug and are
interested in hunting it down let me know and I'll help any way I can...but
if you're not too worried about it then neither am I :)
On Thu, Oct 27, 2011 at 4:55 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Show quoted text
On Thu, Oct 27, 2011 at 10:09 PM, Chris Redekop <chris@replicon.com>
wrote:hrmz, still basically the same behaviour. I think it might be a *little*
better with this patch. Before when under load it would start up quickly
maybe 2 or 3 times out of 10 attempts....with this patch it might be upto 4
or 5 times out of 10...ish...or maybe it was just fluke *shrug*. I'm
still
only seeing your log statement a single time (I'm running at debug2). I
have discovered something though - when the standby is in this state if I
force a checkpoint on the primary then the standby comes right up. Isthere
anything I check or try for you to help figure this out?....or is it
actually as designed that it could take 10-ish minutes to start up even
after all clients have disconnected from the primary?Thanks for testing. The improvements cover specific cases, so its not
subject to chance; its not a performance patch.It's not "designed" to act the way you describe, but it does.
The reason this occurs is that you have a transaction heavy workload
with occasional periods of complete quiet and a base backup time that
is much less than checkpoint_timeout. If your base backup was slower
the checkpoint would have hit naturally before recovery had reached a
consistent state. Which seems fairly atypical. I guess you're doing
this on a test system.It seems cheap to add in a call to LogStandbySnapshot() after each
call to pg_stop_backup().Does anyone think this case is worth adding code for? Seems like one
more thing to break.--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Oct 27, 2011 at 6:55 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
It seems cheap to add in a call to LogStandbySnapshot() after each
call to pg_stop_backup().Does anyone think this case is worth adding code for? Seems like one
more thing to break.
Why at that particular time?
It would maybe nice if the master could notice when it has a plausible
(non-overflowed) snapshot and log it then. But that might be more
code than the problem is worth.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Oct 28, 2011 at 3:42 AM, Chris Redekop <chris@replicon.com> wrote:
On a side note I am sporadically seeing another error on hotstandby startup.
I'm not terribly concerned about it as it is pretty rare and it will work
on a retry so it's not a big deal. The error is "FATAL: out-of-order XID
insertion in KnownAssignedXids". If you think it might be a bug and are
interested in hunting it down let me know and I'll help any way I can...but
if you're not too worried about it then neither am I :)
I'd be interested to see further details of this if you see it again,
or have access to previous logs.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
oops....reply-to-all
---------- Forwarded message ----------
From: Chris Redekop <chris@replicon.com>
Date: Wed, Nov 2, 2011 at 8:41 AM
Subject: Re: [HACKERS] Hot Standby startup with overflowed snapshots
To: Simon Riggs <simon@2ndquadrant.com>
Sure, I've got quite a few logs lying around - I've attached 3 of 'em...let
me know if there are any specific things you'd like me to do or look for
next time it happens....
On Wed, Nov 2, 2011 at 2:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
Show quoted text
On Fri, Oct 28, 2011 at 3:42 AM, Chris Redekop <chris@replicon.com> wrote:
On a side note I am sporadically seeing another error on hotstandby
startup.
I'm not terribly concerned about it as it is pretty rare and it will
work
on a retry so it's not a big deal. The error is "FATAL: out-of-order
XID
insertion in KnownAssignedXids". If you think it might be a bug and are
interested in hunting it down let me know and I'll help any way Ican...but
if you're not too worried about it then neither am I :)
I'd be interested to see further details of this if you see it again,
or have access to previous logs.--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services