Re: [DOCS] max_worker_processes on the standby

Started by Alvaro Herreraover 10 years ago26 messages
#1Alvaro Herrera
alvherre@2ndquadrant.com
1 attachment(s)

oonishitk@nttdata.co.jp wrote:

The below error messages were shown in standby server log:

FATAL: could not access status of transaction 9009
DETAIL: Could not read from file "pg_commit_ts/0000" at offset 90112: Success.
CONTEXT: xlog redo Transaction/COMMIT: 2015-09-30 15:52:41.924141+09
LOG: startup process (PID 23199) exited with exit code 1
LOG: terminating any other active server processes

Before this FATAL, there were some INFO but annoying messages:

LOG: file "pg_commit_ts/0000" doesn't exist, reading as zeroes

Here's a patch.

I went over the commit_ts.c code a few more times. I eventually
realized that we were trying to update the value of the GUC, which is a
rather unreliable thing to do; this was made worse by the fact that we
were updating it in one process only.

I thought it was better to have a separate boolean flag, affecting the
recovery process only, which is set at startup time or when the
XLOG_PARAMETER_CHANGE message is received. The module is enabled if
either the GUC is set or we see that the master has the module enabled.
This only enables it as far as replaying xlog records though: if you use
the SQL interface, it will still raise an error that you cannot read
values unless the GUC is enabled. This seems fine to me.

A curious but benign effect of this patch is that if you have the module
disabled in the master but enabled in the standby, you can actually
query the commit times in the standby, and they will correspond to
whatever the master used in the commit xlog record.

Other small changes:

- Moved some code out of xlog_redo into a new public commit_ts.c
routine; made ActivateCommitTs and Deactivate* statics.

- In the previous commit I added an assert that we're not writing xlog
and replaying xlog at the same time. This is pointless because xlog.c
already complains about that, so this commit takes it out again.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

committs.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 78090c5..9613783 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -93,6 +93,14 @@ CommitTimestampShared *commitTsShared;
 /* GUC variable */
 bool		track_commit_timestamp;
 
+/*
+ * When this is set, commit_ts is force-enabled during recovery.  This is so
+ * that a standby can replay WAL records coming from a master with the setting
+ * enabled.  (Note that this doesn't enable SQL access to the data; it's
+ * effectively write-only until the GUC itself is enabled.)
+ */
+static bool		enable_during_recovery;
+
 static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
 					 TransactionId *subxids, TimestampTz ts,
 					 RepOriginId nodeid, int pageno);
@@ -100,6 +108,8 @@ static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
 						 RepOriginId nodeid, int slotno);
 static int	ZeroCommitTsPage(int pageno, bool writeXlog);
 static bool CommitTsPagePrecedes(int page1, int page2);
+static void ActivateCommitTs(void);
+static void DeactivateCommitTs(bool do_wal);
 static void WriteZeroPageXlogRec(int pageno);
 static void WriteTruncateXlogRec(int pageno);
 static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
@@ -122,10 +132,6 @@ static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
  * subtrans implementation changes in the future, we might want to revisit the
  * decision of storing timestamp info for each subxid.
  *
- * The replaying_xlog parameter indicates whether the module should execute
- * its write even if the feature is nominally disabled, because we're replaying
- * a record generated from a master where the feature is enabled.
- *
  * The write_xlog parameter tells us whether to include an XLog record of this
  * or not.  Normally, this is called from transaction commit routines (both
  * normal and prepared) and the information will be stored in the transaction
@@ -136,18 +142,17 @@ static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
 void
 TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
 							   TransactionId *subxids, TimestampTz timestamp,
-							   RepOriginId nodeid,
-							   bool replaying_xlog, bool write_xlog)
+							   RepOriginId nodeid, bool write_xlog)
 {
 	int			i;
 	TransactionId headxid;
 	TransactionId newestXact;
 
-	/* We'd better not try to write xlog during replay */
-	Assert(!(write_xlog && replaying_xlog));
-
-	/* No-op if feature not enabled, unless replaying WAL */
-	if (!track_commit_timestamp && !replaying_xlog)
+	/*
+	 * No-op if the module is not enabled, but allow writes in a standby
+	 * during recovery.
+	 */
+	if (!track_commit_timestamp && !enable_during_recovery)
 		return;
 
 	/*
@@ -534,32 +539,26 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
 /*
  * This must be called ONCE during postmaster or standalone-backend startup,
  * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ *
+ * Caller may choose to enable the feature even when it is turned off in the
+ * configuration.
  */
 void
-StartupCommitTs(void)
+StartupCommitTs(bool force_enable)
 {
-	TransactionId xid = ShmemVariableCache->nextXid;
-	int			pageno = TransactionIdToCTsPage(xid);
-
-	if (track_commit_timestamp)
-	{
-		ActivateCommitTs();
-		return;
-	}
-
-	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
-
 	/*
-	 * Initialize our idea of the latest page number.
+	 * If the module is not enabled, there's nothing to do here.  The module
+	 * could still be activated from elsewhere.
 	 */
-	CommitTsCtl->shared->latest_page_number = pageno;
+	if (!track_commit_timestamp && !force_enable)
+		return;
 
-	LWLockRelease(CommitTsControlLock);
+	ActivateCommitTs();
 }
 
 /*
  * This must be called ONCE during postmaster or standalone-backend startup,
- * when commit timestamp is enabled, after recovery has finished.
+ * after recovery has finished.
  */
 void
 CompleteCommitTsInitialization(void)
@@ -569,6 +568,31 @@ CompleteCommitTsInitialization(void)
 }
 
 /*
+ * Activate or deactivate CommitTs' upon reception of a XLOG_PARAMETER_CHANGE
+ * XLog record in a standby.
+ */
+void
+CommitTsParameterChange(bool newvalue, bool oldvalue)
+{
+	/*
+	 * If the commit_ts module is disabled in this server and we get word from
+	 * the master server that it is enabled there, activate it so that we can
+	 * replay future WAL records involving it; also mark it as active on
+	 * pg_control.  If the old value was already set, we already did this, so
+	 * don't do anything.
+	 *
+	 * If the module is disabled in the master, disable it here too.
+	 */
+	if (newvalue)
+	{
+		if (!track_commit_timestamp && !oldvalue)
+			ActivateCommitTs();
+	}
+	else if (oldvalue)
+		DeactivateCommitTs(false);
+}
+
+/*
  * Activate this module whenever necessary.
  *		This must happen during postmaster or standalong-backend startup,
  *		or during WAL replay anytime the track_commit_timestamp setting is
@@ -584,12 +608,13 @@ CompleteCommitTsInitialization(void)
  * running with this module disabled for a while and thus might have skipped
  * the normal creation point.
  */
-void
+static void
 ActivateCommitTs(void)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
 
+
 	/*
 	 * Re-Initialize our idea of the latest page number.
 	 */
@@ -629,6 +654,9 @@ ActivateCommitTs(void)
 		Assert(!CommitTsCtl->shared->page_dirty[slotno]);
 		LWLockRelease(CommitTsControlLock);
 	}
+
+	/* We can now replay xlog records from this module */
+	enable_during_recovery = true;
 }
 
 /*
@@ -641,7 +669,7 @@ ActivateCommitTs(void)
  * Resets CommitTs into invalid state to make sure we don't hand back
  * possibly-invalid data; also removes segments of old data.
  */
-void
+static void
 DeactivateCommitTs(bool do_wal)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
@@ -660,6 +688,9 @@ DeactivateCommitTs(bool do_wal)
 	LWLockRelease(CommitTsLock);
 
 	TruncateCommitTs(ReadNewTransactionId(), do_wal);
+
+	/* No longer enabled on recovery */
+	enable_during_recovery = false;
 }
 
 /*
@@ -699,7 +730,7 @@ ExtendCommitTs(TransactionId newestXact)
 	int			pageno;
 
 	/* nothing to do if module not enabled */
-	if (!track_commit_timestamp)
+	if (!track_commit_timestamp || !enable_during_recovery)
 		return;
 
 	/*
@@ -916,8 +947,7 @@ commit_ts_redo(XLogReaderState *record)
 			subxids = NULL;
 
 		TransactionTreeSetCommitTsData(setts->mainxid, nsubxids, subxids,
-									   setts->timestamp, setts->nodeid, false,
-									   true);
+									   setts->timestamp, setts->nodeid, true);
 		if (subxids)
 			pfree(subxids);
 	}
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e005cc5..8c47e0f 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2131,7 +2131,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
 
 	TransactionTreeSetCommitTsData(xid, nchildren, children,
 								   replorigin_session_origin_timestamp,
-								   replorigin_session_origin, false, false);
+								   replorigin_session_origin, false);
 
 	/*
 	 * We don't currently try to sleep before flush here ... nor is there any
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8f56a44..e8aafba 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1237,8 +1237,7 @@ RecordTransactionCommit(void)
 
 		TransactionTreeSetCommitTsData(xid, nchildren, children,
 									   replorigin_session_origin_timestamp,
-									   replorigin_session_origin,
-									   false, false);
+									   replorigin_session_origin, false);
 	}
 
 	/*
@@ -5333,8 +5332,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
 
 	/* Set the transaction commit timestamp and metadata */
 	TransactionTreeSetCommitTsData(xid, parsed->nsubxacts, parsed->subxacts,
-								   commit_time, origin_id,
-								   true, false);
+								   commit_time, origin_id, false);
 
 	if (standbyState == STANDBY_DISABLED)
 	{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0266d61..08d1682 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6567,7 +6567,7 @@ StartupXLOG(void)
 			 * maintained during recovery and need not be started yet.
 			 */
 			StartupCLOG();
-			StartupCommitTs();
+			StartupCommitTs(ControlFile->track_commit_timestamp);
 			StartupSUBTRANS(oldestActiveXID);
 
 			/*
@@ -7336,7 +7336,7 @@ StartupXLOG(void)
 	if (standbyState == STANDBY_DISABLED)
 	{
 		StartupCLOG();
-		StartupCommitTs();
+		StartupCommitTs(false);
 		StartupSUBTRANS(oldestActiveXID);
 	}
 
@@ -9456,25 +9456,9 @@ xlog_redo(XLogReaderState *record)
 			ControlFile->minRecoveryPointTLI = ThisTimeLineID;
 		}
 
-		/*
-		 * Update the commit timestamp tracking. If there was a change it
-		 * needs to be activated or deactivated accordingly.
-		 */
-		if (track_commit_timestamp != xlrec.track_commit_timestamp)
-		{
-			track_commit_timestamp = xlrec.track_commit_timestamp;
-			ControlFile->track_commit_timestamp = track_commit_timestamp;
-			if (track_commit_timestamp)
-				ActivateCommitTs();
-			else
-
-				/*
-				 * We can't create a new WAL record here, but that's OK as
-				 * master did the WAL logging already and we will replay the
-				 * record from master in case we crash.
-				 */
-				DeactivateCommitTs(false);
-		}
+		CommitTsParameterChange(xlrec.track_commit_timestamp,
+								ControlFile->track_commit_timestamp);
+		ControlFile->track_commit_timestamp = xlrec.track_commit_timestamp;
 
 		UpdateControlFile();
 		LWLockRelease(ControlFileLock);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index dc865d1..1b95b58 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -24,8 +24,7 @@ extern bool check_track_commit_timestamp(bool *newval, void **extra,
 
 extern void TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
 							   TransactionId *subxids, TimestampTz timestamp,
-							   RepOriginId nodeid,
-							   bool replaying_xlog, bool write_xlog);
+							   RepOriginId nodeid, bool write_xlog);
 extern bool TransactionIdGetCommitTsData(TransactionId xid,
 							 TimestampTz *ts, RepOriginId *nodeid);
 extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
@@ -35,9 +34,8 @@ extern Size CommitTsShmemBuffers(void);
 extern Size CommitTsShmemSize(void);
 extern void CommitTsShmemInit(void);
 extern void BootStrapCommitTs(void);
-extern void StartupCommitTs(void);
-extern void ActivateCommitTs(void);
-extern void DeactivateCommitTs(bool do_wal);
+extern void StartupCommitTs(bool force_enable);
+extern void CommitTsParameterChange(bool xlrecvalue, bool pgcontrolvalue);
 extern void CompleteCommitTsInitialization(void);
 extern void ShutdownCommitTs(void);
 extern void CheckPointCommitTs(void);
#2Noname
oonishitk@nttdata.co.jp
In reply to: Alvaro Herrera (#1)
Re: max_worker_processes on the standby

Here's a patch.

Thank you!
With this patch, the standby server down disappears in my environment.

Regards,

========
Takashi Ohnishi
oonishitk@nttdata.co.jp

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
Sent: Thursday, October 01, 2015 7:48 AM
To: Fujii Masao <masao.fujii@gmail.com>; SPS 大西 高史(三技術) <oonishitk@nttdata.co.jp>
Cc: pgsql-docs <pgsql-docs@postgresql.org>; pgsql-hackers@postgresql.org
Subject: Re: [DOCS] max_worker_processes on the standby

oonishitk@nttdata.co.jp wrote:

The below error messages were shown in standby server log:

FATAL: could not access status of transaction 9009
DETAIL: Could not read from file "pg_commit_ts/0000" at offset 90112: Success.
CONTEXT: xlog redo Transaction/COMMIT: 2015-09-30 15:52:41.924141+09
LOG: startup process (PID 23199) exited with exit code 1
LOG: terminating any other active server processes

Before this FATAL, there were some INFO but annoying messages:

LOG: file "pg_commit_ts/0000" doesn't exist, reading as zeroes

Here's a patch.

I went over the commit_ts.c code a few more times. I eventually realized that we were trying to update the value of the GUC, which is a rather unreliable thing to do; this was made worse by the fact that we were updating it in one process only.

I thought it was better to have a separate boolean flag, affecting the recovery process only, which is set at startup time or when the XLOG_PARAMETER_CHANGE message is received. The module is enabled if either the GUC is set or we see that the master has the module enabled.
This only enables it as far as replaying xlog records though: if you use the SQL interface, it will still raise an error that you cannot read values unless the GUC is enabled. This seems fine to me.

A curious but benign effect of this patch is that if you have the module disabled in the master but enabled in the standby, you can actually query the commit times in the standby, and they will correspond to whatever the master used in the commit xlog record.

Other small changes:

- Moved some code out of xlog_redo into a new public commit_ts.c routine; made ActivateCommitTs and Deactivate* statics.

- In the previous commit I added an assert that we're not writing xlog and replaying xlog at the same time. This is pointless because xlog.c already complains about that, so this commit takes it out again.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#3Fujii Masao
masao.fujii@gmail.com
In reply to: Alvaro Herrera (#1)
Re: max_worker_processes on the standby

On Thu, Oct 1, 2015 at 7:48 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

oonishitk@nttdata.co.jp wrote:

The below error messages were shown in standby server log:

FATAL: could not access status of transaction 9009
DETAIL: Could not read from file "pg_commit_ts/0000" at offset 90112: Success.
CONTEXT: xlog redo Transaction/COMMIT: 2015-09-30 15:52:41.924141+09
LOG: startup process (PID 23199) exited with exit code 1
LOG: terminating any other active server processes

Before this FATAL, there were some INFO but annoying messages:

LOG: file "pg_commit_ts/0000" doesn't exist, reading as zeroes

Here's a patch.

I've not read the patch yet, but the patched server with track_commit_timestamp
enabled caused the following PANIC error when I ran pgbench.

PANIC: could not access status of transaction 2457
DETAIL: Could not read from file "pg_commit_ts/0000" at offset 24576: Success.
STATEMENT: END;

The procedure to reproduce the PANIC error is,

1. Enable track_commit_timestamp
2. Start up the server
3. Run pgbench -i -s10
4. Run pgbench -j 4 -c 4 -T 30

Regards,

--
Fujii Masao

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Fujii Masao (#3)
Re: max_worker_processes on the standby

Fujii Masao wrote:

I've not read the patch yet, but the patched server with track_commit_timestamp
enabled caused the following PANIC error when I ran pgbench.

Ah, that was a stupid typo: I used || instead of &&. Fixed that.

I also changed DeactivateCommitTs() so that it removes all slru segments
instead of leaving the last one around (which is what SimpleLruTruncate
was doing). This was noticeable when you ran a server with the feature
enabled (which created some files), then disabled it (which removed all
but the last) and ran for some more xacts; then enabled it again (which
created a new file, far ahead from the previously existing one). Since
the feature has enough protections that it doesn't have a problem with
no files at all being present, this works correctly. Note no
wal-logging of this operation: it's not necessary AFAICS because the
same deactivation routine would be called again in recovery and in
XLOG_PARAMETER_CHANGE, so it should be safe.

And pushed.

Thanks!

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#5Fujii Masao
masao.fujii@gmail.com
In reply to: Alvaro Herrera (#4)

On Fri, Oct 2, 2015 at 3:12 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Fujii Masao wrote:

I've not read the patch yet, but the patched server with track_commit_timestamp
enabled caused the following PANIC error when I ran pgbench.

Ah, that was a stupid typo: I used || instead of &&. Fixed that.

I also changed DeactivateCommitTs() so that it removes all slru segments
instead of leaving the last one around (which is what SimpleLruTruncate
was doing). This was noticeable when you ran a server with the feature
enabled (which created some files), then disabled it (which removed all
but the last) and ran for some more xacts; then enabled it again (which
created a new file, far ahead from the previously existing one). Since
the feature has enough protections that it doesn't have a problem with
no files at all being present, this works correctly. Note no
wal-logging of this operation: it's not necessary AFAICS because the
same deactivation routine would be called again in recovery and in
XLOG_PARAMETER_CHANGE, so it should be safe.

What happens if pg_xact_commit_timestamp() is called in standby after
track_commit_timestamp is disabled in master, DeactivateCommitTs() is
called and all commit_ts files are removed in standby? I tried that case
and got the following assertion failure.

TRAP: FailedAssertion("!(((oldestCommitTs) != ((TransactionId) 0)) ==
((newestCommitTs) != ((TransactionId) 0)))", File: "commit_ts.c",
Line: 307)

LOG: server process (PID 11160) was terminated by signal 6: Aborted
DETAIL: Failed process was running: select num,
pg_xact_commit_timestamp(num::text::xid) from generate_series(1750,
1800) num

The steps to reproduce the problem is

1. Set up replication with track_commit_timestamp enabled.
2. Run several write transactions.
3. Disable track_commit_timestamp only in master and wait for
XLOG_PARAMETER_CHANGE record to be replayed in standby.
4. Run pg_xact_commit_timestamp() in standby.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Fujii Masao (#5)
1 attachment(s)
Re: max_worker_processes on the standby

Fujii Masao wrote:

What happens if pg_xact_commit_timestamp() is called in standby after
track_commit_timestamp is disabled in master, DeactivateCommitTs() is
called and all commit_ts files are removed in standby? I tried that case
and got the following assertion failure.

Ah. So the standby needs to keep the module activated if it's enabled
locally, even when it receives a message that the master turned it off.
Here's a patch.

Thanks for your continued testing!

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

committs.patchtext/x-diff; charset=us-asciiDownload
commit 7441f88b746b7522f1714ed9fec95c3c4fe1dacb
Author:     Alvaro Herrera <alvherre@alvh.no-ip.org>
AuthorDate: Fri Oct 2 11:39:44 2015 -0300
CommitDate: Fri Oct 2 11:39:44 2015 -0300

    Don't disable commit_ts in standby if enabled locally
    
    Bug noticed by Fujii Masao

diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 79ca04a..24b8291 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -583,14 +583,15 @@ CommitTsParameterChange(bool newvalue, bool oldvalue)
 	 * pg_control.  If the old value was already set, we already did this, so
 	 * don't do anything.
 	 *
-	 * If the module is disabled in the master, disable it here too.
+	 * If the module is disabled in the master, disable it here too, unless
+	 * the module is enabled locally.
 	 */
 	if (newvalue)
 	{
 		if (!track_commit_timestamp && !oldvalue)
 			ActivateCommitTs();
 	}
-	else if (oldvalue)
+	else if (!track_commit_timestamp && oldvalue)
 		DeactivateCommitTs(false);
 }
 
#7Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#6)
Re: max_worker_processes on the standby

On Fri, Oct 2, 2015 at 10:58 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Fujii Masao wrote:

What happens if pg_xact_commit_timestamp() is called in standby after
track_commit_timestamp is disabled in master, DeactivateCommitTs() is
called and all commit_ts files are removed in standby? I tried that case
and got the following assertion failure.

Ah. So the standby needs to keep the module activated if it's enabled
locally, even when it receives a message that the master turned it off.
Here's a patch.

The standby can have the feature enabled even though the master has it
disabled? That seems like it can only lead to heartache.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#7)
Re: max_worker_processes on the standby

Robert Haas wrote:

The standby can have the feature enabled even though the master has it
disabled? That seems like it can only lead to heartache.

Can you elaborate?

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#9Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#8)
Re: max_worker_processes on the standby

On Fri, Oct 2, 2015 at 2:59 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Robert Haas wrote:

The standby can have the feature enabled even though the master has it
disabled? That seems like it can only lead to heartache.

Can you elaborate?

Sort of. Our rule up until now has always been that the standby is an
exact copy of the master. I suspect deviating from that behavior will
introduce bugs. I suspect having the standby make data changes that
aren't WAL-logged will introduce bugs; not to be unkind, but that
certainly seems like a lesson to take from what happened with
multixacts.

I haven't looked at this code well enough to guess specifically what
will go wrong. But consider people turning the feature on and off
repeatedly on the master, and separately on the standby, combined with
crashes on the standby that restart replay from earlier points
(possibly with settings that have changed in the meantime). Are we
really sure that we're never going to end up with the wrong files, or
inconsistent ones, on the standby? I have a really hard time
believing that's going to work out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#10Fujii Masao
masao.fujii@gmail.com
In reply to: Alvaro Herrera (#6)
Re: max_worker_processes on the standby

On Fri, Oct 2, 2015 at 11:58 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Fujii Masao wrote:

What happens if pg_xact_commit_timestamp() is called in standby after
track_commit_timestamp is disabled in master, DeactivateCommitTs() is
called and all commit_ts files are removed in standby? I tried that case
and got the following assertion failure.

Ah. So the standby needs to keep the module activated if it's enabled
locally, even when it receives a message that the master turned it off.
Here's a patch.

I'm afraid that this behavior might confuse the users.

Please imagine the following scenario.

1. start up the server with track_commit_timestamp disabled
2. run several transactions
3. shut down the server with immediate mode
4. restart the server with track_commit_timestamp enabled
5. run "SELECT pg_last_committed_xact()"
6. run "SELECT pg_xact_commit_timestamp(xid) FROM pg_last_committed_xact()"
7. restart the server
8. run "SELECT pg_last_committed_xact()"

Firstly, in #5, pg_last_committed_xact() returns the XID and
timestamp of the last transaction which was executed in #2
(i.e., while track_commit_timestamp was disabled).
This is confusing. I think that both pg_last_committed_xact()
and pg_xact_commit_timestamp() should return only the transaction
which was executed with track_commit_timestamp enabled.

Secondly, SELECT query in #6 returns NULL. This means that
pg_xact_commit_timestamp() may not handle the transaction
which pg_last_committed_xact() handles. This is also confusing.

Finally, in #8, pg_last_committed_xact() returns NULL while
it returned non-NULL before the restart. This is also confusing.

Regards,

--
Fujii Masao

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#11Petr Jelinek
petr@2ndquadrant.com
In reply to: Robert Haas (#9)
1 attachment(s)
Re: [HACKERS] max_worker_processes on the standby

On 2015-10-02 22:02, Robert Haas wrote:

On Fri, Oct 2, 2015 at 2:59 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Robert Haas wrote:

The standby can have the feature enabled even though the master has it
disabled? That seems like it can only lead to heartache.

Can you elaborate?

Sort of. Our rule up until now has always been that the standby is an
exact copy of the master. I suspect deviating from that behavior will
introduce bugs. I suspect having the standby make data changes that
aren't WAL-logged will introduce bugs; not to be unkind, but that
certainly seems like a lesson to take from what happened with
multixacts.

I agree with that sentiment.

Attached patch adds variable to the shmem which is used for module
activation tracking - set to true in ActiveCommitTs() and false in
DeactivateCommitTs(). All the checks inside the commit_ts code were
changed to use this new variable. I also removed the static variable
Alvaro added in previous commit because it's not needed anymore.

The patch also does full cleanup of the shmem state in
DeactivateCommitTs() so that standby does not have stale last committed
transaction info after enable/disable/enable cycle on primary

I also removed no longer used do_wal parameters in couple of functions.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

committs-activation-fixes.patchapplication/x-patch; name=committs-activation-fixes.patchDownload
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 24b8291..8af8dbe 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -80,11 +80,17 @@ static SlruCtlData CommitTsCtlData;
 /*
  * We keep a cache of the last value set in shared memory.  This is protected
  * by CommitTsLock.
+ *
+ * This is also good place to keep the activation status.  We need to keep
+ * the activation status separate from the GUC bellow because the standby needs
+ * to activate the module if the primary has it active independently of what
+ * track_commit_timestamp setting is on standby.
  */
 typedef struct CommitTimestampShared
 {
 	TransactionId xidLastCommit;
 	CommitTimestampEntry dataLastCommit;
+	bool	commitTsActive;
 } CommitTimestampShared;
 
 CommitTimestampShared *commitTsShared;
@@ -93,14 +99,6 @@ CommitTimestampShared *commitTsShared;
 /* GUC variable */
 bool		track_commit_timestamp;
 
-/*
- * When this is set, commit_ts is force-enabled during recovery.  This is so
- * that a standby can replay WAL records coming from a master with the setting
- * enabled.  (Note that this doesn't enable SQL access to the data; it's
- * effectively write-only until the GUC itself is enabled.)
- */
-static bool		enable_during_recovery;
-
 static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
 					 TransactionId *subxids, TimestampTz ts,
 					 RepOriginId nodeid, int pageno);
@@ -109,7 +107,7 @@ static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
 static int	ZeroCommitTsPage(int pageno, bool writeXlog);
 static bool CommitTsPagePrecedes(int page1, int page2);
 static void ActivateCommitTs(void);
-static void DeactivateCommitTs(bool do_wal);
+static void DeactivateCommitTs(void);
 static void WriteZeroPageXlogRec(int pageno);
 static void WriteTruncateXlogRec(int pageno);
 static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
@@ -148,11 +146,8 @@ TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
 	TransactionId headxid;
 	TransactionId newestXact;
 
-	/*
-	 * No-op if the module is not enabled, but allow writes in a standby
-	 * during recovery.
-	 */
-	if (!track_commit_timestamp && !enable_during_recovery)
+	/* No-op if the module is not active. */
+	if (!commitTsShared->commitTsActive)
 		return;
 
 	/*
@@ -284,7 +279,7 @@ TransactionIdGetCommitTsData(TransactionId xid, TimestampTz *ts,
 	TransactionId newestCommitTs;
 
 	/* Error if module not enabled */
-	if (!track_commit_timestamp)
+	if (!commitTsShared->commitTsActive)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("could not get commit timestamp data"),
@@ -367,7 +362,7 @@ GetLatestCommitTsData(TimestampTz *ts, RepOriginId *nodeid)
 	TransactionId xid;
 
 	/* Error if module not enabled */
-	if (!track_commit_timestamp)
+	if (!commitTsShared->commitTsActive)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("could not get commit timestamp data"),
@@ -493,6 +488,7 @@ CommitTsShmemInit(void)
 		commitTsShared->xidLastCommit = InvalidTransactionId;
 		TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
 		commitTsShared->dataLastCommit.nodeid = InvalidRepOriginId;
+		commitTsShared->commitTsActive = false;
 	}
 	else
 		Assert(found);
@@ -566,7 +562,7 @@ CompleteCommitTsInitialization(void)
 	 * any leftover data.
 	 */
 	if (!track_commit_timestamp)
-		DeactivateCommitTs(true);
+		DeactivateCommitTs();
 }
 
 /*
@@ -588,11 +584,11 @@ CommitTsParameterChange(bool newvalue, bool oldvalue)
 	 */
 	if (newvalue)
 	{
-		if (!track_commit_timestamp && !oldvalue)
+		if (!commitTsShared->commitTsActive)
 			ActivateCommitTs();
 	}
-	else if (!track_commit_timestamp && oldvalue)
-		DeactivateCommitTs(false);
+	else if (commitTsShared->commitTsActive)
+		DeactivateCommitTs();
 }
 
 /*
@@ -645,7 +641,7 @@ ActivateCommitTs(void)
 	}
 	LWLockRelease(CommitTsLock);
 
-	/* Finally, create the current segment file, if necessary */
+	/* Create the current segment file, if necessary */
 	if (!SimpleLruDoesPhysicalPageExist(CommitTsCtl, pageno))
 	{
 		int			slotno;
@@ -657,8 +653,10 @@ ActivateCommitTs(void)
 		LWLockRelease(CommitTsControlLock);
 	}
 
-	/* We can now replay xlog records from this module */
-	enable_during_recovery = true;
+	/* Change the activation status in shared memory. */
+	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
+	commitTsShared->commitTsActive = true;
+	LWLockRelease(CommitTsLock);
 }
 
 /*
@@ -672,7 +670,7 @@ ActivateCommitTs(void)
  * possibly-invalid data; also removes segments of old data.
  */
 static void
-DeactivateCommitTs(bool do_wal)
+DeactivateCommitTs(void)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
@@ -684,11 +682,26 @@ DeactivateCommitTs(bool do_wal)
 	CommitTsCtl->shared->latest_page_number = pageno;
 	LWLockRelease(CommitTsControlLock);
 
+	/*
+	 * Cleanup the status in the shared memory.
+	 *
+	 * We reset everything in the commitTsShared record to prevent user from
+	 * getting confusing data about last committed transaction on the standby
+	 * when the module was activated repeatedly on the primary.
+	 */
 	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
+
+	commitTsShared->commitTsActive = false;
+	commitTsShared->xidLastCommit = InvalidTransactionId;
+	TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+	commitTsShared->dataLastCommit.nodeid = InvalidRepOriginId;
+
 	ShmemVariableCache->oldestCommitTs = InvalidTransactionId;
 	ShmemVariableCache->newestCommitTs = InvalidTransactionId;
+
 	LWLockRelease(CommitTsLock);
 
+
 	/*
 	 * Remove *all* files.  This is necessary so that there are no leftover
 	 * files; in the case where this feature is later enabled after running
@@ -698,9 +711,6 @@ DeactivateCommitTs(bool do_wal)
 	 * tidy.)
 	 */
 	(void) SlruScanDirectory(CommitTsCtl, SlruScanDirCbDeleteAll, NULL);
-
-	/* No longer enabled on recovery */
-	enable_during_recovery = false;
 }
 
 /*
@@ -740,7 +750,7 @@ ExtendCommitTs(TransactionId newestXact)
 	int			pageno;
 
 	/* nothing to do if module not enabled */
-	if (!track_commit_timestamp && !enable_during_recovery)
+	if (!commitTsShared->commitTsActive)
 		return;
 
 	/*
@@ -768,7 +778,7 @@ ExtendCommitTs(TransactionId newestXact)
  * Note that we don't need to flush XLOG here.
  */
 void
-TruncateCommitTs(TransactionId oldestXact, bool do_wal)
+TruncateCommitTs(TransactionId oldestXact)
 {
 	int			cutoffPage;
 
@@ -784,8 +794,7 @@ TruncateCommitTs(TransactionId oldestXact, bool do_wal)
 		return;					/* nothing to remove */
 
 	/* Write XLOG record */
-	if (do_wal)
-		WriteTruncateXlogRec(cutoffPage);
+	WriteTruncateXlogRec(cutoffPage);
 
 	/* Now we can remove the old CommitTs segment(s) */
 	SimpleLruTruncate(CommitTsCtl, cutoffPage);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 6d55148..7c4ef58 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1140,7 +1140,7 @@ vac_truncate_clog(TransactionId frozenXID,
 	 * Truncate CLOG, multixact and CommitTs to the oldest computed value.
 	 */
 	TruncateCLOG(frozenXID);
-	TruncateCommitTs(frozenXID, true);
+	TruncateCommitTs(frozenXID);
 	TruncateMultiXact(minMulti, minmulti_datoid);
 
 	/*
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 1b95b58..3844bb3 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -40,7 +40,7 @@ extern void CompleteCommitTsInitialization(void);
 extern void ShutdownCommitTs(void);
 extern void CheckPointCommitTs(void);
 extern void ExtendCommitTs(TransactionId newestXact);
-extern void TruncateCommitTs(TransactionId oldestXact, bool do_wal);
+extern void TruncateCommitTs(TransactionId oldestXact);
 extern void SetCommitTsLimit(TransactionId oldestXact,
 				 TransactionId newestXact);
 extern void AdvanceOldestCommitTs(TransactionId oldestXact);
#12Robert Haas
robertmhaas@gmail.com
In reply to: Petr Jelinek (#11)
Re: [HACKERS] max_worker_processes on the standby

On Sat, Oct 17, 2015 at 5:37 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

I agree with that sentiment.

Attached patch adds variable to the shmem which is used for module
activation tracking - set to true in ActiveCommitTs() and false in
DeactivateCommitTs(). All the checks inside the commit_ts code were changed
to use this new variable. I also removed the static variable Alvaro added in
previous commit because it's not needed anymore.

That sounds good to me. On a quick read-through it looks OK too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#12)
1 attachment(s)
Re: [HACKERS] max_worker_processes on the standby

Robert Haas wrote:

On Sat, Oct 17, 2015 at 5:37 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

I agree with that sentiment.

Attached patch adds variable to the shmem which is used for module
activation tracking - set to true in ActiveCommitTs() and false in
DeactivateCommitTs(). All the checks inside the commit_ts code were changed
to use this new variable. I also removed the static variable Alvaro added in
previous commit because it's not needed anymore.

That sounds good to me. On a quick read-through it looks OK too.

A revised version is attached. Two changes on top of Petr's patch:

1. In the two "get" routines, we were reading the flag without grabbing
the lock. This is okay in a master server, because the flag cannot
change in flight, but in a standby it is possible to have the module
be deactivated while TS data is being queried. To fix this, simply move
the check for the active shmem flag a few lines down to be inside the
locked section.

There are two other places that also read the flag without grabbing the
lock. These look okay to me, so I added comments stating so.

2. In TransactionIdGetCommitTsData() we were grabbing lock, reading some
data, releasing lock, then examining the "cached" value in shmem without
a lock to see if it matched the function argument; if it's match, grab
lock again and return the correct data. In the original coding this
made sense because there was no locked section prior to reading the
cache, but after the patch this was pointless. Make it simpler by
moving the read of the cache inside the locked section too.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

committs-activation-fixes-2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 24b8291..b21a313 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -78,13 +78,21 @@ static SlruCtlData CommitTsCtlData;
 #define CommitTsCtl (&CommitTsCtlData)
 
 /*
- * We keep a cache of the last value set in shared memory.  This is protected
- * by CommitTsLock.
+ * We keep a cache of the last value set in shared memory.
+ *
+ * This is also good place to keep the activation status.  We keep this
+ * separate from the GUC so that the standby can activate the module if the
+ * primary has it active independently of the value of the GUC.
+ *
+ * This is protected by CommitTsLock.  In some places, we use commitTsActive
+ * without acquiring the lock; where this happens, a comment explains the
+ * rationale for it.
  */
 typedef struct CommitTimestampShared
 {
 	TransactionId xidLastCommit;
 	CommitTimestampEntry dataLastCommit;
+	bool	commitTsActive;
 } CommitTimestampShared;
 
 CommitTimestampShared *commitTsShared;
@@ -93,14 +101,6 @@ CommitTimestampShared *commitTsShared;
 /* GUC variable */
 bool		track_commit_timestamp;
 
-/*
- * When this is set, commit_ts is force-enabled during recovery.  This is so
- * that a standby can replay WAL records coming from a master with the setting
- * enabled.  (Note that this doesn't enable SQL access to the data; it's
- * effectively write-only until the GUC itself is enabled.)
- */
-static bool		enable_during_recovery;
-
 static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
 					 TransactionId *subxids, TimestampTz ts,
 					 RepOriginId nodeid, int pageno);
@@ -109,7 +109,7 @@ static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
 static int	ZeroCommitTsPage(int pageno, bool writeXlog);
 static bool CommitTsPagePrecedes(int page1, int page2);
 static void ActivateCommitTs(void);
-static void DeactivateCommitTs(bool do_wal);
+static void DeactivateCommitTs(void);
 static void WriteZeroPageXlogRec(int pageno);
 static void WriteTruncateXlogRec(int pageno);
 static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
@@ -149,10 +149,14 @@ TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
 	TransactionId newestXact;
 
 	/*
-	 * No-op if the module is not enabled, but allow writes in a standby
-	 * during recovery.
+	 * No-op if the module is not active.
+	 *
+	 * An unlocked read here is fine, because in a standby (the only place
+	 * where the flag can change in flight) this routine is only called by
+	 * the recovery process, which is also the only process which can change
+	 * the flag.
 	 */
-	if (!track_commit_timestamp && !enable_during_recovery)
+	if (!commitTsShared->commitTsActive)
 		return;
 
 	/*
@@ -283,30 +287,45 @@ TransactionIdGetCommitTsData(TransactionId xid, TimestampTz *ts,
 	TransactionId oldestCommitTs;
 	TransactionId newestCommitTs;
 
-	/* Error if module not enabled */
-	if (!track_commit_timestamp)
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("could not get commit timestamp data"),
-			  errhint("Make sure the configuration parameter \"%s\" is set.",
-					  "track_commit_timestamp")));
-
 	/* error if the given Xid doesn't normally commit */
 	if (!TransactionIdIsNormal(xid))
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 		errmsg("cannot retrieve commit timestamp for transaction %u", xid)));
 
-	/*
-	 * Return empty if the requested value is outside our valid range.
-	 */
 	LWLockAcquire(CommitTsLock, LW_SHARED);
+
+	/* Error if module not enabled */
+	if (!commitTsShared->commitTsActive)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("could not get commit timestamp data"),
+			  errhint("Make sure the configuration parameter \"%s\" is set.",
+					  "track_commit_timestamp")));
+
+	/*
+	 * If we're asked for the cached value, return that.  Otherwise, fall
+	 * through to read from SLRU.
+	 */
+	if (commitTsShared->xidLastCommit == xid)
+	{
+		*ts = commitTsShared->dataLastCommit.time;
+		if (nodeid)
+			*nodeid = commitTsShared->dataLastCommit.nodeid;
+
+		LWLockRelease(CommitTsLock);
+		return *ts != 0;
+	}
+
 	oldestCommitTs = ShmemVariableCache->oldestCommitTs;
 	newestCommitTs = ShmemVariableCache->newestCommitTs;
 	/* neither is invalid, or both are */
 	Assert(TransactionIdIsValid(oldestCommitTs) == TransactionIdIsValid(newestCommitTs));
 	LWLockRelease(CommitTsLock);
 
+	/*
+	 * Return empty if the requested value is outside our valid range.
+	 */
 	if (!TransactionIdIsValid(oldestCommitTs) ||
 		TransactionIdPrecedes(xid, oldestCommitTs) ||
 		TransactionIdPrecedes(newestCommitTs, xid))
@@ -317,27 +336,6 @@ TransactionIdGetCommitTsData(TransactionId xid, TimestampTz *ts,
 		return false;
 	}
 
-	/*
-	 * Use an unlocked atomic read on our cached value in shared memory; if
-	 * it's a hit, acquire a lock and read the data, after verifying that it's
-	 * still what we initially read.  Otherwise, fall through to read from
-	 * SLRU.
-	 */
-	if (commitTsShared->xidLastCommit == xid)
-	{
-		LWLockAcquire(CommitTsLock, LW_SHARED);
-		if (commitTsShared->xidLastCommit == xid)
-		{
-			*ts = commitTsShared->dataLastCommit.time;
-			if (nodeid)
-				*nodeid = commitTsShared->dataLastCommit.nodeid;
-
-			LWLockRelease(CommitTsLock);
-			return *ts != 0;
-		}
-		LWLockRelease(CommitTsLock);
-	}
-
 	/* lock is acquired by SimpleLruReadPage_ReadOnly */
 	slotno = SimpleLruReadPage_ReadOnly(CommitTsCtl, pageno, xid);
 	memcpy(&entry,
@@ -366,15 +364,16 @@ GetLatestCommitTsData(TimestampTz *ts, RepOriginId *nodeid)
 {
 	TransactionId xid;
 
+	LWLockAcquire(CommitTsLock, LW_SHARED);
+
 	/* Error if module not enabled */
-	if (!track_commit_timestamp)
+	if (!commitTsShared->commitTsActive)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("could not get commit timestamp data"),
 			  errhint("Make sure the configuration parameter \"%s\" is set.",
 					  "track_commit_timestamp")));
 
-	LWLockAcquire(CommitTsLock, LW_SHARED);
 	xid = commitTsShared->xidLastCommit;
 	if (ts)
 		*ts = commitTsShared->dataLastCommit.time;
@@ -493,6 +492,7 @@ CommitTsShmemInit(void)
 		commitTsShared->xidLastCommit = InvalidTransactionId;
 		TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
 		commitTsShared->dataLastCommit.nodeid = InvalidRepOriginId;
+		commitTsShared->commitTsActive = false;
 	}
 	else
 		Assert(found);
@@ -566,7 +566,7 @@ CompleteCommitTsInitialization(void)
 	 * any leftover data.
 	 */
 	if (!track_commit_timestamp)
-		DeactivateCommitTs(true);
+		DeactivateCommitTs();
 }
 
 /*
@@ -588,11 +588,11 @@ CommitTsParameterChange(bool newvalue, bool oldvalue)
 	 */
 	if (newvalue)
 	{
-		if (!track_commit_timestamp && !oldvalue)
+		if (!commitTsShared->commitTsActive)
 			ActivateCommitTs();
 	}
-	else if (!track_commit_timestamp && oldvalue)
-		DeactivateCommitTs(false);
+	else if (commitTsShared->commitTsActive)
+		DeactivateCommitTs();
 }
 
 /*
@@ -645,7 +645,7 @@ ActivateCommitTs(void)
 	}
 	LWLockRelease(CommitTsLock);
 
-	/* Finally, create the current segment file, if necessary */
+	/* Create the current segment file, if necessary */
 	if (!SimpleLruDoesPhysicalPageExist(CommitTsCtl, pageno))
 	{
 		int			slotno;
@@ -657,8 +657,10 @@ ActivateCommitTs(void)
 		LWLockRelease(CommitTsControlLock);
 	}
 
-	/* We can now replay xlog records from this module */
-	enable_during_recovery = true;
+	/* Change the activation status in shared memory. */
+	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
+	commitTsShared->commitTsActive = true;
+	LWLockRelease(CommitTsLock);
 }
 
 /*
@@ -672,21 +674,25 @@ ActivateCommitTs(void)
  * possibly-invalid data; also removes segments of old data.
  */
 static void
-DeactivateCommitTs(bool do_wal)
+DeactivateCommitTs(void)
 {
-	TransactionId xid = ShmemVariableCache->nextXid;
-	int			pageno = TransactionIdToCTsPage(xid);
-
 	/*
-	 * Re-Initialize our idea of the latest page number.
+	 * Cleanup the status in the shared memory.
+	 *
+	 * We reset everything in the commitTsShared record to prevent user from
+	 * getting confusing data about last committed transaction on the standby
+	 * when the module was activated repeatedly on the primary.
 	 */
-	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
-	CommitTsCtl->shared->latest_page_number = pageno;
-	LWLockRelease(CommitTsControlLock);
-
 	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
+
+	commitTsShared->commitTsActive = false;
+	commitTsShared->xidLastCommit = InvalidTransactionId;
+	TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+	commitTsShared->dataLastCommit.nodeid = InvalidRepOriginId;
+
 	ShmemVariableCache->oldestCommitTs = InvalidTransactionId;
 	ShmemVariableCache->newestCommitTs = InvalidTransactionId;
+
 	LWLockRelease(CommitTsLock);
 
 	/*
@@ -697,10 +703,9 @@ DeactivateCommitTs(bool do_wal)
 	 * be overwritten anyway when we wrap around, but it seems better to be
 	 * tidy.)
 	 */
+	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
 	(void) SlruScanDirectory(CommitTsCtl, SlruScanDirCbDeleteAll, NULL);
-
-	/* No longer enabled on recovery */
-	enable_during_recovery = false;
+	LWLockRelease(CommitTsControlLock);
 }
 
 /*
@@ -739,8 +744,13 @@ ExtendCommitTs(TransactionId newestXact)
 {
 	int			pageno;
 
-	/* nothing to do if module not enabled */
-	if (!track_commit_timestamp && !enable_during_recovery)
+	/*
+	 * Nothing to do if module not enabled.  Note we do an unlocked read of the
+	 * flag here, which is okay because this routine is only called from
+	 * GetNewTransactionId, which is never called in a standby.
+	 */
+	Assert(!InRecovery);
+	if (!commitTsShared->commitTsActive)
 		return;
 
 	/*
@@ -768,7 +778,7 @@ ExtendCommitTs(TransactionId newestXact)
  * Note that we don't need to flush XLOG here.
  */
 void
-TruncateCommitTs(TransactionId oldestXact, bool do_wal)
+TruncateCommitTs(TransactionId oldestXact)
 {
 	int			cutoffPage;
 
@@ -784,8 +794,7 @@ TruncateCommitTs(TransactionId oldestXact, bool do_wal)
 		return;					/* nothing to remove */
 
 	/* Write XLOG record */
-	if (do_wal)
-		WriteTruncateXlogRec(cutoffPage);
+	WriteTruncateXlogRec(cutoffPage);
 
 	/* Now we can remove the old CommitTs segment(s) */
 	SimpleLruTruncate(CommitTsCtl, cutoffPage);
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 6d55148..7c4ef58 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1140,7 +1140,7 @@ vac_truncate_clog(TransactionId frozenXID,
 	 * Truncate CLOG, multixact and CommitTs to the oldest computed value.
 	 */
 	TruncateCLOG(frozenXID);
-	TruncateCommitTs(frozenXID, true);
+	TruncateCommitTs(frozenXID);
 	TruncateMultiXact(minMulti, minmulti_datoid);
 
 	/*
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 1b95b58..3844bb3 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -40,7 +40,7 @@ extern void CompleteCommitTsInitialization(void);
 extern void ShutdownCommitTs(void);
 extern void CheckPointCommitTs(void);
 extern void ExtendCommitTs(TransactionId newestXact);
-extern void TruncateCommitTs(TransactionId oldestXact, bool do_wal);
+extern void TruncateCommitTs(TransactionId oldestXact);
 extern void SetCommitTsLimit(TransactionId oldestXact,
 				 TransactionId newestXact);
 extern void AdvanceOldestCommitTs(TransactionId oldestXact);
#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#13)

Alvaro Herrera wrote:

Robert Haas wrote:

On Sat, Oct 17, 2015 at 5:37 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

I agree with that sentiment.

Attached patch adds variable to the shmem which is used for module
activation tracking - set to true in ActiveCommitTs() and false in
DeactivateCommitTs(). All the checks inside the commit_ts code were changed
to use this new variable. I also removed the static variable Alvaro added in
previous commit because it's not needed anymore.

That sounds good to me. On a quick read-through it looks OK too.

A revised version is attached.

Pushed.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Fujii Masao
masao.fujii@gmail.com
In reply to: Alvaro Herrera (#14)

On Wed, Oct 28, 2015 at 3:07 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Alvaro Herrera wrote:

Robert Haas wrote:

On Sat, Oct 17, 2015 at 5:37 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

I agree with that sentiment.

Attached patch adds variable to the shmem which is used for module
activation tracking - set to true in ActiveCommitTs() and false in
DeactivateCommitTs(). All the checks inside the commit_ts code were changed
to use this new variable. I also removed the static variable Alvaro added in
previous commit because it's not needed anymore.

That sounds good to me. On a quick read-through it looks OK too.

A revised version is attached.

Pushed.

I found another strange behavior on track_commit_timestamp.
Here are the steps to reproduce it.

1. Start the master and standby servers with track_commit_timestamp enabled.
Since committs is activated in standby, pg_last_committed_xact() can
successfully return the timestamp of last transaction as expected.

2. Disable track_commit_timestamp in the master and restart the master server.
The parameter-change WAL record is streamed to the standby and committs
is deactivated. pg_last_committed_xact() causes an ERROR in the standby.

3. Run checkpoint in the master.

4. Run restartpoint in the standby after the checkpoint WAL record generated
in #3 is replicated to the standby.

5. Restart the standby server.
Committs is activated in the standby because track_commit_timestamp is
enabled. Since there is no parameter-change WAL record since last
restartpoint, committs is not deactivated. So pg_last_committed_xact()
can successfully return the timestamp.

6. Enable track_commit_timestamp in the master and restart the master server.

7. Disable track_commit_timestamp in the master and restart the master server.
Back to the same situation as #2. That is, pg_last_committed_xact() causes
an ERROR.

8. Promote the standby server to new master.
Since committs is still inactive even after the promotion,
pg_last_committed_xact() keeps causing an ERROR though
track_commit_timestamp is on.

What I think strange is that pg_last_committed_xact() behaves differently
in #2, #5, and #7 though the settings of track_commit_timestamp are same
in both servers, i.e., it's disabled in the master but enabled in the standby.

I was thinking that whether committs is active or not should depend on
the setting of track_commit_timestamp *after* the promotion.
The behavior in #8 looked strange.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#15)
Re: [HACKERS] max_worker_processes on the standby

On Thu, Oct 29, 2015 at 5:41 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

I found another strange behavior on track_commit_timestamp.
Here are the steps to reproduce it.

1. Start the master and standby servers with track_commit_timestamp enabled.
Since committs is activated in standby, pg_last_committed_xact() can
successfully return the timestamp of last transaction as expected.

2. Disable track_commit_timestamp in the master and restart the master server.
The parameter-change WAL record is streamed to the standby and committs
is deactivated. pg_last_committed_xact() causes an ERROR in the standby.

3. Run checkpoint in the master.

4. Run restartpoint in the standby after the checkpoint WAL record generated
in #3 is replicated to the standby.

5. Restart the standby server.
Committs is activated in the standby because track_commit_timestamp is
enabled.

This seems wrong already at this point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#17Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Fujii Masao (#15)
1 attachment(s)
Re: [HACKERS] max_worker_processes on the standby

I paraphrase Fujii Masao, who wrote:

1. Start the master and standby servers with track_commit_timestamp enabled.
2. Disable track_commit_timestamp in the master and restart the master server.
3. Run checkpoint in the master.
4. Run restartpoint in the standby after the checkpoint WAL record generated
5. Restart the standby server.
6. Enable track_commit_timestamp in the master and restart the master server.
7. Disable track_commit_timestamp in the master and restart the master server.

What I think strange is that pg_last_committed_xact() behaves differently
in #2, #5, and #7 though the settings of track_commit_timestamp are same
in both servers, i.e., it's disabled in the master but enabled in the standby.

Interesting, thanks. You're right that this behaves oddly.

I think in order to fix these two points (#5 and #7), we need to make
the standby not honour the GUC at all, i.e. only heed what the master
setting is.

8. Promote the standby server to new master.
Since committs is still inactive even after the promotion,
pg_last_committed_xact() keeps causing an ERROR though
track_commit_timestamp is on.

I was thinking that whether committs is active or not should depend on
the setting of track_commit_timestamp *after* the promotion.
The behavior in #8 looked strange.

To fix this problem, we can re-run StartupCommitTs() after recovery is
done, this time making sure to honour the GUC setting.

I haven't tried the scenarios we fixed with the previous round of
patching, but this patch seems to close the problems just reported.
(My next step will be looking over the recovery test framework by
Michael et al, so that I can apply a few tests for this stuff.)
In the meantime, if you can look this over I would appreciate it.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

committs.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 349228d..03d9743 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -510,7 +510,7 @@ BootStrapCommitTs(void)
 	/*
 	 * Nothing to do here at present, unlike most other SLRU modules; segments
 	 * are created when the server is started with this module enabled. See
-	 * StartupCommitTs.
+	 * ActivateCommitTs.
 	 */
 }
 
@@ -544,13 +544,13 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
  * configuration.
  */
 void
-StartupCommitTs(bool force_enable)
+StartupCommitTs(bool honour_config, bool force_enable)
 {
 	/*
 	 * If the module is not enabled, there's nothing to do here.  The module
 	 * could still be activated from elsewhere.
 	 */
-	if (track_commit_timestamp || force_enable)
+	if ((honour_config && track_commit_timestamp) || force_enable)
 		ActivateCommitTs();
 }
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 08d1682..0ed6822 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6567,7 +6567,7 @@ StartupXLOG(void)
 			 * maintained during recovery and need not be started yet.
 			 */
 			StartupCLOG();
-			StartupCommitTs(ControlFile->track_commit_timestamp);
+			StartupCommitTs(false, ControlFile->track_commit_timestamp);
 			StartupSUBTRANS(oldestActiveXID);
 
 			/*
@@ -7336,7 +7336,7 @@ StartupXLOG(void)
 	if (standbyState == STANDBY_DISABLED)
 	{
 		StartupCLOG();
-		StartupCommitTs(false);
+		StartupCommitTs(true, false);
 		StartupSUBTRANS(oldestActiveXID);
 	}
 
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 3844bb3..991dd42 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -34,7 +34,7 @@ extern Size CommitTsShmemBuffers(void);
 extern Size CommitTsShmemSize(void);
 extern void CommitTsShmemInit(void);
 extern void BootStrapCommitTs(void);
-extern void StartupCommitTs(bool force_enable);
+extern void StartupCommitTs(bool honour_config, bool force_enable);
 extern void CommitTsParameterChange(bool xlrecvalue, bool pgcontrolvalue);
 extern void CompleteCommitTsInitialization(void);
 extern void ShutdownCommitTs(void);
#18Petr Jelinek
petr@2ndquadrant.com
In reply to: Alvaro Herrera (#17)
1 attachment(s)

On 2015-11-16 22:43, Alvaro Herrera wrote:

I paraphrase Fujii Masao, who wrote:

1. Start the master and standby servers with track_commit_timestamp enabled.
2. Disable track_commit_timestamp in the master and restart the master server.
3. Run checkpoint in the master.
4. Run restartpoint in the standby after the checkpoint WAL record generated
5. Restart the standby server.
6. Enable track_commit_timestamp in the master and restart the master server.
7. Disable track_commit_timestamp in the master and restart the master server.

What I think strange is that pg_last_committed_xact() behaves differently
in #2, #5, and #7 though the settings of track_commit_timestamp are same
in both servers, i.e., it's disabled in the master but enabled in the standby.

Interesting, thanks. You're right that this behaves oddly.

I think in order to fix these two points (#5 and #7), we need to make
the standby not honour the GUC at all, i.e. only heed what the master
setting is.

8. Promote the standby server to new master.
Since committs is still inactive even after the promotion,
pg_last_committed_xact() keeps causing an ERROR though
track_commit_timestamp is on.

I was thinking that whether committs is active or not should depend on
the setting of track_commit_timestamp *after* the promotion.
The behavior in #8 looked strange.

To fix this problem, we can re-run StartupCommitTs() after recovery is
done, this time making sure to honour the GUC setting.

I haven't tried the scenarios we fixed with the previous round of
patching, but this patch seems to close the problems just reported.
(My next step will be looking over the recovery test framework by
Michael et al, so that I can apply a few tests for this stuff.)
In the meantime, if you can look this over I would appreciate it.

While this seems good, I'd code it slightly differently. I didn't like
the addition of new bool when it's not really needed. This brings the
question if we actually need the BootStrapCommitTs and StartupCommitTs
functions which really don't do much though.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

committs-activation-fix.patchbinary/octet-stream; name=committs-activation-fix.patchDownload
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 349228d..edb0d2b 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -510,7 +510,7 @@ BootStrapCommitTs(void)
 	/*
 	 * Nothing to do here at present, unlike most other SLRU modules; segments
 	 * are created when the server is started with this module enabled. See
-	 * StartupCommitTs.
+	 * ActivateCommitTs.
 	 */
 }
 
@@ -544,13 +544,13 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
  * configuration.
  */
 void
-StartupCommitTs(bool force_enable)
+StartupCommitTs(bool enabled)
 {
 	/*
 	 * If the module is not enabled, there's nothing to do here.  The module
 	 * could still be activated from elsewhere.
 	 */
-	if (track_commit_timestamp || force_enable)
+	if (enabled)
 		ActivateCommitTs();
 }
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f17f834..ee0b959 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7339,7 +7339,7 @@ StartupXLOG(void)
 	if (standbyState == STANDBY_DISABLED)
 	{
 		StartupCLOG();
-		StartupCommitTs(false);
+		StartupCommitTs(track_commit_timestamp);
 		StartupSUBTRANS(oldestActiveXID);
 	}
 
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 3844bb3..f5b3969 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -34,7 +34,7 @@ extern Size CommitTsShmemBuffers(void);
 extern Size CommitTsShmemSize(void);
 extern void CommitTsShmemInit(void);
 extern void BootStrapCommitTs(void);
-extern void StartupCommitTs(bool force_enable);
+extern void StartupCommitTs(bool enabled);
 extern void CommitTsParameterChange(bool xlrecvalue, bool pgcontrolvalue);
 extern void CompleteCommitTsInitialization(void);
 extern void ShutdownCommitTs(void);
#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Petr Jelinek (#18)

Petr Jelinek wrote:

While this seems good, I'd code it slightly differently. I didn't like the
addition of new bool when it's not really needed. This brings the question
if we actually need the BootStrapCommitTs and StartupCommitTs functions
which really don't do much though.

Thanks, it's certainly nice that this got simpler. (I'm not in love
with the idea of having xlog.c know what flag needs to pass in each
case, but I don't see any option that's more convenient.)

We weren't quite there however -- namely this patch didn't close problem
#8 in Fujii-san rundown. The problem is that when promoting,
standbyState is not STANDBY_DISABLED but STANDBY_SNAPSHOT_READY (which
is a bit surprising but not something this patch should fix). To fix
this I took the StartupCommitTs() call out of that block, so that it
runs inconditionally.

I also changed the hint message:

postgres=# select * from pg_last_committed_xact();
ERROR: could not get commit timestamp data
HINT: Make sure the configuration parameter "track_commit_timestamp" is set in the master server.

Otherwise this would be very confusing:

postgres=# select * from pg_last_committed_xact();
ERROR: could not get commit timestamp data
HINT: Make sure the configuration parameter "track_commit_timestamp" is set.
postgres=# show track_commit_timestamp ;
track_commit_timestamp
------------------------
on
(1 fila)

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Fujii Masao
masao.fujii@gmail.com
In reply to: Alvaro Herrera (#17)

On Tue, Nov 17, 2015 at 6:43 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I paraphrase Fujii Masao, who wrote:

1. Start the master and standby servers with track_commit_timestamp enabled.
2. Disable track_commit_timestamp in the master and restart the master server.
3. Run checkpoint in the master.
4. Run restartpoint in the standby after the checkpoint WAL record generated
5. Restart the standby server.
6. Enable track_commit_timestamp in the master and restart the master server.
7. Disable track_commit_timestamp in the master and restart the master server.

What I think strange is that pg_last_committed_xact() behaves differently
in #2, #5, and #7 though the settings of track_commit_timestamp are same
in both servers, i.e., it's disabled in the master but enabled in the standby.

Interesting, thanks. You're right that this behaves oddly.

I think in order to fix these two points (#5 and #7), we need to make
the standby not honour the GUC at all, i.e. only heed what the master
setting is.

8. Promote the standby server to new master.
Since committs is still inactive even after the promotion,
pg_last_committed_xact() keeps causing an ERROR though
track_commit_timestamp is on.

I was thinking that whether committs is active or not should depend on
the setting of track_commit_timestamp *after* the promotion.
The behavior in #8 looked strange.

To fix this problem, we can re-run StartupCommitTs() after recovery is
done, this time making sure to honour the GUC setting.

I haven't tried the scenarios we fixed with the previous round of
patching, but this patch seems to close the problems just reported.
(My next step will be looking over the recovery test framework by
Michael et al, so that I can apply a few tests for this stuff.)
In the meantime, if you can look this over I would appreciate it.

Sorry for not reviewing the patch before you push it...

In HEAD, I ran very simple test case:

1. enable track_commit_timestamp
2. start the server
3. run some transactions
4. execute pg_last_committed_xact() -- returns non-null values
5. shutdown the server with immdiate mode
6. restart the server -- crash recovery happens
7. execute pg_last_committed_xact()

The last call of pg_last_committed_xact() returns NULL values, which means
that the xid and timestamp information of the last committed transaction
disappeared by crash recovery. Isn't this a bug?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Fujii Masao (#20)

Fujii Masao wrote:

Sorry for not reviewing the patch before you push it...

In HEAD, I ran very simple test case:

1. enable track_commit_timestamp
2. start the server
3. run some transactions
4. execute pg_last_committed_xact() -- returns non-null values
5. shutdown the server with immdiate mode
6. restart the server -- crash recovery happens
7. execute pg_last_committed_xact()

The last call of pg_last_committed_xact() returns NULL values, which means
that the xid and timestamp information of the last committed transaction
disappeared by crash recovery. Isn't this a bug?

Hm, not really, because the status of the "last" transaction is kept in
shared memory as a cache and not expected to live across a restart.
However, I tested the equivalent scenario:

alvherre=# create table fg();
CREATE TABLE

alvherre=# select ts.* from pg_class,pg_xact_commit_timestamp(xmin) ts where relname = 'fg';
ts
-------------------------------
2015-12-04 12:41:48.017976-03
(1 fila)

then crash the server, and after recovery the data is gone:

alvherre=# select ts.*, xmin, c.relname from pg_class c,pg_xact_commit_timestamp(xmin) ts where relname = 'fg';
ts | xmin | relname
----+------+---------
| 630 | fg
(1 fila)

Not sure what is going on; my reading of the code certainly says that
the data should be there. I'm looking into it.

I also noticed that I didn't actually push the whole of the patch
yesterday -- I neglected to "git add" the latest changes, the ones that
fix the promotion scenario :-( so the commit messages is misleading
because it describes something that's not there.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Fujii Masao
masao.fujii@gmail.com
In reply to: Alvaro Herrera (#21)
Re: [HACKERS] max_worker_processes on the standby

On Sat, Dec 5, 2015 at 12:56 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Fujii Masao wrote:

Sorry for not reviewing the patch before you push it...

In HEAD, I ran very simple test case:

1. enable track_commit_timestamp
2. start the server
3. run some transactions
4. execute pg_last_committed_xact() -- returns non-null values
5. shutdown the server with immdiate mode
6. restart the server -- crash recovery happens
7. execute pg_last_committed_xact()

The last call of pg_last_committed_xact() returns NULL values, which means
that the xid and timestamp information of the last committed transaction
disappeared by crash recovery. Isn't this a bug?

Hm, not really, because the status of the "last" transaction is kept in
shared memory as a cache and not expected to live across a restart.
However, I tested the equivalent scenario:

alvherre=# create table fg();
CREATE TABLE

alvherre=# select ts.* from pg_class,pg_xact_commit_timestamp(xmin) ts where relname = 'fg';
ts
-------------------------------
2015-12-04 12:41:48.017976-03
(1 fila)

then crash the server, and after recovery the data is gone:

alvherre=# select ts.*, xmin, c.relname from pg_class c,pg_xact_commit_timestamp(xmin) ts where relname = 'fg';
ts | xmin | relname
----+------+---------
| 630 | fg
(1 fila)

Not sure what is going on; my reading of the code certainly says that
the data should be there. I'm looking into it.

I also noticed that I didn't actually push the whole of the patch
yesterday -- I neglected to "git add" the latest changes, the ones that
fix the promotion scenario :-( so the commit messages is misleading
because it describes something that's not there.

So firstly you will push those "latest" changes soon?

Regards,

--
Fujii Masao

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#23Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#22)

On Mon, Dec 7, 2015 at 8:33 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Dec 5, 2015 at 12:56 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Fujii Masao wrote:

Sorry for not reviewing the patch before you push it...

In HEAD, I ran very simple test case:

1. enable track_commit_timestamp
2. start the server
3. run some transactions
4. execute pg_last_committed_xact() -- returns non-null values
5. shutdown the server with immdiate mode
6. restart the server -- crash recovery happens
7. execute pg_last_committed_xact()

The last call of pg_last_committed_xact() returns NULL values, which means
that the xid and timestamp information of the last committed transaction
disappeared by crash recovery. Isn't this a bug?

Hm, not really, because the status of the "last" transaction is kept in
shared memory as a cache and not expected to live across a restart.
However, I tested the equivalent scenario:

alvherre=# create table fg();
CREATE TABLE

alvherre=# select ts.* from pg_class,pg_xact_commit_timestamp(xmin) ts where relname = 'fg';
ts
-------------------------------
2015-12-04 12:41:48.017976-03
(1 fila)

then crash the server, and after recovery the data is gone:

alvherre=# select ts.*, xmin, c.relname from pg_class c,pg_xact_commit_timestamp(xmin) ts where relname = 'fg';
ts | xmin | relname
----+------+---------
| 630 | fg
(1 fila)

Not sure what is going on; my reading of the code certainly says that
the data should be there. I'm looking into it.

I also noticed that I didn't actually push the whole of the patch
yesterday -- I neglected to "git add" the latest changes, the ones that
fix the promotion scenario :-( so the commit messages is misleading
because it describes something that's not there.

So firstly you will push those "latest" changes soon?

It seems like these changes haven't been pushed yet, and unfortunately
that's probably a beta blocker.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#23)
Re: [HACKERS] max_worker_processes on the standby

Robert Haas wrote:

On Mon, Dec 7, 2015 at 8:33 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

So firstly you will push those "latest" changes soon?

It seems like these changes haven't been pushed yet, and unfortunately
that's probably a beta blocker.

I'm on this.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#25Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#24)
Re: [HACKERS] max_worker_processes on the standby

On Wed, Dec 9, 2015 at 4:21 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Robert Haas wrote:

On Mon, Dec 7, 2015 at 8:33 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

So firstly you will push those "latest" changes soon?

It seems like these changes haven't been pushed yet, and unfortunately
that's probably a beta blocker.

I'm on this.

Uh, when are you going to do this? At this point we've probably lost
another week getting rc1 out the door.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#21)
Re: [HACKERS] max_worker_processes on the standby

Alvaro Herrera wrote:

Not sure what is going on; my reading of the code certainly says that
the data should be there. I'm looking into it.

I also noticed that I didn't actually push the whole of the patch
yesterday -- I neglected to "git add" the latest changes, the ones that
fix the promotion scenario :-( so the commit messages is misleading
because it describes something that's not there.

Pushed a fix.

I also wrote some tests using the RecoveryNode stuff submitted by
Michael Paquier. These aren't yet pushed, because we don't have the
framework; once we have that I can push them too. As far as I can tell,
these tests exercise all the cases that have been pointed out so far; I
can see some of them fail if I run on previous commits.

Thanks for the continued testing.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs