pg_dump -j against standbys

Started by Magnus Haganderover 9 years ago10 messages
#1Magnus Hagander
magnus@hagander.net
1 attachment(s)

pg_dump -j against a standby server returns a pretty bad error message when
pointed at a standby node:

pg_dump: [archiver (db)] query failed: ERROR: cannot assign TransactionIds
during recovery
pg_dump: [archiver (db)] query was: SELECT pg_export_snapshot()

That looks quite scary to the user, and also throws errors in the server
log which monitoring tools or DBAs will react to.

PFA a patch that changes this to:
pg_dump: Synchronized snapshots are not supported on standby servers.
Run with --no-synchronized-snapshots instead if you do not need
synchronized snapshots.

which is a message similar (the hint identical) the one you get if you
point it at a version older than 9.2 which doesn't have sync snapshots.

I think the cleanest way to do it is to just track if it's a standby in the
AH struct as written.

Comments?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

pg_dump_sync_snapshots.patchtext/x-patch; charset=US-ASCII; name=pg_dump_sync_snapshots.patchDownload
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 83f6029..f94caa3 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -173,6 +173,7 @@ typedef struct Archive
 	int			verbose;
 	char	   *remoteVersionStr;		/* server's version string */
 	int			remoteVersion;	/* same in numeric form */
+	bool		isStandby;		/* is server a standby node */
 
 	int			minRemoteVersion;		/* allowable range */
 	int			maxRemoteVersion;
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index 35ce945..dc073a6 100644
--- a/src/bin/pg_dump/pg_backup_db.c
+++ b/src/bin/pg_dump/pg_backup_db.c
@@ -37,6 +37,7 @@ _check_database_version(ArchiveHandle *AH)
 {
 	const char *remoteversion_str;
 	int			remoteversion;
+	PGresult   *res;
 
 	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
 	remoteversion = PQserverVersion(AH->connection);
@@ -56,6 +57,12 @@ _check_database_version(ArchiveHandle *AH)
 				  remoteversion_str, progname, PG_VERSION);
 		exit_horribly(NULL, "aborting because of server version mismatch\n");
 	}
+
+	/* Identify if this is a standby node */
+	res = ExecuteSqlQuery((Archive *) AH, "SELECT pg_catalog.pg_is_in_recovery()", PGRES_TUPLES_OK);
+
+	AH->public.isStandby = (strcmp(PQgetvalue(res, 0, 0), "t") == 0);
+	PQclear(res);
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 1267afb..e0c64cf 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -651,19 +651,13 @@ main(int argc, char **argv)
 	 * When running against 9.0 or later, check if we are in recovery mode,
 	 * which means we are on a hot standby.
 	 */
-	if (fout->remoteVersion >= 90000)
+	if (fout->remoteVersion >= 90000 && fout->isStandby)
 	{
-		PGresult   *res = ExecuteSqlQueryForSingleRow(fout, "SELECT pg_catalog.pg_is_in_recovery()");
-
-		if (strcmp(PQgetvalue(res, 0, 0), "t") == 0)
-		{
-			/*
-			 * On hot standby slaves, never try to dump unlogged table data,
-			 * since it will just throw an error.
-			 */
-			dopt.no_unlogged_table_data = true;
-		}
-		PQclear(res);
+		/*
+		 * On hot standby slaves, never try to dump unlogged table data, since
+		 * it will just throw an error.
+		 */
+		dopt.no_unlogged_table_data = true;
 	}
 
 	/* Select the appropriate subquery to convert user IDs to names */
@@ -1096,7 +1090,16 @@ setup_connection(Archive *AH, const char *dumpencoding,
 	else if (AH->numWorkers > 1 &&
 			 AH->remoteVersion >= 90200 &&
 			 !dopt->no_synchronized_snapshots)
+	{
+		if (!dopt->no_synchronized_snapshots && AH->isStandby)
+			exit_horribly(NULL,
+			 "Synchronized snapshots are not supported on standby servers.\n"
+						  "Run with --no-synchronized-snapshots instead if you do not need\n"
+						  "synchronized snapshots.\n");
+
+
 		AH->sync_snapshot_id = get_synchronized_snapshot(AH);
+	}
 }
 
 static void
#2Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Magnus Hagander (#1)
Re: pg_dump -j against standbys

On 5/24/16 6:27 AM, Magnus Hagander wrote:

pg_dump: Synchronized snapshots are not supported on standby servers.

Would it be that much harder to support this use case, perhaps by having
the replica request the snapshot from the master on the client's behalf?
It certainly doesn't surprise me that people would want to parallel dump
from a replica...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#1)
Re: pg_dump -j against standbys

Magnus Hagander <magnus@hagander.net> writes:

I think the cleanest way to do it is to just track if it's a standby in the
AH struct as written.

Comments?

This patch will cause pg_dump to fail entirely against pre-9.0 servers.
You need to execute the test conditionally.

Also, in view of that, this test

-	if (fout->remoteVersion >= 90000)
+	if (fout->remoteVersion >= 90000 && fout->isStandby)

could be reduced to just "if (fout->isStandby)". And the comment in
front of it could stand some attention (possibly just replace it with
the one that's currently within the inner if() ... actually, that
comment should move to where you moved the test to, no?)

Also, why didn't you keep using ExecuteSqlQueryForSingleRow()?
As coded, you're losing a sanity check that the query gives exactly
one row back.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Michael Paquier
michael.paquier@gmail.com
In reply to: Jim Nasby (#2)
Re: pg_dump -j against standbys

On Tue, May 24, 2016 at 7:02 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 5/24/16 6:27 AM, Magnus Hagander wrote:

pg_dump: Synchronized snapshots are not supported on standby servers.

Would it be that much harder to support this use case, perhaps by having the
replica request the snapshot from the master on the client's behalf? It
certainly doesn't surprise me that people would want to parallel dump from a
replica...

For HEAD, I totally agree. However, it seems to me that what Magnus is
looking for here is a backpatch that would allow users to get more
useful information related to the error that's happening: pg_dump
provides now an error that does not help at all in diagnosing what the
problem is. And that is confusing.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#3)
Re: pg_dump -j against standbys

On Tue, May 24, 2016 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

I think the cleanest way to do it is to just track if it's a standby in

the

AH struct as written.

Comments?

This patch will cause pg_dump to fail entirely against pre-9.0 servers.
You need to execute the test conditionally.

Ugh. can I blame coding while too jetlagged?

Also, in view of that, this test

-       if (fout->remoteVersion >= 90000)
+       if (fout->remoteVersion >= 90000 && fout->isStandby)

could be reduced to just "if (fout->isStandby)". And the comment in
front of it could stand some attention (possibly just replace it with
the one that's currently within the inner if() ... actually, that
comment should move to where you moved the test to, no?)

True. Will fix.

Also, why didn't you keep using ExecuteSqlQueryForSingleRow()?
As coded, you're losing a sanity check that the query gives exactly
one row back.

The reason I did that is that ExecuteSqlQueryForSingleRow() is a static
method in pg_dump.c. I was planning to go back and review that, and
consider moving it, but I forgot it :S

I think the clean thing is probably to use that one, and also move it over
to not be a static method in pg_dump.c, but instead sit next to
ExecuteSqlQuery in pg_backup_db.c. Do you agree that's reasonable, and
something that's OK to backpatch?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#6Marko Tiikkaja
marko@joh.to
In reply to: Magnus Hagander (#5)
Re: pg_dump -j against standbys

On 25/05/16 15:59, Magnus Hagander wrote:

On Tue, May 24, 2016 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This patch will cause pg_dump to fail entirely against pre-9.0 servers.
You need to execute the test conditionally.

Ugh. can I blame coding while too jetlagged?

You could try blaming Magnus. Oh, wait..

.m

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#5)
Re: pg_dump -j against standbys

Magnus Hagander <magnus@hagander.net> writes:

Also, why didn't you keep using ExecuteSqlQueryForSingleRow()?

The reason I did that is that ExecuteSqlQueryForSingleRow() is a static
method in pg_dump.c. I was planning to go back and review that, and
consider moving it, but I forgot it :S

I think the clean thing is probably to use that one, and also move it over
to not be a static method in pg_dump.c, but instead sit next to
ExecuteSqlQuery in pg_backup_db.c. Do you agree that's reasonable, and
something that's OK to backpatch?

No objection here, since it wouldn't be exposed outside pg_dump in any
case.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#7)
1 attachment(s)
Re: pg_dump -j against standbys

On Wed, May 25, 2016 at 4:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

Also, why didn't you keep using ExecuteSqlQueryForSingleRow()?

The reason I did that is that ExecuteSqlQueryForSingleRow() is a static
method in pg_dump.c. I was planning to go back and review that, and
consider moving it, but I forgot it :S

I think the clean thing is probably to use that one, and also move it

over

to not be a static method in pg_dump.c, but instead sit next to
ExecuteSqlQuery in pg_backup_db.c. Do you agree that's reasonable, and
something that's OK to backpatch?

No objection here, since it wouldn't be exposed outside pg_dump in any
case.

Here's an updated patch based on this,and the other feedback.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

pg_dump_sync_snapshots_2.patchtext/x-patch; charset=US-ASCII; name=pg_dump_sync_snapshots_2.patchDownload
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 83f6029..f94caa3 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -173,6 +173,7 @@ typedef struct Archive
 	int			verbose;
 	char	   *remoteVersionStr;		/* server's version string */
 	int			remoteVersion;	/* same in numeric form */
+	bool		isStandby;		/* is server a standby node */
 
 	int			minRemoteVersion;		/* allowable range */
 	int			maxRemoteVersion;
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index 35ce945..818bc9e 100644
--- a/src/bin/pg_dump/pg_backup_db.c
+++ b/src/bin/pg_dump/pg_backup_db.c
@@ -37,6 +37,7 @@ _check_database_version(ArchiveHandle *AH)
 {
 	const char *remoteversion_str;
 	int			remoteversion;
+	PGresult   *res;
 
 	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
 	remoteversion = PQserverVersion(AH->connection);
@@ -56,6 +57,20 @@ _check_database_version(ArchiveHandle *AH)
 				  remoteversion_str, progname, PG_VERSION);
 		exit_horribly(NULL, "aborting because of server version mismatch\n");
 	}
+
+	/*
+	 * When running against 9.0 or later, check if we are in recovery mode,
+	 * which means we are on a hot standby.
+	 */
+	if (remoteversion >= 90000)
+	{
+		res = ExecuteSqlQueryForSingleRow((Archive *) AH, "SELECT pg_catalog.pg_is_in_recovery()");
+
+		AH->public.isStandby = (strcmp(PQgetvalue(res, 0, 0), "t") == 0);
+		PQclear(res);
+	}
+	else
+		AH->public.isStandby = false;
 }
 
 /*
@@ -389,6 +404,29 @@ ExecuteSqlQuery(Archive *AHX, const char *query, ExecStatusType status)
 }
 
 /*
+ * Execute an SQL query and verify that we got exactly one row back.
+ */
+PGresult *
+ExecuteSqlQueryForSingleRow(Archive *fout, char *query)
+{
+	PGresult   *res;
+	int			ntups;
+
+	res = ExecuteSqlQuery(fout, query, PGRES_TUPLES_OK);
+
+	/* Expecting a single result only */
+	ntups = PQntuples(res);
+	if (ntups != 1)
+		exit_horribly(NULL,
+					  ngettext("query returned %d row instead of one: %s\n",
+							   "query returned %d rows instead of one: %s\n",
+							   ntups),
+					  ntups, query);
+
+	return res;
+}
+
+/*
  * Convenience function to send a query.
  * Monitors result to detect COPY statements
  */
diff --git a/src/bin/pg_dump/pg_backup_db.h b/src/bin/pg_dump/pg_backup_db.h
index 6408f14..527449e 100644
--- a/src/bin/pg_dump/pg_backup_db.h
+++ b/src/bin/pg_dump/pg_backup_db.h
@@ -16,6 +16,7 @@ extern int	ExecuteSqlCommandBuf(Archive *AHX, const char *buf, size_t bufLen);
 extern void ExecuteSqlStatement(Archive *AHX, const char *query);
 extern PGresult *ExecuteSqlQuery(Archive *AHX, const char *query,
 				ExecStatusType status);
+extern PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
 
 extern void EndDBCopyMode(Archive *AHX, const char *tocEntryTag);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 1267afb..a6550ed 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -264,7 +264,6 @@ static bool nonemptyReloptions(const char *reloptions);
 static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 						const char *prefix, Archive *fout);
 static char *get_synchronized_snapshot(Archive *fout);
-static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
 static void setupDumpWorker(Archive *AHX);
 
 
@@ -648,23 +647,11 @@ main(int argc, char **argv)
 		dopt.no_security_labels = 1;
 
 	/*
-	 * When running against 9.0 or later, check if we are in recovery mode,
-	 * which means we are on a hot standby.
+	 * On hot standby slaves, never try to dump unlogged table data, since it
+	 * will just throw an error.
 	 */
-	if (fout->remoteVersion >= 90000)
-	{
-		PGresult   *res = ExecuteSqlQueryForSingleRow(fout, "SELECT pg_catalog.pg_is_in_recovery()");
-
-		if (strcmp(PQgetvalue(res, 0, 0), "t") == 0)
-		{
-			/*
-			 * On hot standby slaves, never try to dump unlogged table data,
-			 * since it will just throw an error.
-			 */
-			dopt.no_unlogged_table_data = true;
-		}
-		PQclear(res);
-	}
+	if (fout->isStandby)
+		dopt.no_unlogged_table_data = true;
 
 	/* Select the appropriate subquery to convert user IDs to names */
 	if (fout->remoteVersion >= 80100)
@@ -1096,7 +1083,16 @@ setup_connection(Archive *AH, const char *dumpencoding,
 	else if (AH->numWorkers > 1 &&
 			 AH->remoteVersion >= 90200 &&
 			 !dopt->no_synchronized_snapshots)
+	{
+		if (AH->isStandby)
+			exit_horribly(NULL,
+			 "Synchronized snapshots are not supported on standby servers.\n"
+						  "Run with --no-synchronized-snapshots instead if you do not need\n"
+						  "synchronized snapshots.\n");
+
+
 		AH->sync_snapshot_id = get_synchronized_snapshot(AH);
+	}
 }
 
 static void
@@ -17845,26 +17841,3 @@ appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 	if (!res)
 		write_msg(NULL, "WARNING: could not parse reloptions array\n");
 }
-
-/*
- * Execute an SQL query and verify that we got exactly one row back.
- */
-static PGresult *
-ExecuteSqlQueryForSingleRow(Archive *fout, char *query)
-{
-	PGresult   *res;
-	int			ntups;
-
-	res = ExecuteSqlQuery(fout, query, PGRES_TUPLES_OK);
-
-	/* Expecting a single result only */
-	ntups = PQntuples(res);
-	if (ntups != 1)
-		exit_horribly(NULL,
-					  ngettext("query returned %d row instead of one: %s\n",
-							   "query returned %d rows instead of one: %s\n",
-							   ntups),
-					  ntups, query);
-
-	return res;
-}
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#8)
Re: pg_dump -j against standbys

Magnus Hagander <magnus@hagander.net> writes:

Here's an updated patch based on this,and the other feedback.

Looks sane in a quick once-over, but I haven't tested it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#9)
Re: pg_dump -j against standbys

On Thu, May 26, 2016 at 5:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

Here's an updated patch based on this,and the other feedback.

Looks sane in a quick once-over, but I haven't tested it.

I've run some tests and it looks good. Will apply. Thanks for the quick
look!

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/