backward incompatible pg_basebackup and pg_receivexlog
pg_basebackup and pg_receivexlog from 9.3 won't work with earlier
servers anymore. I wonder if this has been fully thought through. We
have put in a lot of effort to make client programs compatible with many
server versions as well as keeping the client/server protocol compatible
across many versions. Both of these assumptions are now being broken,
which will result in all kinds of annoyances.
It seems to me that these tools could probably be enhanced to understand
both old and new formats.
Also, using the old tools against new server versions either behaves
funny or silently appears to work, both of which might be a problem.
I think if we are documenting the replication protocol as part of the
frontend/backend protocol and are exposing client tools that use it,
changes need to be done with the same rigor as other protocol changes.
As far as I can tell, there is no separate version number for the
replication part of the protocol, so either there needs to be one or the
protocol as a whole needs to be updated.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 19.03.2013 04:42, Peter Eisentraut wrote:
pg_basebackup and pg_receivexlog from 9.3 won't work with earlier
servers anymore. I wonder if this has been fully thought through. We
have put in a lot of effort to make client programs compatible with many
server versions as well as keeping the client/server protocol compatible
across many versions. Both of these assumptions are now being broken,
which will result in all kinds of annoyances.
It seems to me that these tools could probably be enhanced to understand
both old and new formats.
Yes, this was discussed, and the consensus was to break
backwards-compatibility in 9.3, so that we can clean up the protocol to
be architecture-independent. That makes it easier to write portable
clients, from 9.3 onwards. See the thread ending at
/messages/by-id/4FE2279C.2070506@enterprisedb.com.
Also, using the old tools against new server versions either behaves
funny or silently appears to work, both of which might be a problem.
Hmm, it would be good to fix that. I wonder how, though. The most
straightforward way would be to add an explicit version check in the
clients, in backbranches. That would give a nice error message, but that
would only help with new minor versions.
I think if we are documenting the replication protocol as part of the
frontend/backend protocol and are exposing client tools that use it,
changes need to be done with the same rigor as other protocol changes.
Agreed. The plan is that we're going to be more careful with it from now on.
As far as I can tell, there is no separate version number for the
replication part of the protocol, so either there needs to be one or the
protocol as a whole needs to be updated.
Good point.
I propose that we add a version number, and call the 9.3 version version
2. Let's add a new field to the result set of the IDENTIFY_SYSTEM
command for the replication protocol version number. The version number
should be bumped if the replication protocol is changed in a
non-backwards-compatible way. That includes changes to the messages sent
in the COPY-both mode, after the START_REPLICATION command. If we just
add new commands, there's no need to bump the version; a client can
still check the server version number to determine if a command exists
or not.
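
As a rough illustration of the proposal above, here is a minimal sketch of how a client might act on such a field. The field does not exist yet, so everything here is an assumption: the names REPL_PROTOCOL_VERSION and check_replication_protocol() are made up, the value is assumed to arrive as a decimal string like the other IDENTIFY_SYSTEM columns, and a missing column is assumed to mean the pre-9.3 protocol ("version 1").

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical: the replication protocol version this client speaks. */
#define REPL_PROTOCOL_VERSION 2

/*
 * Decide whether we can stream from a server, given the proposed (not yet
 * existing) protocol version column of IDENTIFY_SYSTEM as text.  A NULL or
 * empty value is taken to mean the column is absent, i.e. an older server,
 * which this sketch treats as version 1.
 *
 * On success or on a clean version mismatch, *server_version is set so the
 * caller can print it.  Returns true only if the versions match.
 */
bool
check_replication_protocol(const char *field, int *server_version)
{
	char	   *end;
	long		v;

	if (field == NULL || *field == '\0')
		v = 1;
	else
	{
		v = strtol(field, &end, 10);
		if (*end != '\0' || v < 1)
			return false;		/* not a number; refuse rather than guess */
	}
	*server_version = (int) v;
	return v == REPL_PROTOCOL_VERSION;
}
```

A caller would run IDENTIFY_SYSTEM, pass the text of the new column (or NULL if the result has too few columns), and print an error naming both versions when the function returns false.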
We could also try to support old client + new server combination to some
extent by future-proofing the protocol a bit. We could specify that the
client should ignore any message types that it does not understand, and
also add a header length field to the WalData message ('w'), so that we
can add new header fields to it that old clients can just ignore. That
way we can keep the protocol version unchanged if we just add some
optional stuff to it. I'm not sure how useful that is in practice
though; it's not unreasonable that you must upgrade to the latest
client, as long as the new client works with old server versions.
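
The header-length idea can be sketched as follows. This is not an existing message format: it assumes a hypothetical 'w' message consisting of the type byte, a 4-byte big-endian header length, the three 8-byte big-endian header fields of the 9.3 format (start position, end position, send time), any newer header fields, and then the WAL payload. The names WalDataHeader, KNOWN_HDR_LEN and parse_waldata() are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Header fields this (older) client knows how to interpret. */
typedef struct
{
	uint64_t	data_start;
	uint64_t	wal_end;
	int64_t		send_time;
} WalDataHeader;

#define KNOWN_HDR_LEN 24		/* three 8-byte fields */

/* Read an 8-byte big-endian integer. */
static uint64_t
read_be64(const unsigned char *p)
{
	uint64_t	v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Parse the hypothetical message described above.  Header bytes beyond
 * KNOWN_HDR_LEN are skipped, which is what lets a newer server add fields
 * without breaking this client.
 *
 * Returns a pointer to the payload and sets *payload_len, or returns NULL
 * if the message is malformed.
 */
const unsigned char *
parse_waldata(const unsigned char *msg, size_t len,
			  WalDataHeader *hdr, size_t *payload_len)
{
	uint32_t	hdrlen;

	if (len < 5 || msg[0] != 'w')
		return NULL;

	hdrlen = ((uint32_t) msg[1] << 24) | ((uint32_t) msg[2] << 16) |
		((uint32_t) msg[3] << 8) | (uint32_t) msg[4];

	/* The header must hold the fields we know and fit in the message. */
	if (hdrlen < KNOWN_HDR_LEN || hdrlen > len - 5)
		return NULL;

	hdr->data_start = read_be64(msg + 5);
	hdr->wal_end = read_be64(msg + 13);
	hdr->send_time = (int64_t) read_be64(msg + 21);

	*payload_len = len - 5 - hdrlen;
	return msg + 5 + hdrlen;
}
```

The client never looks at the bytes between the known fields and the payload, so the server can grow the header without a protocol version bump, at the cost of four extra bytes per message.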
- Heikki
On Tue, Mar 19, 2013 at 11:39 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
On 19.03.2013 04:42, Peter Eisentraut wrote:
pg_basebackup and pg_receivexlog from 9.3 won't work with earlier
servers anymore. I wonder if this has been fully thought through. We
have put in a lot of effort to make client programs compatible with many
server versions as well as keeping the client/server protocol compatible
across many versions. Both of these assumptions are now being broken,
which will result in all kinds of annoyances.
It seems to me that these tools could probably be enhanced to understand
both old and new formats.
Yes, this was discussed, and the consensus was to break
backwards-compatibility in 9.3, so that we can clean up the protocol to be
architecture-independent. That makes it easier to write portable clients,
from 9.3 onwards. See the thread ending at
/messages/by-id/4FE2279C.2070506@enterprisedb.com.
Also, using the old tools against new server versions either behaves
funny or silently appears to work, both of which might be a problem.
Hmm, it would be good to fix that. I wonder how, though. The most
straightforward way would be to add an explicit version check in the
clients, in backbranches. That would give a nice error message, but that
would only help with new minor versions.
Still better to do it in a backbranch, than not at all. At least we
are then nicer to the ones that do keep up with upgrades, which we
recommend they do...
I think if we are documenting the replication protocol as part of the
frontend/backend protocol and are exposing client tools that use it,
changes need to be done with the same rigor as other protocol changes.
Agreed. The plan is that we're going to be more careful with it from now on.
As far as I can tell, there is no separate version number for the
replication part of the protocol, so either there needs to be one or the
protocol as a whole needs to be updated.
Good point.
I propose that we add a version number, and call the 9.3 version version 2.
Let's add a new field to the result set of the IDENTIFY_SYSTEM command for
the replication protocol version number. The version number should be bumped
if the replication protocol is changed in a non-backwards-compatible way.
+1.
That includes changes to the messages sent in the COPY-both mode, after the
START_REPLICATION command. If we just add new commands, there's no need to
bump the version; a client can still check the server version number to
determine if a command exists or not.
Sounds good.
We could also try to support old client + new server combination to some
extent by future-proofing the protocol a bit. We could specify that the
client should ignore any message types that it does not understand, and also
add a header length field to the WalData message ('w'), so that we can add
new header fields to it that old clients can just ignore. That way we can
keep the protocol version unchanged if we just add some optional stuff to
it. I'm not sure how useful that is in practice though; it's not
unreasonable that you must upgrade to the latest client, as long as the new
client works with old server versions.
I think that's quite reasonable, as long as we detect it, and can give
a nice error message telling the user how to deal with it.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 19.03.2013 13:49, Magnus Hagander wrote:
On Tue, Mar 19, 2013 at 11:39 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
On 19.03.2013 04:42, Peter Eisentraut wrote:
Also, using the old tools against new server versions either behaves
funny or silently appears to work, both of which might be a problem.
Hmm, it would be good to fix that. I wonder how, though. The most
straightforward way would be to add an explicit version check in the
clients, in backbranches. That would give a nice error message, but that
would only help with new minor versions.
Still better to do it in a backbranch, than not at all. At least we
are then nicer to the ones that do keep up with upgrades, which we
recommend they do...
Ok, here are patches for 9.1, 9.2 and master, to add explicit version
checks. Each branch has its own quirks. A 9.1 client should still work
with a 9.2 server, because we don't want to break things in a minor
version that used to accidentally work, even if it was never explicitly
supported. In master, I tweaked pg_basebackup so that it still works
with older servers if you don't use the "-X stream" option. The changes
to the streaming protocol only affected "-X stream".
This doesn't yet add the "streaming protocol version number" that was
discussed.
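
For readers skimming the attachments, the check that all three patches add boils down to the comparison below. This is a standalone sketch, not code from the patches: the helper name server_major_supported() is made up, and the patches write the comparison inline using PQserverVersion(conn) and PG_VERSION_NUM.

```c
#include <stdbool.h>

/*
 * PQserverVersion() reports e.g. 90204 for server 9.2.4.  Dividing by 100
 * drops the minor release and leaves the major version as 902, which is
 * the form the min/max bounds use (901 for 9.1, 903 for 9.3).
 */
bool
server_major_supported(int server_version_num, int min_major, int max_major)
{
	int			major = server_version_num / 100;

	return major >= min_major && major <= max_major;
}
```

With bounds of 901 and 902, as in the 9.1 patch, a 9.2.4 server (90204) is accepted and a 9.3 server (90300) is rejected. The streaming check in master uses 903 as the lower bound instead.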
- Heikki
Attachments:
version-checks-91.patch (text/x-diff)
commit aa5d7d58ba40187bd8c6a2216bfd24514da78003
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon Mar 25 11:03:20 2013 +0200
Add a server version check to pg_basebackup and pg_receivexlog.
These programs don't work against 9.0 or earlier servers, so check that when
the connection is made. That's better than a cryptic error message you got
before.
Also, these programs won't work with a 9.3 server, because the WAL streaming
protocol was changed in a non-backwards-compatible way. As a general rule,
we don't make any guarantee that an old client will work with a new server,
so check that. However, allow a 9.1 client to connect to a 9.2 server, to
avoid breaking environments that currently work; a 9.1 client happens to
work with a 9.2 server, even though we didn't make any great effort to
ensure that.
This patch is for the 9.1 and 9.2 branches, I'll commit a similar patch to
master later. Although this isn't a critical bug fix, it seems safe enough
to back-patch. The error message you got when connecting to a 9.3devel
server without this patch was cryptic enough to warrant backpatching.
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 472df3a..d68e742 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -816,6 +816,9 @@ BaseBackup(void)
int i;
char xlogstart[64];
char xlogend[64];
+ int minServerMajor,
+ maxServerMajor;
+ int serverMajor;
/*
* Connect in replication mode to the server
@@ -823,6 +826,24 @@ BaseBackup(void)
conn = GetConnection();
/*
+ * Check server version. BASE_BACKUP command was introduced in 9.1, so
+ * we can't work with servers older than 9.1. We don't officially support
+ * servers newer than the client, but the 9.1 version happens to work with
+ * a 9.2 server. This version check was added to 9.1 branch in a minor
+ * release, so allow connecting to a 9.2 server, to avoid breaking
+ * environments that worked before this version check was added.
+ */
+ minServerMajor = 901;
+ maxServerMajor = 902;
+ serverMajor = PQserverVersion(conn) / 100;
+ if (serverMajor < minServerMajor || serverMajor > maxServerMajor)
+ {
+ fprintf(stderr, _("%s: unsupported server version %s\n"),
+ progname, PQparameterStatus(conn, "server_version"));
+ disconnect_and_exit(1);
+ }
+
+ /*
* Start the actual backup
*/
PQescapeStringConn(conn, escaped_label, label, sizeof(escaped_label), &i);
version-checks-92.patch (text/x-diff)
commit 6980497f7d7f4d17b918a7a433aa904943a4bb97
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon Mar 25 11:02:55 2013 +0200
Add a server version check to pg_basebackup and pg_receivexlog.
These programs don't work against 9.0 or earlier servers, so check that when
the connection is made. That's better than a cryptic error message you got
before.
Also, these programs won't work with a 9.3 server, because the WAL streaming
protocol was changed in a non-backwards-compatible way. As a general rule,
we don't make any guarantee that an old client will work with a new server,
so check that. However, allow a 9.1 client to connect to a 9.2 server, to
avoid breaking environments that currently work; a 9.1 client happens to
work with a 9.2 server, even though we didn't make any great effort to
ensure that.
This patch is for the 9.1 and 9.2 branches, I'll commit a similar patch to
master later. Although this isn't a critical bug fix, it seems safe enough
to back-patch. The error message you got when connecting to a 9.3devel
server without this patch was cryptic enough to warrant backpatching.
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index bddd371..19cc9e8 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -947,6 +947,9 @@ BaseBackup(void)
int i;
char xlogstart[64];
char xlogend[64];
+ int minServerMajor,
+ maxServerMajor;
+ int serverMajor;
/*
* Connect in replication mode to the server
@@ -957,6 +960,21 @@ BaseBackup(void)
exit(1);
/*
+ * Check server version. BASE_BACKUP command was introduced in 9.1, so
+ * we can't work with servers older than 9.1. And we don't support servers
+ * newer than the client.
+ */
+ minServerMajor = 901;
+ maxServerMajor = PG_VERSION_NUM / 100;
+ serverMajor = PQserverVersion(conn) / 100;
+ if (serverMajor < minServerMajor || serverMajor > maxServerMajor)
+ {
+ fprintf(stderr, _("%s: unsupported server version %s\n"),
+ progname, PQparameterStatus(conn, "server_version"));
+ disconnect_and_exit(1);
+ }
+
+ /*
* Run IDENTIFY_SYSTEM so we can get the timeline
*/
res = PQexec(conn, "IDENTIFY_SYSTEM");
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index 4d91add..b7df693 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -220,6 +220,9 @@ StreamLog(void)
PGresult *res;
uint32 timeline;
XLogRecPtr startpos;
+ int minServerMajor,
+ maxServerMajor;
+ int serverMajor;
/*
* Connect in replication mode to the server
@@ -230,6 +233,21 @@ StreamLog(void)
return;
/*
+ * Check server version. IDENTIFY_SYSTEM didn't return the current xlog
+ * position before 9.1, so we can't work with servers older than 9.1. And
+ * we don't support servers newer than the client.
+ */
+ minServerMajor = 901;
+ maxServerMajor = PG_VERSION_NUM / 100;
+ serverMajor = PQserverVersion(conn) / 100;
+ if (serverMajor < minServerMajor || serverMajor > maxServerMajor)
+ {
+ fprintf(stderr, _("%s: unsupported server version %s\n"),
+ progname, PQparameterStatus(conn, "server_version"));
+ disconnect_and_exit(1);
+ }
+
+ /*
* Run IDENTIFY_SYSTEM so we can get the timeline and current xlog
* position.
*/
version-checks-master.patch (text/x-diff)
commit a3b8853d79498d9e0b6b8596abf6650237065a11
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri Mar 22 13:02:59 2013 +0200
Make pg_basebackup work with pre-9.3 servers, and add server version check.
A new 'starttli' field was added to the response of BASE_BACKUP command.
Make pg_basebackup tolerate the case that it's missing, so that it still
works with older servers.
Add an explicit check for the server version, so that you get a nicer error
message if you try to use it with a pre-9.1 server.
The streaming protocol message format changed in 9.3, so -X stream still won't
work with pre-9.3 servers. I added a version check to ReceiveXLogStream()
earlier, but write that slightly differently, so that in 9.4, it will still
work with a 9.3 server. (In 9.4, the error message needs to be adjusted to
"9.3 or above", though). Also, if the version check fails, don't retry.
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 578541a..9fe440a 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -520,6 +520,12 @@ PostgreSQL documentation
for all additional tablespaces must be identical whenever a backup is
restored. The main data directory, however, is relocatable to any location.
</para>
+
+ <para>
+ <application>pg_basebackup</application> works with servers of the same
+ or an older major version, down to 9.1. However, WAL streaming mode (-X
+ stream) only works with server version 9.3.
+ </para>
</refsect1>
<refsect1>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index eacb592..4558506 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1223,12 +1223,16 @@ BaseBackup(void)
{
PGresult *res;
char *sysidentifier;
+ uint32 latesttli;
uint32 starttli;
char current_path[MAXPGPATH];
char escaped_label[MAXPGPATH];
int i;
char xlogstart[64];
char xlogend[64];
+ int minServerMajor,
+ maxServerMajor;
+ int serverMajor;
/*
* Connect in replication mode to the server
@@ -1239,6 +1243,31 @@ BaseBackup(void)
exit(1);
/*
+ * Check server version. BASE_BACKUP command was introduced in 9.1, so
+ * we can't work with servers older than 9.1.
+ */
+ minServerMajor = 901;
+ maxServerMajor = PG_VERSION_NUM / 100;
+ serverMajor = PQserverVersion(conn) / 100;
+ if (serverMajor < minServerMajor || serverMajor > maxServerMajor)
+ {
+ const char *serverver = PQparameterStatus(conn, "server_version");
+ fprintf(stderr, _("%s: incompatible server version %s\n"),
+ progname, serverver ? serverver : "'unknown'");
+ disconnect_and_exit(1);
+ }
+
+ /*
+ * If WAL streaming was requested, also check that the server is new
+ * enough for that.
+ */
+ if (streamwal && !CheckServerVersionForStreaming(conn))
+ {
+ /* Error message already written in CheckServerVersionForStreaming() */
+ disconnect_and_exit(1);
+ }
+
+ /*
* Build contents of recovery.conf if requested
*/
if (writerecoveryconf)
@@ -1262,6 +1291,7 @@ BaseBackup(void)
disconnect_and_exit(1);
}
sysidentifier = pg_strdup(PQgetvalue(res, 0, 0));
+ latesttli = atoi(PQgetvalue(res, 0, 1));
PQclear(res);
/*
@@ -1293,7 +1323,7 @@ BaseBackup(void)
progname, PQerrorMessage(conn));
disconnect_and_exit(1);
}
- if (PQntuples(res) != 1 || PQnfields(res) < 2)
+ if (PQntuples(res) != 1)
{
fprintf(stderr,
_("%s: server returned unexpected response to BASE_BACKUP command; got %d rows and %d fields, expected %d rows and %d fields\n"),
@@ -1302,8 +1332,14 @@ BaseBackup(void)
}
strcpy(xlogstart, PQgetvalue(res, 0, 0));
- starttli = atoi(PQgetvalue(res, 0, 1));
-
+ /*
+ * 9.3 and later sends the TLI of the starting point. With older servers,
+ * assume it's the same as the latest timeline reported by IDENTIFY_SYSTEM.
+ */
+ if (PQnfields(res) >= 2)
+ starttli = atoi(PQgetvalue(res, 0, 1));
+ else
+ starttli = latesttli;
PQclear(res);
MemSet(xlogend, 0, sizeof(xlogend));
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index e68f8ea..34869b8 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -220,6 +220,9 @@ StreamLog(void)
uint32 servertli;
uint32 hi,
lo;
+ int minServerMajor,
+ maxServerMajor;
+ int serverMajor;
/*
* Connect in replication mode to the server
@@ -229,6 +232,16 @@ StreamLog(void)
/* Error message already written in GetConnection() */
return;
+ if (!CheckServerVersionForStreaming(conn))
+ {
+ /*
+ * Error message already written in CheckServerVersionForStreaming().
+ * There's no hope of recovering from a version mismatch, so don't
+ * retry.
+ */
+ disconnect_and_exit(1);
+ }
+
/*
* Run IDENTIFY_SYSTEM so we can get the timeline and current xlog
* position.
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 1f7611f..2bf4df9 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -437,6 +437,40 @@ sendFeedback(PGconn *conn, XLogRecPtr blockpos, int64 now, bool replyRequested)
}
/*
+ * Check that the server version we're connected to is supported by
+ * ReceiveXlogStream().
+ *
+ * If it's not, an error message is printed to stderr, and false is returned.
+ */
+bool
+CheckServerVersionForStreaming(PGconn *conn)
+{
+ int minServerMajor,
+ maxServerMajor;
+ int serverMajor;
+
+ /*
+ * The message format used in streaming replication changed in 9.3, so we
+ * cannot stream from older servers. And we don't support servers newer
+ * than the client; it might work, but we don't know, so err on the safe
+ * side.
+ */
+ minServerMajor = 903;
+ maxServerMajor = PG_VERSION_NUM / 100;
+ serverMajor = PQserverVersion(conn) / 100;
+ if (serverMajor < minServerMajor || serverMajor > maxServerMajor)
+ {
+ const char *serverver = PQparameterStatus(conn, "server_version");
+ fprintf(stderr, _("%s: incompatible server version %s; streaming is only supported with server version %s\n"),
+ progname,
+ serverver ? serverver : "'unknown'",
+ "9.3");
+ return false;
+ }
+ return true;
+}
+
+/*
* Receive a log stream starting at the specified position.
*
* If sysidentifier is specified, validate that both the system
@@ -476,19 +510,11 @@ ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
XLogRecPtr stoppos;
/*
- * The message format used in streaming replication changed in 9.3, so we
- * cannot stream from older servers. Don't know if we would work with
- * newer versions, but let's not take the risk.
+ * The caller should've checked the server version already, but doesn't do
+ * any harm to check it here too.
*/
- if (PQserverVersion(conn) / 100 != PG_VERSION_NUM / 100)
- {
- const char *serverver = PQparameterStatus(conn, "server_version");
- fprintf(stderr, _("%s: incompatible server version %s; streaming is only supported with server version %s\n"),
- progname,
- serverver ? serverver : "'unknown'",
- PG_MAJORVERSION);
+ if (!CheckServerVersionForStreaming(conn))
return false;
- }
if (sysidentifier != NULL)
{
diff --git a/src/bin/pg_basebackup/receivelog.h b/src/bin/pg_basebackup/receivelog.h
index 53f31a7..7c983cd 100644
--- a/src/bin/pg_basebackup/receivelog.h
+++ b/src/bin/pg_basebackup/receivelog.h
@@ -6,6 +6,7 @@
*/
typedef bool (*stream_stop_callback) (XLogRecPtr segendpos, uint32 timeline, bool segment_finished);
+extern bool CheckServerVersionForStreaming(PGconn *conn);
extern bool ReceiveXlogStream(PGconn *conn,
XLogRecPtr startpos,
uint32 timeline,
On 25.03.2013 11:23, Heikki Linnakangas wrote:
On 19.03.2013 13:49, Magnus Hagander wrote:
On Tue, Mar 19, 2013 at 11:39 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
On 19.03.2013 04:42, Peter Eisentraut wrote:
Also, using the old tools against new server versions either behaves
funny or silently appears to work, both of which might be a problem.
Hmm, it would be good to fix that. I wonder how, though. The most
straightforward way would be to add an explicit version check in the
clients, in backbranches. That would give a nice error message, but that
would only help with new minor versions.
Still better to do it in a backbranch, than not at all. At least we
are then nicer to the ones that do keep up with upgrades, which we
recommend they do...
Ok, here are patches for 9.1, 9.2 and master, to add explicit version
checks. Each branch has its own quirks. A 9.1 client should still work
with a 9.2 server, because we don't want to break things in a minor
version that used to accidentally work, even if it was never explicitly
supported. In master, I tweaked pg_basebackup so that it still works
with older servers if you don't use the "-X stream" option. The changes
to the streaming protocol only affected "-X stream".
This doesn't yet add the "streaming protocol version number" that was
discussed.
Committed this. Will work on the additional version number.
- Heikki