Using pg_upgrade on log-shipping standby servers
I occasionally get questions about how to run pg_upgrade on log-shipping
standby servers. The attached documentation patch outlines how to do
it.
I don't think we can assume that because pg_upgrade was run on the
master and standby that they are binary identical, can we? Technically
the user file are identical, but the system catalogs and WAL might be
different, hence my suggestion to run rsync before allowing the standby
to rejoin the primary.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Attachments:
pg_upgrade.difftext/x-diff; charset=us-asciiDownload+7-0
Bruce Momjian <bruce@momjian.us> writes:
+ While a Log-Shipping Standby Server (<xref linkend="warm-standby">) can + be upgraded, the server must be in changed to a primary server to allow + writes, and after the upgrade it cannot be reused as a standby server. + (Running <command>rsync</> after the upgrade allows reuse.)
"in changed"? This sentence makes no sense at all to me.
regards, tom lane
On Tue, Jul 10, 2012 at 12:04:50PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
+ While a Log-Shipping Standby Server (<xref linkend="warm-standby">) can + be upgraded, the server must be in changed to a primary server to allow + writes, and after the upgrade it cannot be reused as a standby server. + (Running <command>rsync</> after the upgrade allows reuse.)"in changed"? This sentence makes no sense at all to me.
Oops. New wording attached with "in" removed:
the server must be changed to a primary server
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Attachments:
pg_upgrade.difftext/x-diff; charset=us-asciiDownload+7-0
On Tue, Jul 10, 2012 at 6:17 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 12:04:50PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
+ While a Log-Shipping Standby Server (<xref linkend="warm-standby">) can + be upgraded, the server must be in changed to a primary server to allow + writes, and after the upgrade it cannot be reused as a standby server. + (Running <command>rsync</> after the upgrade allows reuse.)"in changed"? This sentence makes no sense at all to me.
Oops. New wording attached with "in" removed:
the server must be changed to a primary server
Don't we normally talk about "must be promoted to a primary server",
not changed?
And wouldn't it be good if it also mentions that another good option
is to just pg_upgrade the master and rebuild the standby? (Unless
that's already mentioned somewhere else).
What's the actual usecase for promoting the slave, upgrading it and
then *not* using it, which is what I think this paragraph suggests?
And I think the sentence about running rsync is extremely vague - run
rsync where and how? What are you actually trying to suggest people
do?
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jul 10, 2012 at 06:21:35PM +0200, Magnus Hagander wrote:
On Tue, Jul 10, 2012 at 6:17 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 12:04:50PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
+ While a Log-Shipping Standby Server (<xref linkend="warm-standby">) can + be upgraded, the server must be in changed to a primary server to allow + writes, and after the upgrade it cannot be reused as a standby server. + (Running <command>rsync</> after the upgrade allows reuse.)"in changed"? This sentence makes no sense at all to me.
Oops. New wording attached with "in" removed:
the server must be changed to a primary server
Don't we normally talk about "must be promoted to a primary server",
not changed?
OK, sure, updated patch attached.
And wouldn't it be good if it also mentions that another good option
is to just pg_upgrade the master and rebuild the standby? (Unless
that's already mentioned somewhere else).
I assume they already realize they re-create the standbys.
What's the actual usecase for promoting the slave, upgrading it and
then *not* using it, which is what I think this paragraph suggests?
Testing maybe? I feel we have just avoided saying what you can and
can't do with the standbys and pg_upgrade, so I think we have to state
something. If we just want to say "recreate", let's say that.
And I think the sentence about running rsync is extremely vague - run
rsync where and how? What are you actually trying to suggest people
do?
Updated docs attached.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Attachments:
pg_upgrade.difftext/x-diff; charset=us-asciiDownload+8-0
On Tue, Jul 10, 2012 at 6:26 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 06:21:35PM +0200, Magnus Hagander wrote:
On Tue, Jul 10, 2012 at 6:17 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 12:04:50PM -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
+ While a Log-Shipping Standby Server (<xref linkend="warm-standby">) can + be upgraded, the server must be in changed to a primary server to allow + writes, and after the upgrade it cannot be reused as a standby server. + (Running <command>rsync</> after the upgrade allows reuse.)"in changed"? This sentence makes no sense at all to me.
Oops. New wording attached with "in" removed:
the server must be changed to a primary server
Don't we normally talk about "must be promoted to a primary server",
not changed?OK, sure, updated patch attached.
And wouldn't it be good if it also mentions that another good option
is to just pg_upgrade the master and rebuild the standby? (Unless
that's already mentioned somewhere else).I assume they already realize they re-create the standbys.
What's the actual usecase for promoting the slave, upgrading it and
then *not* using it, which is what I think this paragraph suggests?Testing maybe? I feel we have just avoided saying what you can and
can't do with the standbys and pg_upgrade, so I think we have to state
something. If we just want to say "recreate", let's say that.
Well, the bottom line is we can'd do *anything* with a standby with pg_upgrade.
Once you've promoted it, it is no longer a standby, and now you can
use pg_upgrade.
And I think the sentence about running rsync is extremely vague - run
rsync where and how? What are you actually trying to suggest people
do?Updated docs attached.
I suggest just removing the rsync part completely. You're basically
saying "you ca nset up a new standby after you're done", which is kind
of obvious anyway. And if you're going to use rsync fromthe master to
make a new standby, there's no point in running pg_upgrade on the new
standby in the first place.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jul 10, 2012 at 06:29:24PM +0200, Magnus Hagander wrote:
Testing maybe? I feel we have just avoided saying what you can and
can't do with the standbys and pg_upgrade, so I think we have to state
something. If we just want to say "recreate", let's say that.Well, the bottom line is we can'd do *anything* with a standby with pg_upgrade.
Once you've promoted it, it is no longer a standby, and now you can
use pg_upgrade.And I think the sentence about running rsync is extremely vague - run
rsync where and how? What are you actually trying to suggest people
do?Updated docs attached.
I suggest just removing the rsync part completely. You're basically
saying "you ca nset up a new standby after you're done", which is kind
of obvious anyway. And if you're going to use rsync fromthe master to
make a new standby, there's no point in running pg_upgrade on the new
standby in the first place.
I went the other direction and just said you can't upgrade a standby (as
a standby), and to just use rsync --- patch attached.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Attachments:
pg_upgrade.difftext/x-diff; charset=us-asciiDownload+6-0
On Tue, Jul 10, 2012 at 6:59 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 06:29:24PM +0200, Magnus Hagander wrote:
Testing maybe? I feel we have just avoided saying what you can and
can't do with the standbys and pg_upgrade, so I think we have to state
something. If we just want to say "recreate", let's say that.Well, the bottom line is we can'd do *anything* with a standby with pg_upgrade.
Once you've promoted it, it is no longer a standby, and now you can
use pg_upgrade.And I think the sentence about running rsync is extremely vague - run
rsync where and how? What are you actually trying to suggest people
do?Updated docs attached.
I suggest just removing the rsync part completely. You're basically
saying "you ca nset up a new standby after you're done", which is kind
of obvious anyway. And if you're going to use rsync fromthe master to
make a new standby, there's no point in running pg_upgrade on the new
standby in the first place.I went the other direction and just said you can't upgrade a standby (as
a standby), and to just use rsync --- patch attached.
Reads much better now. I'd say "use rsync to rebuild the standbys",
but that's more nitpicking :) (And maybe "the simplest way" rather
than "the simplest case"? But i'll leave that to someone who has
english as their first language)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jul 10, 2012 at 07:06:39PM +0200, Magnus Hagander wrote:
rsync where and how? What are you actually trying to suggest people
do?Updated docs attached.
I suggest just removing the rsync part completely. You're basically
saying "you ca nset up a new standby after you're done", which is kind
of obvious anyway. And if you're going to use rsync fromthe master to
make a new standby, there's no point in running pg_upgrade on the new
standby in the first place.I went the other direction and just said you can't upgrade a standby (as
a standby), and to just use rsync --- patch attached.Reads much better now. I'd say "use rsync to rebuild the standbys",
but that's more nitpicking :) (And maybe "the simplest way" rather
than "the simplest case"? But i'll leave that to someone who has
english as their first language)
Both change made; updated patch attached.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Attachments:
pg_upgrade.difftext/x-diff; charset=us-asciiDownload+6-0
On Tue, Jul 10, 2012 at 7:27 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 07:06:39PM +0200, Magnus Hagander wrote:
rsync where and how? What are you actually trying to suggest people
do?Updated docs attached.
I suggest just removing the rsync part completely. You're basically
saying "you ca nset up a new standby after you're done", which is kind
of obvious anyway. And if you're going to use rsync fromthe master to
make a new standby, there's no point in running pg_upgrade on the new
standby in the first place.I went the other direction and just said you can't upgrade a standby (as
a standby), and to just use rsync --- patch attached.Reads much better now. I'd say "use rsync to rebuild the standbys",
but that's more nitpicking :) (And maybe "the simplest way" rather
than "the simplest case"? But i'll leave that to someone who has
english as their first language)Both change made; updated patch attached.
Looks good to me.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jul 10, 2012 at 09:10:25PM +0200, Magnus Hagander wrote:
On Tue, Jul 10, 2012 at 7:27 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 10, 2012 at 07:06:39PM +0200, Magnus Hagander wrote:
rsync where and how? What are you actually trying to suggest people
do?Updated docs attached.
I suggest just removing the rsync part completely. You're basically
saying "you ca nset up a new standby after you're done", which is kind
of obvious anyway. And if you're going to use rsync fromthe master to
make a new standby, there's no point in running pg_upgrade on the new
standby in the first place.I went the other direction and just said you can't upgrade a standby (as
a standby), and to just use rsync --- patch attached.Reads much better now. I'd say "use rsync to rebuild the standbys",
but that's more nitpicking :) (And maybe "the simplest way" rather
than "the simplest case"? But i'll leave that to someone who has
english as their first language)Both change made; updated patch attached.
Looks good to me.
OK, applied and backpatched to 9.2.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Tue, 2012-07-10 at 11:50 -0400, Bruce Momjian wrote:
I don't think we can assume that because pg_upgrade was run on the
master and standby that they are binary identical, can we? Technically
the user file are identical, but the system catalogs and WAL might be
different, hence my suggestion to run rsync before allowing the standby
to rejoin the primary.
Do you have plans to change that in the future?
If we know that the user data files are identical between primary and
replica, it would be nice if we could provide a robust way to avoid
copying them.
Regards,
Jeff Davis
On Mon, Jul 16, 2012 at 5:29 PM, Jeff Davis <pgsql@j-davis.com> wrote:
On Tue, 2012-07-10 at 11:50 -0400, Bruce Momjian wrote:
I don't think we can assume that because pg_upgrade was run on the
master and standby that they are binary identical, can we? Technically
the user file are identical, but the system catalogs and WAL might be
different, hence my suggestion to run rsync before allowing the standby
to rejoin the primary.Do you have plans to change that in the future?
If we know that the user data files are identical between primary and
replica, it would be nice if we could provide a robust way to avoid
copying them.
How about this alternative that may sound crazy, but would lend itself
to some unification in archiving:
Could pg_upgrade emit WAL segment(s) to provide continuity of a
timeline? So something like:
* Take down the writable primary for pg_upgrade
* Some WAL is emitted and possibly archived
* The old version, when reaching the special pg_upgrade WAL, could
exit or report its situation having paused replay (as clearly, it
cannot proceed). Unsure.
* Start up a new version of postgres on the same cluster at that
point, which plays the upgrade-WAL.
I see this being pretty mechanically intensive, but right now my hands
are completely tied as to achieving total continuity of my archives,
costing a base-backup's worth of risk window upon upgrade.
--
fdr
On Tue, 2012-07-17 at 01:02 -0700, Daniel Farina wrote:
Could pg_upgrade emit WAL segment(s) to provide continuity of a
timeline? So something like:
By "segments" did you mean "records"?
* Take down the writable primary for pg_upgrade
* Some WAL is emitted and possibly archived
* The old version, when reaching the special pg_upgrade WAL, could
exit or report its situation having paused replay (as clearly, it
cannot proceed). Unsure.
I don't really understand this step.
* Start up a new version of postgres on the same cluster at that
point, which plays the upgrade-WAL.I see this being pretty mechanically intensive, but right now my hands
are completely tied as to achieving total continuity of my archives,
costing a base-backup's worth of risk window upon upgrade.
Does "continuity of archives" mean "avoid downtime" or "maintain a
single WAL sequence".
Regards,
Jeff Davis
On Mon, Jul 16, 2012 at 05:29:26PM -0700, Jeff Davis wrote:
On Tue, 2012-07-10 at 11:50 -0400, Bruce Momjian wrote:
I don't think we can assume that because pg_upgrade was run on the
master and standby that they are binary identical, can we? Technically
the user file are identical, but the system catalogs and WAL might be
different, hence my suggestion to run rsync before allowing the standby
to rejoin the primary.Do you have plans to change that in the future?
If we know that the user data files are identical between primary and
replica, it would be nice if we could provide a robust way to avoid
copying them.
Well, rsync --checksum would work, but both systems have to be down for
that. You could snapshot the down primary and rsync --checksum that
against the standby, but I am not sure how much that helps us.
I can't figure out how to make this work better without adding a whole
lot more code to pg_upgrade that might need adjustment for every minor
release, i.e. pg_upgrade knows nothing about the WAL file format, and I
want to keep it that way.
However, I have two ideas. First, I don't know _why_ the
primary/standby would be any different after pg_upgrade, so I added the
documentation mention because I couldn't _guarantee_ they were the same.
Actually, if people can test this, we might be able to say this is safe.
Second, the user files (large) are certainly identical, it is only the
system tables (small) that _might_ be different, so rsync'ing just those
would add the guarantee, but I know of no easy way to rsync just the
system tables.
Does that help?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Tue, Jul 17, 2012 at 11:55 AM, Jeff Davis <pgsql@j-davis.com> wrote:
On Tue, 2012-07-17 at 01:02 -0700, Daniel Farina wrote:
Could pg_upgrade emit WAL segment(s) to provide continuity of a
timeline? So something like:By "segments" did you mean "records"?
Yes. It would be nicer not to have to tie it to the WAL segment file size.
* Take down the writable primary for pg_upgrade
* Some WAL is emitted and possibly archived
* The old version, when reaching the special pg_upgrade WAL, could
exit or report its situation having paused replay (as clearly, it
cannot proceed). Unsure.I don't really understand this step.
"Some WAL is emitted and possibly archived" needs a subject in that fragment:
"pg_upgrade somehow (directly, or indirectly) emits and/or archives
WAL used to complete binary-upgrade". That means that it should
appear in the WAL stream and be subject to archive_command, like any
other WAL.
The sticky part is what the standby should do when it encounters the
special wal-upgrade records. It should probably pause replay to allow
some other program to stop the old postgres version and start the new
version with the same cluster.
* Start up a new version of postgres on the same cluster at that
point, which plays the upgrade-WAL.I see this being pretty mechanically intensive, but right now my hands
are completely tied as to achieving total continuity of my archives,
costing a base-backup's worth of risk window upon upgrade.Does "continuity of archives" mean "avoid downtime" or "maintain a
single WAL sequence".
The latter.
--
fdr
On Tue, Jul 17, 2012 at 04:49:39PM -0700, Daniel Farina wrote:
On Tue, Jul 17, 2012 at 11:55 AM, Jeff Davis <pgsql@j-davis.com> wrote:
On Tue, 2012-07-17 at 01:02 -0700, Daniel Farina wrote:
Could pg_upgrade emit WAL segment(s) to provide continuity of a
timeline? So something like:By "segments" did you mean "records"?
Yes. It would be nicer not to have to tie it to the WAL segment file size.
* Take down the writable primary for pg_upgrade
* Some WAL is emitted and possibly archived
* The old version, when reaching the special pg_upgrade WAL, could
exit or report its situation having paused replay (as clearly, it
cannot proceed). Unsure.I don't really understand this step.
"Some WAL is emitted and possibly archived" needs a subject in that fragment:
"pg_upgrade somehow (directly, or indirectly) emits and/or archives
WAL used to complete binary-upgrade". That means that it should
appear in the WAL stream and be subject to archive_command, like any
other WAL.The sticky part is what the standby should do when it encounters the
special wal-upgrade records. It should probably pause replay to allow
some other program to stop the old postgres version and start the new
version with the same cluster.
WAL is not guaranteed to be the same between PG major versions, so doing
anything with WAL is pretty much a no-go.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Tue, Jul 17, 2012 at 9:16 PM, Bruce Momjian <bruce@momjian.us> wrote:
WAL is not guaranteed to be the same between PG major versions, so doing
anything with WAL is pretty much a no-go.
I understand that the WAL format changes, sometimes dramatically
between versions. What I'm suggesting that the first WAL-record
emitted by the binary upgrade process could be entitled "WAL-stream
upgrade to 9.4" that would fail to be understood by old versions or
possibly understood to mean "stop replay, you won't even understand
what's about to be said."
At that point, start up new version in the same cluster and have it
continue replay from that position on forward, which should all be in
the new format that it can understand. It need not understand the old
format in that case, but the tricky part is this single record that
tells the replayer of the old version to stop while a replayer of the
new version somehow will know it is the right place to start.
One mechanism could be a WAL file segment boundary: the standby could
be told to exit when it finishes recovery of the segment
0000000100001234000055CD, and to start the new version beginning
recovery at 0000000100001234000055CF (one higher), and that would be
the first WAL emitted by pg_upgrade. In principle the same is possible
using the fine-grained record position, such as XXXXX/NN, but may be
more complex for not much gain.
This also means the database would be stuck in an inconsistent state
when it starts, not unlike when recovering from a on-line base backup.
And that's totally reasonable: the new version has to start up
presuming that the database cluster makes not enough sense to enter
hot standby yet.
Yet another mechanism is to not have the Postgres recovery-process
apply the WAL, but rather some special purpose program that knows how
to count through and apply specially-formatted WAL segments, and then
set the resultant cluster to start recovering from the WAL past this
span of specially-formatted WAL. The crux is to get some continuity
in this stream, and there are many ways to slice it. Otherwise, the
continuous archives will have a gap while a new base backup is taken
of data that mostly rests unchanged.
--
fdr
On Tue, Jul 17, 2012 at 6:02 PM, Bruce Momjian <bruce@momjian.us> wrote:
However, I have two ideas. First, I don't know _why_ the
primary/standby would be any different after pg_upgrade, so I added the
documentation mention because I couldn't _guarantee_ they were the same.
Actually, if people can test this, we might be able to say this is safe.Second, the user files (large) are certainly identical, it is only the
system tables (small) that _might_ be different, so rsync'ing just those
would add the guarantee, but I know of no easy way to rsync just the
system tables.
I'm scratching my head in confusion here. After pg_upgrade, the
master is a completely new cluster. The system catalog contents are
completely different, and so are things like the database system
identifier and the WAL position - yeah, the latter is approximately
the same, but almost doesn't count except in horseshoes. Obviously
any attempt to replay WAL from the new cluster on the old cluster is
doomed to failure, at least unless we do a bunch more engineering here
that hasn't really been thought about yet.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jul 18, 2012 at 09:36:51AM -0400, Robert Haas wrote:
On Tue, Jul 17, 2012 at 6:02 PM, Bruce Momjian <bruce@momjian.us> wrote:
However, I have two ideas. First, I don't know _why_ the
primary/standby would be any different after pg_upgrade, so I added the
documentation mention because I couldn't _guarantee_ they were the same.
Actually, if people can test this, we might be able to say this is safe.Second, the user files (large) are certainly identical, it is only the
system tables (small) that _might_ be different, so rsync'ing just those
would add the guarantee, but I know of no easy way to rsync just the
system tables.I'm scratching my head in confusion here. After pg_upgrade, the
master is a completely new cluster. The system catalog contents are
completely different, and so are things like the database system
identifier and the WAL position - yeah, the latter is approximately
the same, but almost doesn't count except in horseshoes. Obviously
any attempt to replay WAL from the new cluster on the old cluster is
doomed to failure, at least unless we do a bunch more engineering here
that hasn't really been thought about yet.
No, the point is they run pg_upgrade on the stopped primary and stopped
standbys. Are those the same? I am not really sure.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +