Attempting to do a rolling move to 9.2Beta (as a slave) fails

Started by Karl Denninger, over 13 years ago, 5 messages
#1 Karl Denninger
karl@denninger.net

Here's what I'm trying to do in testing 9.2Beta1.

The current configuration is a master and a hot standby at a diverse
location for both hot swap and "online" backup. Both are archived
regularly so if something goes south I can recover (to either as a master.)

I am attempting to validate the path forward to 9.2, and thus tried the
following:

1. Build 9.2Beta1; all fine.

2. Run a pg_basebackup from the current master machine (running 9.1) to
a new directory on the slave machine, using the 9.2Beta1 pg_basebackup
executable.

3. Run a pg_upgrade against that from the new binary directory,
producing a 9.2Beta1 data store.

4. Attempt to start the result as a SLAVE against the existing 9.1 master.

Everything is ok until I try to start the result as a slave. I would
think I should be able to, since this is exactly the procedure (minus
the upgrade) that I used to get the slave in operation in the first
place (although I did the archive/dump/copy to the slave machine
manually rather than use "pg_basebackup" to get it.)

But the last step fails, claiming that "wal_level was set to minimal"
when the WAL records were written. No it wasn't. Not only was it not
on the master where the base backup came from, it wasn't during the
upgrade either nor is it set that way on the new candidate slave.
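One way to double-check the wal_level claim on each machine is to read the setting straight out of postgresql.conf. The following is a minimal Python sketch, not part of the original report: the helper name is hypothetical, and it assumes a single plain config file with no include directives. Running `SHOW wal_level;` against the live server is the more authoritative check.

```python
import re

# Hypothetical helper: return the wal_level that a postgresql.conf text
# would yield. The last assignment wins; if none is present, fall back to
# "minimal", which is the 9.1 default.
def effective_wal_level(conf_text: str) -> str:
    level = "minimal"
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        # The "=" between name and value is optional in postgresql.conf.
        m = re.match(r"wal_level(?:\s*=\s*|\s+)'?(\w+)'?$", line)
        if m:
            level = m.group(1).lower()
    return level
```

Anything other than "minimal" here (for example "archive" or "hot_standby" on 9.1) suggests the error message is not describing the actual configuration.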

Is this caused by the version mismatch? Note that it does NOT bitch
about the versions not matching. For obvious reasons I'm not interested
in rolling the production master up to 9.2 until it's released, but
running a second instance of my HA code against it as a slave would
allow me to perform a very complete set of tests against 9.2Beta1
without any hassle or operational risks, yet keep the full working data
set available and online during the testing.

Do I need to run a complete parallel environment instead of trying to
attach a 9.2Beta1 slave to an existing 9.1 master? (and if so, why
doesn't the code complain about the mismatch instead of the bogus WAL
message?)

--
-- Karl Denninger
/The Market Ticker ®/ <http://market-ticker.org>
Cuda Systems LLC

#2 Jan Nielsen
jan.sture.nielsen@gmail.com
In reply to: Karl Denninger (#1)
Re: Attempting to do a rolling move to 9.2Beta (as a slave) fails

Hi Karl,

On Sun, May 27, 2012 at 9:18 PM, Karl Denninger <karl@denninger.net> wrote:

Here's what I'm trying to do in testing 9.2Beta1.

The current configuration is a master and a hot standby at a diverse
location for both hot swap and "online" backup. Both are archived
regularly so if something goes south I can recover (to either as a master.)

Okay

1. Build 9.2Beta1; all fine.

2. Run a pg_basebackup from the current master machine (running 9.1) to a
new directory on the slave machine, using the 9.2Beta1 pg_basebackup
executable.

3. Run a pg_upgrade against that from the new binary directory, producing
a 9.2Beta1 data store.

4. Attempt to start the result as a SLAVE against the existing 9.1 master.

Hmm - that's likely a problem: "In general, log shipping between servers
running different major PostgreSQL release levels is not possible." [1]

Is this caused by the version mismatch?

Probably.

Do I need to run a complete parallel environment instead of trying to
attach a 9.2Beta1 slave to an existing 9.1 master? (and if so, why doesn't
the code complain about the mismatch instead of the bogus WAL message?)

Slony [2] or PGBouncer+Londiste [3] should allow you to do this in an
integrated fashion. [4]

Cheers,

Jan

[1]: http://www.postgresql.org/docs/current/static/warm-standby.html
[2]: http://slony.info/
[3]: http://itand.me/zero-downtime-upgrades-of-postgresql-with-pgb
[4]: http://www.postgresql.org/docs/current/static/different-replication-solutions.html

#3 Karl Denninger
karl@denninger.net
In reply to: Jan Nielsen (#2)
Re: Attempting to do a rolling move to 9.2Beta (as a slave) fails

On 5/27/2012 11:08 PM, Jan Nielsen wrote:

Hi Karl,

On Sun, May 27, 2012 at 9:18 PM, Karl Denninger <karl@denninger.net> wrote:

Here's what I'm trying to do in testing 9.2Beta1.

The current configuration is a master and a hot standby at a
diverse location for both hot swap and "online" backup. Both are
archived regularly so if something goes south I can recover (to
either as a master.)

Okay

1. Build 9.2Beta1; all fine.

2. Run a pg_basebackup from the current master machine (running
9.1) to a new directory on the slave machine, using the 9.2Beta1
pg_basebackup executable.

3. Run a pg_upgrade against that from the new binary directory,
producing a 9.2Beta1 data store.

4. Attempt to start the result as a SLAVE against the existing 9.1
master.

Hmm - that's likely a problem: "In general, log shipping between
servers running different major PostgreSQL release levels is not
possible." [1]

Is this caused by the version mismatch?

Probably.

Then the error message is wrong :-)

Do I need to run a complete parallel environment instead of trying
to attach a 9.2Beta1 slave to an existing 9.1 master? (and if so,
why doesn't the code complain about the mismatch instead of the
bogus WAL message?)

Slony [2] or PGBouncer+Londiste [3] should allow you to do this in an
integrated fashion. [4]

I ran Slony for quite a while before 9.x showed up; I could put it back
into use for a while but I really like the integrated setup that exists
now with 9.x.

I'll look at doing a parallel setup, but it will be more limited in what
I can actually validate against in terms of workload than the above
would have been...

--
-- Karl Denninger
/The Market Ticker ®/ <http://market-ticker.org>
Cuda Systems LLC

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Karl Denninger (#1)
Re: Attempting to do a rolling move to 9.2Beta (as a slave) fails

Karl Denninger <karl@denninger.net> writes:

I am attempting to validate the path forward to 9.2, and thus tried the
following:

1. Build 9.2Beta1; all fine.

2. Run a pg_basebackup from the current master machine (running 9.1) to
a new directory on the slave machine, using the 9.2Beta1 pg_basebackup
executable.

3. Run a pg_upgrade against that from the new binary directory,
producing a 9.2Beta1 data store.

I do not think this can work, unless pg_basebackup is more magic than I
think it is. AFAIK, what you have after step 2 is a non-self-consistent
data directory that needs to be fixed by WAL replay before it is
consistent. And pg_upgrade needs a consistent starting point.

4. Attempt to start the result as a SLAVE against the existing 9.1 master.

This is definitely not going to work. You can only log-ship between
servers of the same major version.
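The same-major-version rule can be checked before a standby is ever attached. Below is a minimal Python sketch of such a pre-flight check; it is illustrative only, the function names are hypothetical, and it is not how the server itself performs the comparison. It relies on the numbering convention that the major release is the first two numbers before PostgreSQL 10 (9.1.x, 9.2.x) and the first number alone from 10 onward.

```python
import re

# Hypothetical helper: extract the major release from a version string
# such as "9.1.3", "9.2beta1", or "10.4".
def major_version(version: str) -> tuple:
    m = re.match(r"(\d+)(?:\.(\d+))?", version)
    if m is None:
        raise ValueError("unrecognized version string: %r" % version)
    first = int(m.group(1))
    if first >= 10 or m.group(2) is None:
        return (first,)
    return (first, int(m.group(2)))

# Hypothetical helper: log shipping is only possible when both servers
# report the same major release.
def can_log_ship(master: str, standby: str) -> bool:
    return major_version(master) == major_version(standby)
```

With the versions in this thread, `can_log_ship("9.1.3", "9.2beta1")` is false (the 9.1 minor number is a placeholder), which is the condition the startup check would ideally report instead of the wal_level message.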

But the last step fails, claiming that "wal_level was set to minimal"
when the WAL records were written. No it wasn't. Not only was it not
on the master where the base backup came from, it wasn't during the
upgrade either nor is it set that way on the new candidate slave.
Is this caused by the version mismatch? Note that it does NOT bitch
about the versions not matching.

That sounds like a bug, or poorly sequenced error checks.

regards, tom lane

#5 Karl Denninger
karl@denninger.net
In reply to: Tom Lane (#4)
Re: Attempting to do a rolling move to 9.2Beta (as a slave) fails

On 5/28/2012 11:44 AM, Tom Lane wrote:

Karl Denninger <karl@denninger.net> writes:

I am attempting to validate the path forward to 9.2, and thus tried the
following:
1. Build 9.2Beta1; all fine.
2. Run a pg_basebackup from the current master machine (running 9.1) to
a new directory on the slave machine, using the 9.2Beta1 pg_basebackup
executable.
3. Run a pg_upgrade against that from the new binary directory,
producing a 9.2Beta1 data store.

I do not think this can work, unless pg_basebackup is more magic than I
think it is. AFAIK, what you have after step 2 is a non-self-consistent
data directory that needs to be fixed by WAL replay before it is
consistent. And pg_upgrade needs a consistent starting point.

Actually, when pg_upgrade starts, it runs the old binary against the old
data directory first, and thus replays the WAL records until it reaches
consistency before it does the upgrade. It *does* work; you have to
specify that you want the WAL records during the pg_basebackup (e.g.
"-X stream") so you have the WAL files for the old binary to consider
during the startup (or manually ship them after the backup completes.)
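The two steps described above can be sketched as command lines. This is a rough Python sketch under stated assumptions: the host name and every path are hypothetical placeholders, the WAL option is written in its `-X stream` form, and the new data directory is assumed to have been created already with the 9.2 initdb. The helpers only build the argument lists; nothing is executed.

```python
# Hypothetical helper: base backup that also streams the WAL needed to
# bring the copy to a consistent state.
def basebackup_cmd(master_host: str, target_dir: str) -> list:
    return ["pg_basebackup", "-h", master_host, "-D", target_dir,
            "-X", "stream"]

# Hypothetical helper: pg_upgrade from the old binaries/data (-b, -d)
# to the new binaries/data (-B, -D).
def upgrade_cmd(old_bin: str, new_bin: str,
                old_data: str, new_data: str) -> list:
    return ["pg_upgrade", "-b", old_bin, "-B", new_bin,
            "-d", old_data, "-D", new_data]

# Placeholder values, not taken from the original setup.
backup = basebackup_cmd("master.example.com", "/data/pg91-copy")
upgrade = upgrade_cmd("/usr/local/pg91/bin", "/usr/local/pg92/bin",
                      "/data/pg91-copy", "/data/pg92")
```

Each list could be passed to `subprocess.run(cmd, check=True)` on the standby machine. The result is a standalone 9.2 cluster for testing; as noted in this thread, it cannot be attached to the 9.1 master as a standby.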

4. Attempt to start the result as a SLAVE against the existing 9.1 master.

This is definitely not going to work. You can only log-ship between
servers of the same major version.

OK.

But the last step fails, claiming that "wal_level was set to minimal"
when the WAL records were written. No it wasn't. Not only was it not
on the master where the base backup came from, it wasn't during the
upgrade either nor is it set that way on the new candidate slave.
Is this caused by the version mismatch? Note that it does NOT bitch
about the versions not matching.

That sounds like a bug, or poorly sequenced error checks.

regards, tom lane

Well, at least I know why it fails and that it's a bad error message
(and can't work) rather than something stupid in the original setup
(which looked ok.)

--
-- Karl Denninger
/The Market Ticker ®/ <http://market-ticker.org>
Cuda Systems LLC