Warm Standby Weirdness

Started by Sam Nelson, over 15 years ago, 5 messages, general
#1Sam Nelson
samn@consistentstate.com

Let me preface this by saying that I've set up warm standby instances quite
a few times. I think I sort of hopefully know what I'm doing.
pg_start_backup('stuff'), tar data directory, pg_stop_backup(), copy data
directory to warm standby server, extract in data directory, etc.

We have two CentOS 5 boxes that we're trying to set up as a master -> warm
standby. Both have postgres 8.4.4 installed from source. The master's
postgres instance has been there for a while (a couple of months or
something).

I am very, very sorry if I'm missing something really simple, but I just
can't seem to figure out what I'm doing wrong. Here's the process I'm
following:

==master==
$ psql
postgres=# select pg_start_backup('<today's date>');
postgres=# \q
$ cd /path/to/data/directory
$ tar cvzf data.tar.gz *
$ scp data.tar.gz <server>:~/

==warm standby==
$ cd /path/to/data/directory
$ tar xvf ~/data.tar.gz
<create recovery.conf file with restore_command line and modify
postgresql.conf to disable wal archiving>
$ pg_ctl -D /path/to/data/directory start

Here's the output after trying to start the backup instance with pg_ctl
(ignoring the line about postmaster.pid already existing):

server starting
FATAL: incorrect checksum in control file

Here's the output from pg_controldata:

$ pg_controldata `pwd`
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

pg_control version number: 843
Catalog version number: 200904091
Database system identifier: 5473004134245625319
Database cluster state: in production
pg_control last modified: Wed 31 Dec 1969 05:00:00 PM MST
Latest checkpoint location: B000020/0
Prior checkpoint location: A000020/0
Latest checkpoint's REDO location: B000020/1
Latest checkpoint's TimeLineID: 0
Latest checkpoint's NextXID: 57905/32791
Latest checkpoint's NextOID: 1
Latest checkpoint's NextMultiXactId: 0
Latest checkpoint's NextMultiOffset: 1282256808
Time of latest checkpoint: Wed 31 Dec 1969 05:00:00 PM MST
Minimum recovery ending location: 0/4
Maximum data alignment: 0
Database block size: 8192
Blocks per segment of large relation: 16777216
WAL block size: 64
Bytes per WAL segment: 32
Maximum length of identifiers: 2000
Maximum columns in an index: 257
Maximum size of a TOAST chunk: 513657607
Date/time type storage: floating-point numbers
Float4 argument passing: by reference
Float8 argument passing: by reference

Those timestamps are at the unix epoch - Jan 1 1970 ... what in the spinning
marble?!

Yeah. I'm confused, my boss is confused... We're currently running a yum
-y update on those boxes, but we'd still like to know what's going on, even
if a full update fixes everything. Any clues?

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sam Nelson (#1)
Re: Warm Standby Weirdness

Sam Nelson <samn@consistentstate.com> writes:

Here's the output from pg_controldata:

$ pg_controldata `pwd`
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

pg_control version number: 843
Catalog version number: 200904091
Database system identifier: 5473004134245625319
Database cluster state: in production
pg_control last modified: Wed 31 Dec 1969 05:00:00 PM MST
Latest checkpoint location: B000020/0
Prior checkpoint location: A000020/0
Latest checkpoint's REDO location: B000020/1

This is just an educated guess, but I'm going to bet on 32-bit vs 64-bit.
Are you trying to copy the DB to a machine with different word size?
Won't work.

regards, tom lane
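
To illustrate why a word-size mismatch can produce output like this, here is a minimal Python sketch. The five-field header is a simplified, hypothetical stand-in for the start of a control file, not the real pg_control format. It assumes the common x86 convention that a 64-bit integer is aligned to 4 bytes in a 32-bit build and to 8 bytes in a 64-bit build. The names LAYOUT_32, LAYOUT_64, write_as_32bit and read_as_64bit, and the state and timestamp values, are made up for illustration.

```python
import struct

# Hypothetical, simplified header (NOT the real pg_control format):
# system id (64-bit), version, catalog version, state (32-bit each),
# then a 64-bit "last modified" timestamp.
LAYOUT_32 = "<QIIIq"    # assumed 32-bit build: timestamp starts at byte 20
LAYOUT_64 = "<QIII4xq"  # assumed 64-bit build: 4 padding bytes, timestamp at byte 24


def write_as_32bit(sysid, version, catalog, state, modified):
    """Serialize the header the way the assumed 32-bit build lays it out."""
    return struct.pack(LAYOUT_32, sysid, version, catalog, state, modified)


def read_as_64bit(raw):
    """Parse the same bytes using the assumed 64-bit layout."""
    size = struct.calcsize(LAYOUT_64)
    # Zero-fill so the longer 64-bit layout has enough bytes to read.
    return struct.unpack(LAYOUT_64, raw.ljust(size, b"\0")[:size])


# State code 1 and the timestamp are arbitrary illustrative values.
raw = write_as_32bit(5473004134245625319, 843, 200904091, 1, 1282256808)
sysid, version, catalog, state, modified = read_as_64bit(raw)
print(version, catalog, modified)
```

In this sketch the fields before the padding come back intact, while the timestamp is read four bytes too late and comes out as 0, which is the Unix epoch. Every later field would be shifted too, so a checksum computed over the reader's idea of the structure would not match the stored one.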

#3Sam Nelson
samn@consistentstate.com
In reply to: Tom Lane (#2)
Re: Warm Standby Weirdness

Wow. I must be blind. Or brain dead.

You're right. That was the issue.

-Sam

On Thu, Aug 19, 2010 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Sam Nelson <samn@consistentstate.com> writes:

Here's the output from pg_controldata:

$ pg_controldata `pwd`
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

pg_control version number: 843
Catalog version number: 200904091
Database system identifier: 5473004134245625319
Database cluster state: in production
pg_control last modified: Wed 31 Dec 1969 05:00:00 PM MST
Latest checkpoint location: B000020/0
Prior checkpoint location: A000020/0
Latest checkpoint's REDO location: B000020/1

This is just an educated guess, but I'm going to bet on 32-bit vs 64-bit.
Are you trying to copy the DB to a machine with different word size?
Won't work.

regards, tom lane

#4Yaroslav Tykhiy
yar@barnet.com.au
In reply to: Tom Lane (#2)
Re: Warm Standby Weirdness

On Thu, Aug 19, 2010 at 11:22:15PM -0400, Tom Lane wrote:

Sam Nelson <samn@consistentstate.com> writes:

Here's the output from pg_controldata:

$ pg_controldata `pwd`
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

[...]

This is just an educated guess, but I'm going to bet on 32-bit vs 64-bit.
Are you trying to copy the DB to a machine with different word size?
Won't work.

Just in case, if anyone ever _really_ needs to run a legacy 32-bit DB on
a 64-bit system, they may find it good to know that it works no problem
at least in FreeBSD. All it takes is the 32-bit libraries (a stock
part of a 64-bit FreeBSD install) and probably compatibility libraries
(from a package) if the 32-bit PostgreSQL was built on an older system.
Of course, the old PostgreSQL binaries will have to be copied over and
used with the old database, but they can work all right in a 64-bit
system as long as a compatibility environment is provided using standard
system components.

My 2 cents worth.

Yar

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yaroslav Tykhiy (#4)
Re: Warm Standby Weirdness

Yaroslav Tykhiy <yar@barnet.com.au> writes:

On Thu, Aug 19, 2010 at 11:22:15PM -0400, Tom Lane wrote:

This is just an educated guess, but I'm going to bet on 32-bit vs 64-bit.
Are you trying to copy the DB to a machine with different word size?
Won't work.

Just in case, if anyone ever _really_ needs to run a legacy 32-bit DB on
a 64-bit system, they may find it good to know that it works no problem
at least in FreeBSD.

Right, I should have been clearer: you can't run a 32-bit database image
with a 64-bit Postgres executable. Most 64-bit operating systems can
still execute 32-bit executables, though, so you can get there by
running a 32-bit PG on your shiny new 64-bit machine.

regards, tom lane
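
One way to check which kind of executable is installed on each machine before copying a data directory is to look at the ELF header of the postgres binary. The following is a rough Python sketch, assuming the binaries are ELF files (as on Linux and FreeBSD). The function names elf_word_size and binary_word_size are hypothetical helpers, not part of any PostgreSQL tooling.

```python
def elf_word_size(header):
    """Return 32 or 64 for the start of an ELF file, or None otherwise.

    Byte 4 of an ELF file (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.
    """
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: 32, 2: 64}.get(header[4])


def binary_word_size(path):
    """Read the first bytes of the file at `path` and classify it."""
    with open(path, "rb") as f:
        return elf_word_size(f.read(5))
```

Calling binary_word_size on the postgres executable on the master and on the standby (the install path depends on how it was built) should give the same number on both. If it does not, the data directory from one will not be usable by the other.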