Strange problem when upgrading to 7.2 with pg_upgrade.

Started by Brian Hirt · 18 messages
#1Brian Hirt
bhirt@mobygames.com

I've started playing around with 7.2 on one of my development machines.
I decided to try the pg_upgrade program, something I usually never do.

Anyway, I followed the pg_upgrade steps (going from 7.1.3 to
7.2), and then when I started the database up after the upgrade finished
and vacuumed one of my tables, I got these error messages from the
postmaster. After this point I cannot restart the postmaster without
resetting the xlog.

I've kept the PGDATA directory around in case someone thinks this is
worth looking into; I would be more than happy to help out.

If I migrate the data over manually like I always do (pg_dump then
pg_restore), I don't have any problems. Part of the problem might be
path names for shared libraries specified in CREATE FUNCTION; I started
using pg back when it was version 6, before '$libdir' was supported, and I
haven't bothered to take the absolute path names out yet -- I've just
updated it with each release (each release is installed in a different
location in case I need to roll back, and so I can test multiple versions
at one time). I'm not sure if pg_upgrade even checks for this.

oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B
256 -D/mo
DEBUG: database system was shut down at 2002-02-14 12:20:53 MST
DEBUG: checkpoint record is at 1/A7000010
DEBUG: redo record is at 1/A7000010; undo record is at 1/A7000010;
shutdown TRUE
DEBUG: next transaction id: 589031; next oid: 19512
DEBUG: database system is ready

DEBUG: --Relation developer--
DEBUG: Pages 669: Changed 0, Empty 0; Tup 51508: Vac 0, Keep 0, UnUsed
0.
Total CPU 0.07s/0.03u sec elapsed 0.11 sec.
DEBUG: Analyzing developer
FATAL 2: read of clog file 0, offset 139264 failed: Success
DEBUG: server process (pid 17786) exited with exit code 2
DEBUG: terminating any other active server processes
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend
died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am
going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.
DEBUG: all server processes terminated; reinitializing shared memory
and semaphores
DEBUG: database system was interrupted at 2002-02-14 12:20:58 MST
DEBUG: checkpoint record is at 1/A7000010
DEBUG: redo record is at 1/A7000010; undo record is at 1/A7000010;
shutdown TRUE
DEBUG: next transaction id: 589031; next oid: 19512
DEBUG: database system was not properly shut down; automatic recovery
in progress
DEBUG: redo starts at 1/A7000050
FATAL 2: read of clog file 0, offset 139264 failed: Success
DEBUG: startup process (pid 17788) exited with exit code 2
DEBUG: aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$
[postgres@loopy pg_upgrade]$
[postgres@loopy pg_upgrade]$
[postgres@loopy pg_upgrade]$ df -k
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda8 248895 192496 43549 82% /
/dev/hda1 31079 4988 24487 17% /boot
/dev/hda5 24080660 6601476 17479184 28% /home
/dev/hda6 5044156 1930892 2857032 41% /usr
/dev/hda9 248895 133875 102170 57% /var
/dev/hdd1 59919196 39090008 20829188 66% /disk
oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B
256 -D/mo
DEBUG: database system was interrupted being in recovery at 2002-02-14
12:21:06 MST
This probably means that some data blocks are corrupted
and you will have to use the last backup for recovery.
DEBUG: checkpoint record is at 1/A7000010
DEBUG: redo record is at 1/A7000010; undo record is at 1/A7000010;
shutdown TRUE
DEBUG: next transaction id: 589031; next oid: 19512
DEBUG: database system was not properly shut down; automatic recovery
in progress
DEBUG: redo starts at 1/A7000050
FATAL 2: read of clog file 0, offset 139264 failed: Success
DEBUG: startup process (pid 17793) exited with exit code 2
DEBUG: aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brian Hirt (#1)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Brian Hirt <bhirt@mobygames.com> writes:

I decided to try the pg_upgrade program, something I usually never do.

FATAL 2: read of clog file 0, offset 139264 failed: Success

Could we see ls -l $PGDATA/pg_clog?

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

regards, tom lane
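[Editor's aside: the arithmetic behind this suspicion can be checked directly. In 7.2 the clog stores 2 status bits per transaction (4 XIDs per byte) in 8K pages grouped into 256KB segment files. Using the "next transaction id: 589031" from the log above, this sketch (not part of the original thread) recovers the failing offset:]

```shell
# Clog layout in 7.2: 2 status bits per XID -> 4 XIDs per byte,
# 8K pages, 256KB segment files.
XID=589031                     # "next transaction id" from the postmaster log
BYTE=$((XID / 4))              # byte that holds this XID's status bits
PAGE=$((BYTE / 8192 * 8192))   # start of the 8K page containing that byte
SEG=$((XID / 1048576))         # segment number (1,048,576 XIDs per 256KB file)
echo "byte=$BYTE page=$PAGE segment=$SEG"
# byte=147257 page=139264 segment=0
```

The computed page offset 139264 is exactly the offset in the FATAL message, and it lies well past the end of the 8192-byte segment file shown in the next message.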

#3Brian Hirt
bhirt@mobygames.com
In reply to: Tom Lane (#2)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

[root@loopy pg_clog]# pwd
/moby/pgsql-upgrade-bad/pg_clog
[root@loopy pg_clog]# ls -la
total 9
drwx------ 2 postgres postgres 72 Feb 14 09:32 .
drwx------ 6 postgres postgres 304 Feb 14 16:02 ..
-rw------- 1 postgres postgres 8192 Feb 14 09:34 0000
[root@loopy pg_clog]# bzip2 < 0000 | uuencode -
begin 644 -
M0EIH.3%!62936<[:PW<``#Y_".;,1H``L!``9@!F``(`"```"#``V*#5/R*>
MHT--`TT!31H`T``"1"(TT*:#U/4>C]8_(/$);W"6=D0`'3$(Z9Y_D(V@K9T)
M+,\6"GDBTU?,C9R[NSB.6-X6M3\55RS<AS$:?0<,;N4/K>#.KV(E,[88LWG%
M[:QR6B"\'JK2G9LB*63"00449P7!2)#0O3IY4PT;P%DC'J$M$T3$5'RU5';2
A*:2EB*:1)!MI,SQ%1=GE_(FY2U#027L7<D4X4)#.VL-W
`
end
[root@loopy pg_clog]#


On Thu, 2002-02-14 at 17:54, Tom Lane wrote:

Brian Hirt <bhirt@mobygames.com> writes:

I decided to try the pg_upgrade program, something I usually never do.

FATAL 2: read of clog file 0, offset 139264 failed: Success

Could we see ls -l $PGDATA/pg_clog?

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#2)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Tom Lane wrote:

Brian Hirt <bhirt@mobygames.com> writes:

I decided to try the pg_upgrade program, something I usually never do.

FATAL 2: read of clog file 0, offset 139264 failed: Success

Could we see ls -l $PGDATA/pg_clog?

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

Here is the code that sets the transaction id. Tom, does pg_resetxlog
handle pg_clog file creation properly?

# Set this so future backends don't think these tuples are their own
# because it matches their own XID.
# Commit status already updated by vacuum above
# Set to maximum XID just in case SRC wrapped around recently and
# is lower than DST's database

if [ "$SRC_XID" -gt "$DST_XID" ]
then MAX_XID="$SRC_XID"
else MAX_XID="$DST_XID"
fi

pg_resetxlog -x "$MAX_XID" "$PGDATA"
if [ "$?" -ne 0 ]
then echo "Unable to set new XID. Exiting." 1>&2
exit 1
fi

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

Here is the code that sets the transaction id. Tom, does pg_resetxlog
handle pg_clog file creation properly?

pg_resetxlog doesn't know a single solitary thing about the clog.

The problem here is that if you're going to move the current xact ID
forward, you need to be prepared to create pages of the clog
accordingly. Or maybe the clog routines need to be less rigid in their
assumptions, but I'm uncomfortable with relaxing their expectations
unless it can be shown that they may fail to cope with cases that
arise in normal system operation. This isn't such a case.

regards, tom lane

#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#5)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

Here is the code that sets the transaction id. Tom, does pg_resetxlog
handle pg_clog file creation properly?

pg_resetxlog doesn't know a single solitary thing about the clog.

The problem here is that if you're going to move the current xact ID
forward, you need to be prepared to create pages of the clog
accordingly. Or maybe the clog routines need to be less rigid in their
assumptions, but I'm uncomfortable with relaxing their expectations
unless it can be shown that they may fail to cope with cases that
arise in normal system operation. This isn't such a case.

We increased the xid because the old files have xid's that are greater
than the newly initdb'ed database. We did a vacuum, so no one is going
to check clog, but we need to increase the transaction counter because
old rows could be seen as matching the current transaction.

Can you suggest how to create the needed clog files? I don't see any
value in changing your current clog code in the backend.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Bruce Momjian (#6)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

We increased the xid because the old files have xid's that are greater
than the newly initdb'ed database. We did a vacuum, so no one is going
to check clog, but we need to increase the transaction counter because
old rows could be seen as matching the current transaction.

Can you suggest how to create the needed clog files? I don't see any
value in changing your current clog code in the backend.

Tom, is there a way to increment the XID every 100 million and start the
postmaster to create the needed pg_clog files to get to the XID I need?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#5)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

Here is the code that sets the transaction id. Tom, does pg_resetxlog
handle pg_clog file creation properly?

pg_resetxlog doesn't know a single solitary thing about the clog.

The problem here is that if you're going to move the current xact ID
forward, you need to be prepared to create pages of the clog
accordingly. Or maybe the clog routines need to be less rigid in their
assumptions, but I'm uncomfortable with relaxing their expectations
unless it can be shown that they may fail to cope with cases that
arise in normal system operation. This isn't such a case.

Tom, any suggestion on how I can increase clog as part of pg_upgrade?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#8)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom, any suggestion on how I can increase clog as part of pg_upgrade?

Append zeroes ...

regards, tom lane

#10Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#9)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom, any suggestion on how I can increase clog as part of pg_upgrade?

Append zeroes ...

OK, I can 'dd' /dev/zero to append zeros to pad out the file. How large
does the clog file get, 1gb? Do I need to rename it at all?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#10)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, I can 'dd' /dev/zero to append zeros to pad out the file. How large
does the clog file get, 1gb? Do I need to rename it at all?

256KB per segment. Do *not* rename existing segments.

regards, tom lane
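[Editor's aside: a minimal sketch of the "append zeroes" approach under discussion. Paths are illustrative stand-ins, not from the thread; zero status bits read as "transaction in progress", which is safe here only because pg_upgrade has already vacuumed, as noted earlier:]

```shell
# Pad a short clog segment out to the full 256KB with zero bytes.
PGCLOG=/tmp/pg_clog_demo       # stand-in for $PGDATA/pg_clog
SEGSZ=262144                   # 256KB per segment

mkdir -p "$PGCLOG"
# Recreate the situation from the bug report: an 8192-byte 0000 file.
dd if=/dev/zero of="$PGCLOG/0000" bs=8192 count=1 2>/dev/null

# Append zeros until the segment is full-length.
cursize=$(wc -c < "$PGCLOG/0000")
dd if=/dev/zero bs=1 count=$((SEGSZ - cursize)) >> "$PGCLOG/0000" 2>/dev/null
wc -c < "$PGCLOG/0000"         # now 262144
```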

#12Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#11)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, I can 'dd' /dev/zero to append zeros to pad out the file. How large
does the clog file get, 1gb? Do I need to rename it at all?

256KB per segment. Do *not* rename existing segments.

Right, no rename, but I will have to create additional files in 256kb
chunks, and I assume 1gb of chunks remains in pg_clog directory?

Since I have done a vacuum, I assume I just keep creating 256k chunks
until I reach the max xid from the previous release, and delete the
files prior to the 1gb size limit.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#12)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Since I have done a vacuum, I assume I just keep creating 256k chunks
until I reach the max xid from the previous release, and delete the
files prior to the 1gb size limit.

Keep your hands *off* the existing segments. The CLOG code will clean
them up when it's good and ready ...

regards, tom lane

#14Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#13)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Since I have done a vacuum, I assume I just keep creating 256k chunks
until I reach the max xid from the previous release, and delete the
files prior to the 1gb size limit.

Keep your hands *off* the existing segments. The CLOG code will clean
them up when it's good and ready ...

OK. Fill out the current clog file and add additional ones to reach the
current max XID, rounded to the nearest 8k page, assuming a 256k file
equals 1mb of XIDs.

Why do you take these things so personally?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#15Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Brian Hirt (#1)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

This is a good bug report. I can fix pg_upgrade by adding clog files
containing zeros to pad out to the proper length. However, my guess is
that most people have already upgraded to 7.2.X, so there isn't much
value in fixing it now. I have updated pg_upgrade CVS for 7.3, and
hopefully we will have it working and well tested by the time 7.3 is
released.

Compressed clog was new in 7.2, so I guess it is no surprise I missed
that change in pg_upgrade. In 7.3, pg_clog will be moved over from the
old install, so this shouldn't be a problem with 7.3.

Thanks for the report. Sorry I don't have a fix.
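[Editor's aside: the fix described above can be sketched as follows. The directory, the example MAX_XID, and the hex segment naming are the editor's assumptions for illustration; each 256KB segment covers 1,048,576 XIDs:]

```shell
# Create zero-filled 256KB pg_clog segments up to the one covering MAX_XID.
PGCLOG=/tmp/pg_clog_fix_demo       # stand-in for $PGDATA/pg_clog
MAX_XID=3500000                    # hypothetical max XID carried over from 7.1.3
SEGSZ=262144                       # 256KB per segment
XIDS_PER_SEG=1048576               # 4 XIDs/byte * 256KB

mkdir -p "$PGCLOG"
lastseg=$((MAX_XID / XIDS_PER_SEG))   # last segment that must exist
seg=0
while [ "$seg" -le "$lastseg" ]; do
    name=$(printf '%04X' "$seg")      # segments named 0000, 0001, ... in hex
    dd if=/dev/zero of="$PGCLOG/$name" bs=$SEGSZ count=1 2>/dev/null
    seg=$((seg + 1))
done
ls "$PGCLOG"                          # 0000 0001 0002 0003
```

Existing segments are left untouched, per Tom's warning earlier in the thread; only missing ones are created.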

---------------------------------------------------------------------------

Brian Hirt wrote:

I've started playing around with 7.2 on one of my development machines.
I decided to try the pg_upgrade program, something I usually never do.

Anyway, I followed the pg_upgrade steps (going from 7.1.3 to
7.2), and then when I started the database up after the upgrade finished
and vacuumed one of my tables, I got these error messages from the
postmaster. After this point I cannot restart the postmaster without
resetting the xlog.

I've kept the PGDATA directory around in case someone thinks this is
worth looking into; I would be more than happy to help out.

If I migrate the data over manually like I always do (pg_dump then
pg_restore), I don't have any problems. Part of the problem might be
path names for shared libraries specified in CREATE FUNCTION; I started
using pg back when it was version 6, before '$libdir' was supported, and I
haven't bothered to take the absolute path names out yet -- I've just
updated it with each release (each release is installed in a different
location in case I need to roll back, and so I can test multiple versions
at one time). I'm not sure if pg_upgrade even checks for this.

oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B
256 -D/mo
DEBUG: database system was shut down at 2002-02-14 12:20:53 MST
DEBUG: checkpoint record is at 1/A7000010
DEBUG: redo record is at 1/A7000010; undo record is at 1/A7000010;
shutdown TRUE
DEBUG: next transaction id: 589031; next oid: 19512
DEBUG: database system is ready

DEBUG: --Relation developer--
DEBUG: Pages 669: Changed 0, Empty 0; Tup 51508: Vac 0, Keep 0, UnUsed
0.
Total CPU 0.07s/0.03u sec elapsed 0.11 sec.
DEBUG: Analyzing developer
FATAL 2: read of clog file 0, offset 139264 failed: Success
DEBUG: server process (pid 17786) exited with exit code 2
DEBUG: terminating any other active server processes
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend
died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am
going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.
DEBUG: all server processes terminated; reinitializing shared memory
and semaphores
DEBUG: database system was interrupted at 2002-02-14 12:20:58 MST
DEBUG: checkpoint record is at 1/A7000010
DEBUG: redo record is at 1/A7000010; undo record is at 1/A7000010;
shutdown TRUE
DEBUG: next transaction id: 589031; next oid: 19512
DEBUG: database system was not properly shut down; automatic recovery
in progress
DEBUG: redo starts at 1/A7000050
FATAL 2: read of clog file 0, offset 139264 failed: Success
DEBUG: startup process (pid 17788) exited with exit code 2
DEBUG: aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$
[postgres@loopy pg_upgrade]$
[postgres@loopy pg_upgrade]$
[postgres@loopy pg_upgrade]$ df -k
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda8 248895 192496 43549 82% /
/dev/hda1 31079 4988 24487 17% /boot
/dev/hda5 24080660 6601476 17479184 28% /home
/dev/hda6 5044156 1930892 2857032 41% /usr
/dev/hda9 248895 133875 102170 57% /var
/dev/hdd1 59919196 39090008 20829188 66% /disk
oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B
256 -D/mo
DEBUG: database system was interrupted being in recovery at 2002-02-14
12:21:06 MST
This probably means that some data blocks are corrupted
and you will have to use the last backup for recovery.
DEBUG: checkpoint record is at 1/A7000010
DEBUG: redo record is at 1/A7000010; undo record is at 1/A7000010;
shutdown TRUE
DEBUG: next transaction id: 589031; next oid: 19512
DEBUG: database system was not properly shut down; automatic recovery
in progress
DEBUG: redo starts at 1/A7000050
FATAL 2: read of clog file 0, offset 139264 failed: Success
DEBUG: startup process (pid 17793) exited with exit code 2
DEBUG: aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#16Mattew T. O'Connor
matthew@zeut.net
In reply to: Bruce Momjian (#15)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

I wouldn't be so quick to assume that almost everyone has upgraded by now. I
know we have not, at least not in production.


On Tuesday 09 April 2002 02:14 pm, Bruce Momjian wrote:

This is a good bug report. I can fix pg_upgrade by adding clog files
containing zeros to pad out to the proper length. However, my guess is
that most people have already upgraded to 7.2.X, so there isn't much
value in fixing it now. I have updated pg_upgrade CVS for 7.3, and
hopefully we will have it working and well tested by the time 7.3 is
released.

Compressed clog was new in 7.2, so I guess it is no surprise I missed
that change in pg_upgrade. In 7.3, pg_clog will be moved over from the
old install, so this shouldn't be a problem with 7.3.

Thanks for the report. Sorry I don't have a fix.

#17Bradley McLean
brad@bradm.net
In reply to: Mattew T. O'Connor (#16)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

* Mattew T. O'Connor (matthew@zeut.net) [020409 15:34]:

I wouldn't be so quick to assume that almost everyone has upgraded by now. I
know we have not, at least not in production.

yeah, what he said. Test, QA and development yes, production, no.

-Brad

#18Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Bradley McLean (#17)
Re: Strange problem when upgrading to 7.2 with pg_upgrade.

Bradley McLean wrote:

* Mattew T. O'Connor (matthew@zeut.net) [020409 15:34]:

I wouldn't be so quick to assume that almost everyone has upgraded by now. I
know we have not, at least not in production.

yeah, what he said. Test, QA and development yes, production, no.

The question is whether anyone who has delayed installing 7.2 will be
using pg_upgrade. Odds are they will not, and clearly we can't get enough
testing on pg_upgrade to be sure it will work well with 7.2.X.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026