beta3 vacuum crash

Started by Frank Joerdensalmost 25 years ago3 messages

frank@joerdens.de

almost 25 years ago

I haven't tried everything to recover from this yet, but will quickly try to document the
crash before I lose track of what exactly went into it and what I did: Basically I deleted
a table and then ran vacuum verbose, with the net result that I cannot connect to this
database anymore with the error:

frank@kelis:/usr/local/httpd/htdocs > psql mpi
psql: FATAL 1: Index 'pg_trigger_tgrelid_index' does not exist

This is, fortunately, not the production system but my development machine. I was going to
go live with this in a couple of week's time on beta3. Should I reconsider and move back
to 7.03 (I'd hate to cuz I'll have rows bigger than 32K, potentially . . . )?

The vacuum went like this:

------------------------------- begin vacuum -------------------------------
mpi=# drop table wimis;
DROP
mpi=# vacuum verbose;
NOTICE: --Relation pg_type--
NOTICE: Pages 3: Changed 2, reaped 2, Empty 0, New 0; Tup 159: Vac 16, Keep/VTL 0/0,
Crash 0, UnUsed 0, MinLen 106, MaxLen 109; Re-using: Free/Avail. Space 6296/156;
EndEmpty/Avail. Pages 0/1. CPU 0.00s/0.00u sec.
NOTICE: Index pg_type_oid_index: Pages 2; Tuples 159: Deleted 16. CPU 0.00s/0.00u sec.
NOTICE: Index pg_type_typname_index: Pages 2; Tuples 159: Deleted 16. CPU 0.00s/0.00u
sec.
NOTICE: Rel pg_type: Pages: 3 --> 3; Tuple(s) moved: 1. CPU 0.01s/0.00u sec.
NOTICE: Index pg_type_oid_index: Pages 2; Tuples 159: Deleted 1. CPU 0.00s/0.00u sec.
NOTICE: Index pg_type_typname_index: Pages 2; Tuples 159: Deleted 1. CPU 0.00s/0.00u sec.
NOTICE: --Relation pg_attribute--
NOTICE: Pages 16: Changed 9, reaped 8, Empty 0, New 0; Tup 1021: Vac 160, Keep/VTL 0/0,
Crash 0, UnUsed 0, MinLen 98, MaxLen 98; Re-using: Free/Avail. Space 16480/16480;
EndEmpty/Avail. Pages 0/8. CPU 0.00s/0.00u sec.
NOTICE: Index pg_attribute_relid_attnam_index: Pages 16; Tuples 1021: Deleted 160. CPU
0.00s/0.01u sec.
NOTICE: Index pg_attribute_relid_attnum_index: Pages 8; Tuples 1021: Deleted 160. CPU
0.00s/0.00u sec.
NOTICE: Rel pg_attribute: Pages: 16 --> 14; Tuple(s) moved: 43. CPU 0.01s/0.01u sec.
NOTICE: Index pg_attribute_relid_attnam_index: Pages 16; Tuples 1021: Deleted 43. CPU
0.00s/0.00u sec.
NOTICE: Index pg_attribute_relid_attnum_index: Pages 8; Tuples 1021: Deleted 43. CPU
0.00s/0.00u sec.
NOTICE: --Relation pg_class--
NOTICE: Pages 7: Changed 1, reaped 7, Empty 0, New 0; Tup 136: Vac 257, Keep/VTL 0/0,
Crash 0, UnUsed 0, MinLen 115, MaxLen 160; Re-using: Free/Avail. Space 38880/31944;
EndEmpty/Avail. Pages 0/6. CPU 0.00s/0.00u sec.
NOTICE: Index pg_class_oid_index: Pages 2; Tuples 136: Deleted 257. CPU 0.00s/0.01u sec.
NOTICE: Index pg_class_relname_index: Pages 6; Tuples 136: Deleted 257. CPU 0.00s/0.00u
sec.
NOTICE: Rel pg_class: Pages: 7 --> 3; Tuple(s) moved: 76. CPU 0.01s/0.01u sec.
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!# \q
------------------------------- end vacuum -------------------------------

The log says (I'm running the backend with -d 2):

------------------------------- begin log -------------------------------
DEBUG: query: vacuum verbose;
DEBUG: ProcessUtility: vacuum verbose;
NOTICE: --Relation pg_type--
NOTICE: Pages 3: Changed 2, reaped 2, Empty 0, New 0; Tup 159: Vac 16, Keep/VTL 0/0,
Crash 0, UnUsed 0, MinLen 106, MaxLen 109; Re-using: Free/Avail. Space 6296/156;
EndEmpty/Avail. Pages 0/1. CPU 0.00s/0.00u sec.
NOTICE: Index pg_type_oid_index: Pages 2; Tuples 159: Deleted 16. CPU 0.00s/0.00u sec.
NOTICE: Index pg_type_typname_index: Pages 2; Tuples 159: Deleted 16. CPU 0.00s/0.00u
sec.
NOTICE: Rel pg_type: Pages: 3 --> 3; Tuple(s) moved: 1. CPU 0.01s/0.00u sec.
NOTICE: Index pg_type_oid_index: Pages 2; Tuples 159: Deleted 1. CPU 0.00s/0.00u sec.
NOTICE: Index pg_type_typname_index: Pages 2; Tuples 159: Deleted 1. CPU 0.00s/0.00u sec.
NOTICE: --Relation pg_attribute--
NOTICE: Pages 16: Changed 9, reaped 8, Empty 0, New 0; Tup 1021: Vac 160, Keep/VTL 0/0,
Crash 0, UnUsed 0, MinLen 98, MaxLen 98; Re-using: Free/Avail. Space 16480/16480;
EndEmpty/Avail. Pages 0/8. CPU 0.00s/0.00u sec.
NOTICE: Index pg_attribute_relid_attnam_index: Pages 16; Tuples 1021: Deleted 160. CPU
0.00s/0.01u sec.
NOTICE: Index pg_attribute_relid_attnum_index: Pages 8; Tuples 1021: Deleted 160. CPU
0.00s/0.00u sec.
NOTICE: Rel pg_attribute: Pages: 16 --> 14; Tuple(s) moved: 43. CPU 0.01s/0.01u sec.
NOTICE: Index pg_attribute_relid_attnam_index: Pages 16; Tuples 1021: Deleted 43. CPU
0.00s/0.00u sec.
NOTICE: Index pg_attribute_relid_attnum_index: Pages 8; Tuples 1021: Deleted 43. CPU
0.00s/0.00u sec.
NOTICE: --Relation pg_class--
NOTICE: Pages 7: Changed 1, reaped 7, Empty 0, New 0; Tup 136: Vac 257, Keep/VTL 0/0,
Crash 0, UnUsed 0, MinLen 115, MaxLen 160; Re-using: Free/Avail. Space 38880/31944;
EndEmpty/Avail. Pages 0/6. CPU 0.00s/0.00u sec.
NOTICE: Index pg_class_oid_index: Pages 2; Tuples 136: Deleted 257. CPU 0.00s/0.01u sec.
NOTICE: Index pg_class_relname_index: Pages 6; Tuples 136: Deleted 257. CPU 0.00s/0.00u
sec.
NOTICE: Rel pg_class: Pages: 7 --> 3; Tuple(s) moved: 76. CPU 0.01s/0.01u sec.
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 29832 exited with status 11
Server process (pid 29832) exited with status 11 at Sat Jan 20 21:46:22 2001
Terminating any active server processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: sending SIGUSR1 to process 29335
/usr/local/pgsql/bin/postmaster: CleanupProc: sending SIGUSR1 to process 29330
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend died abnormally and
possibly corrupted shared memory.
I have rolled back the current transaction and am going to terminate your database
system connection and exit.
Please reconnect to the database system and repeat your query.
/usr/local/pgsql/bin/postmaster: ServerLoop: handling reading 5
/usr/local/pgsql/bin/postmaster: ServerLoop: handling reading 5
The Data Base System is in recovery mode
/usr/local/pgsql/bin/postmaster: ServerLoop: handling writing 5
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend died abnormally and
possibly corrupted shared memory.
I have rolled back the current transaction and am going to terminate your database
system connection and exit.
Please reconnect to the database system and repeat your query.
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 29335 exited with status 256
/usr/local/pgsql/bin/postmaster: CleanupProc: sending SIGUSR1 to process 29330
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 29330 exited with status 256
Server processes were terminated at Sat Jan 20 21:46:22 2001
Reinitializing shared memory and semaphores
invoking IpcMemoryCreate(size=17334272)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
DEBUG: starting up
DEBUG: database system was interrupted at 2001-01-20 21:43:20
DEBUG: CheckPoint record at (0, 9309416)
DEBUG: Redo record at (0, 9309416); Undo record at (0, 0); Shutdown FALSE
DEBUG: NextTransactionId: 36262; NextOid: 38319
DEBUG: database system was not properly shut down; automatic recovery in progress...
DEBUG: redo starts at (0, 9309472)
DEBUG: redo done at (0, 9391148)
DEBUG: database system is in production state
------------------------------- end log -------------------------------

- Frank

BTW: I posted this earlier today from a different account, which is not
subscribed, and the message did not come through. I am assuming now that
non-subscribers can't post. If this is incorrect, and it eventually
comes through twice, I apologize and promise not to do it again!

Tom Lane

tgl@sss.pgh.pa.us

almost 25 years ago

In reply to: Frank Joerdens (#1)

Re: beta3 vacuum crash

Frank Joerdens <frank@joerdens.de> writes:

I haven't tried everything to recover from this yet, but will quickly
try to document the crash before I lose track of what exactly went
into it and what I did: Basically I deleted a table and then ran
vacuum verbose, with the net result that I cannot connect to this
database anymore with the error:

Any chance of a backtrace from the core dump file? The log only tells
us that the vacuum process crashed, which isn't much to go on ...

regards, tom lane

Frank Joerdens

frank@joerdens.de

almost 25 years ago

In reply to: Tom Lane (#2)

Re: beta3 vacuum crash

On Sat, Jan 20, 2001 at 05:35:41PM -0500, Tom Lane wrote:

Frank Joerdens <frank@joerdens.de> writes:

I haven't tried everything to recover from this yet, but will quickly
try to document the crash before I lose track of what exactly went
into it and what I did: Basically I deleted a table and then ran
vacuum verbose, with the net result that I cannot connect to this
database anymore with the error:

Any chance of a backtrace from the core dump file? The log only tells
us that the vacuum process crashed, which isn't much to go on ...

Silly me, this was in fact a December 13 snapshot. I have 3 machines
with an identical setup, two of which I already had moved to beta3. On
those, the error does not occur. On the December 13 version, I can
reproduce the error, which obviously has been fixed.

Sorry about the confusion.

- Frank