what could cause this PANIC on enterprice 7.3.4 db?

Started by Andriy Tkachukabout 22 years ago5 messages

ant@imt.com.ua

about 22 years ago

Usually db works fine & stable months, but couple days ago
it was PANIC, and i have no ideas why this happen.

Nov 5 20:22:30 monstr postgres[16071]: [1] LOG: connection received: host=[local]
Nov 5 20:22:30 monstr postgres[16071]: [2] LOG: connection authorized: user=ant database=tele
Nov 5 20:22:42 monstr postgres[16071]: [3] PANIC: open of /usr/local/pgsql/data/pg_clog/0040 failed: No such file or directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Nov 5 20:22:42 monstr postgres[16071]: [4-1] LOG: statement: select
[...] skip the log of failed query (it was just usual query)
Nov 5 20:22:42 monstr postgres[1109]: [1] LOG: server process (pid 16071) was terminated by signal 6
Nov 5 20:22:42 monstr postgres[1109]: [2] LOG: terminating any other active server processes
Nov 5 20:22:42 monstr postgres[15913]: [3-1] WARNING: Message from PostgreSQL backend:
Nov 5 20:22:42 monstr postgres[1412]: [143-1] WARNING: Message from PostgreSQL backend:
Nov 5 20:22:42 monstr postgres[15913]: [3-2] ^IThe Postmaster has informed me that some other backend
Nov 5 20:22:42 monstr postgres[1412]: [143-2] ^IThe Postmaster has informed me that some other backend
Nov 5 20:22:42 monstr postgres[15913]: [3-3] ^Idied abnormally and possibly corrupted shared memory.
Nov 5 20:22:42 monstr postgres[1412]: [143-3] ^Idied abnormally and possibly corrupted shared memory.
Nov 5 20:22:42 monstr postgres[15913]: [3-4] ^II have rolled back the current transaction and am
Nov 5 20:22:42 monstr postgres[1412]: [143-4] ^II have rolled back the current transaction and am
Nov 5 20:22:42 monstr postgres[15913]: [3-5] ^Igoing to terminate your database system connection and exit.
Nov 5 20:22:42 monstr postgres[1412]: [143-5] ^Igoing to terminate your database system connection and exit.
Nov 5 20:22:42 monstr postgres[15913]: [3-6] ^IPlease reconnect to the database system and repeat your query.
Nov 5 20:22:42 monstr postgres[1412]: [143-6] ^IPlease reconnect to the database system and repeat your query.
Nov 5 20:22:42 monstr postgres[1109]: [3] LOG: all server processes terminated; reinitializing shared memory and semaphores
Nov 5 20:22:42 monstr postgres[16077]: [4] LOG: database system was interrupted at 2003-11-05 20:17:48 EET
Nov 5 20:22:42 monstr postgres[16077]: [5] LOG: checkpoint record is at 14/E956482C
Nov 5 20:22:42 monstr postgres[16077]: [6] LOG: redo record is at 14/E956482C; undo record is at 0/0; shutdown FALSE
Nov 5 20:22:42 monstr postgres[16077]: [7] LOG: next transaction id: 13580788; next oid: 190057632
Nov 5 20:22:42 monstr postgres[16077]: [8] LOG: database system was not properly shut down; automatic recovery in progress
Nov 5 20:22:42 monstr postgres[16077]: [9] LOG: redo starts at 14/E956486C
Nov 5 20:22:42 monstr postgres[16077]: [10] LOG: ReadRecord: record with zero length at 14/E95ACA14
Nov 5 20:22:42 monstr postgres[16077]: [11] LOG: redo done at 14/E95AC9F0
Nov 5 20:22:45 monstr postgres[16077]: [12] LOG: database system is ready

so db was restarted succesfully itself but all applications that had
static connection (it's tcl apps) where fell

the hard disk is fine (might be, because i've used to check it
at night time from cron like this: cat /dev/hda > /dev/null),
so i have no idea why pg didn't find this file (see error from log).
Does someone?

system is: Linux 2.4.18-5 (RH)

Thank you,
Andriy Tkachuk.

--
http://www.imt.com.ua

Andriy Tkachuk

ant@imt.com.ua

about 22 years ago

In reply to: Andriy Tkachuk (#1)

Re: what could cause this PANIC on enterprise 7.3.4 db?

sorry , guys, for my english

i meant "enterprise db" in subj, not enterprice :)

On Fri, 7 Nov 2003, Andriy Tkachuk wrote:

Usually db works fine & stable months, but couple days ago
it was PANIC, and i have no ideas why this happen.

Nov 5 20:22:30 monstr postgres[16071]: [1] LOG: connection received: host=[local]
Nov 5 20:22:30 monstr postgres[16071]: [2] LOG: connection authorized: user=ant database=tele
Nov 5 20:22:42 monstr postgres[16071]: [3] PANIC: open of /usr/local/pgsql/data/pg_clog/0040 failed: No such file or directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Nov 5 20:22:42 monstr postgres[16071]: [4-1] LOG: statement: select
[...] skip the log of failed query (it was just usual query)
Nov 5 20:22:42 monstr postgres[1109]: [1] LOG: server process (pid 16071) was terminated by signal 6
Nov 5 20:22:42 monstr postgres[1109]: [2] LOG: terminating any other active server processes
Nov 5 20:22:42 monstr postgres[15913]: [3-1] WARNING: Message from PostgreSQL backend:
Nov 5 20:22:42 monstr postgres[1412]: [143-1] WARNING: Message from PostgreSQL backend:
Nov 5 20:22:42 monstr postgres[15913]: [3-2] ^IThe Postmaster has informed me that some other backend
Nov 5 20:22:42 monstr postgres[1412]: [143-2] ^IThe Postmaster has informed me that some other backend
Nov 5 20:22:42 monstr postgres[15913]: [3-3] ^Idied abnormally and possibly corrupted shared memory.
Nov 5 20:22:42 monstr postgres[1412]: [143-3] ^Idied abnormally and possibly corrupted shared memory.
Nov 5 20:22:42 monstr postgres[15913]: [3-4] ^II have rolled back the current transaction and am
Nov 5 20:22:42 monstr postgres[1412]: [143-4] ^II have rolled back the current transaction and am
Nov 5 20:22:42 monstr postgres[15913]: [3-5] ^Igoing to terminate your database system connection and exit.
Nov 5 20:22:42 monstr postgres[1412]: [143-5] ^Igoing to terminate your database system connection and exit.
Nov 5 20:22:42 monstr postgres[15913]: [3-6] ^IPlease reconnect to the database system and repeat your query.
Nov 5 20:22:42 monstr postgres[1412]: [143-6] ^IPlease reconnect to the database system and repeat your query.
Nov 5 20:22:42 monstr postgres[1109]: [3] LOG: all server processes terminated; reinitializing shared memory and semaphores
Nov 5 20:22:42 monstr postgres[16077]: [4] LOG: database system was interrupted at 2003-11-05 20:17:48 EET
Nov 5 20:22:42 monstr postgres[16077]: [5] LOG: checkpoint record is at 14/E956482C
Nov 5 20:22:42 monstr postgres[16077]: [6] LOG: redo record is at 14/E956482C; undo record is at 0/0; shutdown FALSE
Nov 5 20:22:42 monstr postgres[16077]: [7] LOG: next transaction id: 13580788; next oid: 190057632
Nov 5 20:22:42 monstr postgres[16077]: [8] LOG: database system was not properly shut down; automatic recovery in progress
Nov 5 20:22:42 monstr postgres[16077]: [9] LOG: redo starts at 14/E956486C
Nov 5 20:22:42 monstr postgres[16077]: [10] LOG: ReadRecord: record with zero length at 14/E95ACA14
Nov 5 20:22:42 monstr postgres[16077]: [11] LOG: redo done at 14/E95AC9F0
Nov 5 20:22:45 monstr postgres[16077]: [12] LOG: database system is ready

so db was restarted succesfully itself but all applications that had
static connection (it's tcl apps) where fell

the hard disk is fine (might be, because i've used to check it
at night time from cron like this: cat /dev/hda > /dev/null),
so i have no idea why pg didn't find this file (see error from log).
Does someone?

system is: Linux 2.4.18-5 (RH)

Thank you,
Andriy Tkachuk.

--
http://www.imt.com.ua

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

О©╫ О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫,
О©╫О©╫О©╫О©╫О©╫О©╫ О©╫О©╫О©╫О©╫О©╫О©╫.

http://www.imt.com.ua