"invalid memory alloc request size" + "Could not open file "pg_clog/XXXX"

Started by scheu_postgresql, about 14 years ago, 2 messages, general
#1scheu_postgresql
scheu.postgresql@gmail.com

Hi

On my PostgreSQL 8.4.0 server, some tables have been unavailable since
this morning; see the examples below:

--> pg_dump MY_DB > bkp_MY_DB.dmp
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: invalid memory alloc request
size 18446744073709551613
pg_dump: The command was: COPY <schema>.<unavailable_table> (col1, col2,
...).

--> vacuum analyze <schema>.<unavailable_table> ;
WARNING: terminating connection because of crash of another
server process
DETAIL: The postmaster has commanded this server process to
roll back the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the
database and repeat your command.

--> select * from <schema>.<unavailable_table> ;
ERROR: invalid memory alloc request size 18446744073709551613

--> server log file
Feb 29 05:31:44 my_server postgres[6686]: [17-1] user=,db= LOG: server
process (PID 3887) was terminated by signal 11: Segmentation fault
Feb 29 05:31:44 my_server postgres[6686]: [18-1] user=,db= LOG:
terminating any other active server processes
Feb 29 05:31:44 my_server postgres[6686]: [19-1] user=,db= LOG: all server
processes terminated; reinitializing
Feb 29 05:31:44 my_server postgres[3892]: [20-1] user=,db= LOG: database
system was interrupted; last known up at 2012-02-29 05:22:33 CET
Feb 29 05:31:44 my_server postgres[3892]: [21-1] user=,db= LOG: database
system was not properly shut down; automatic recovery in progress
Feb 29 05:31:44 my_server postgres[3892]: [22-1] user=,db= LOG: redo
starts at 10/67C2A3B8
Feb 29 05:31:45 my_server postgres[3892]: [23-1] user=,db= LOG: record
with zero length at 10/68BCF990
Feb 29 05:31:45 my_server postgres[3892]: [24-1] user=,db= LOG: redo done
at 10/68BCF960
Feb 29 05:31:45 my_server postgres[3892]: [25-1] user=,db= LOG: last
completed transaction was at log time 2012-02-29 05:31:42.618352+01
Feb 29 05:31:45 my_server postgres[6686]: [20-1] user=,db= LOG: database
system is ready to accept connections
Feb 29 05:32:52 my_server postgres[4469]: [21-1]
user=[unknown],db=[unknown] LOG: incomplete startup packet
Feb 29 05:33:52 my_server postgres[6686]: [21-1] user=,db= LOG: server
process (PID 5151) was terminated by signal 11: Segmentation fault
Feb 29 05:33:52 my_server postgres[6686]: [22-1] user=,db= LOG:
terminating any other active server processes
Feb 29 05:33:52 my_server postgres[6686]: [23-1] user=,db= LOG: all server
processes terminated; reinitializing
Feb 29 05:33:52 my_server postgres[5152]: [24-1] user=,db= LOG: database
system was interrupted; last known up at 2012-02-29 05:31:45 CET
Feb 29 05:33:52 my_server postgres[5152]: [25-1] user=,db= LOG: database
system was not properly shut down; automatic recovery in progress
Feb 29 05:33:52 my_server postgres[5152]: [26-1] user=,db= LOG: record
with zero length at 10/68BCF9D8
Feb 29 05:33:52 my_server postgres[5152]: [27-1] user=,db= LOG: redo is
not required
Feb 29 05:33:52 my_server postgres[5153]: [24-1] user=match,db=MY_DB
FATAL: the database system is in recovery mode
Feb 29 05:33:52 my_server postgres[6686]: [24-1] user=,db= LOG: database
system is ready to accept connections
Feb 29 05:37:19 my_server postgres[6686]: [25-1] user=,db= LOG: server
process (PID 8065) was terminated by signal 11: Segmentation fault
Feb 29 05:37:19 my_server postgres[6686]: [26-1] user=,db= LOG:
terminating any other active server processes
Feb 29 05:37:19 my_server postgres[6686]: [27-1] user=,db= LOG: all server
processes terminated; reinitializing
Feb 29 05:37:19 my_server postgres[8066]: [28-1] user=,db= LOG: database
system was interrupted; last known up at 2012-02-29 05:33:52 CET
Feb 29 05:37:19 my_server postgres[8066]: [29-1] user=,db= LOG: database
system was not properly shut down; automatic recovery in progress
Feb 29 05:37:19 my_server postgres[8066]: [30-1] user=,db= LOG: redo
starts at 10/68BCFA20
Feb 29 05:37:19 my_server postgres[8066]: [31-1] user=,db= LOG: record
with zero length at 10/68BD5BD0
Feb 29 05:37:19 my_server postgres[8066]: [32-1] user=,db= LOG: redo done
at 10/68BD5BA0
Feb 29 05:37:19 my_server postgres[8066]: [33-1] user=,db= LOG: last
completed transaction was at log time 2012-02-29 05:35:44.468968+01
Feb 29 05:37:19 my_server postgres[6686]: [28-1] user=,db= LOG: database
system is ready to accept connections
Feb 29 05:38:27 my_server postgres[8639]: [29-1]
user=[unknown],db=[unknown] LOG: incomplete startup packet
Feb 29 05:38:53 my_server postgres[6686]: [29-1] user=,db= LOG: server
process (PID 8809) was terminated by signal 11: Segmentation fault

I have tried restarting PostgreSQL, but that did not solve the problem.
I cannot back up the full database because some tables have become unreadable.
There are 7 databases on this server and only 2 of them have this problem.

What could be the cause of the problem?
Is there a way to fix it without losing data and without dropping and
recreating the database from my nightly pg_dump backup?

Thank you in advance for your help.

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: scheu_postgresql (#1)
Re: "invalid memory alloc request size" + "Could not open file "pg_clog/XXXX"

scheu_postgresql wrote:
> In my Postgresql 8.4.0 server, since this morning some tables are
> unavailable, see example below :
>
> --> pg_dump MY_DB > bkp_MY_DB.dmp
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR: invalid memory alloc
> request size 18446744073709551613
> pg_dump: The command was: COPY <schema>.<unavailable_table> (col1,
> col2, ...).
>
> --> vacuum analyze <schema>.<unavailable_table> ;
> WARNING: terminating connection because of crash of another server
> process
> DETAIL: The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT: In a moment you should be able to reconnect to the database and
> repeat your command.
>
> --> select * from <schema>.<unavailable_table> ;
> ERROR: invalid memory alloc request size 18446744073709551613


> I have tried to restart Postgresql but it did not solve these issues
> I cannot backup the full database because some tables have become
> unreadable
> I have got 7 databases on this server and only 2 have got this problem
>
> What could be the cause of the problem ?

If a sequential scan fails, I would say that the table is corrupted.
The cause could be faulty hardware, a corrupted file system or a
software bug.
I notice that you are running 8.4.0, which is a really bad idea.
A number of data corruption bugs have been fixed in later minor releases.

Check the hardware and the file systems.
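
A side note on the number in the error message: 18446744073709551613 is
2^64 - 3, which is what -3 looks like when it is read as an unsigned
64-bit size. That would fit a damaged length field in a row being handed
to the memory allocator, although it says nothing about where the damage
came from. A quick check in Python (the variable names are only
illustrative and are not part of the original thread):

```python
# Size reported in the error message, copied from the post above.
reported = 18446744073709551613

# Reinterpret the same 64 bits as a signed two's-complement integer.
signed = reported - 2**64 if reported >= 2**63 else reported

print(signed)                   # -3
print(reported == 2**64 - 3)    # True
```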

> Is there a way to fix it without losing data and without dropping and
> recreating the db with my nightly pg_dump backup ?

Without losing data? Not unless you can poke around in the guts of
the corrupted blocks and make sense of what you find there...

If your requirement is "no data loss", you'll have to use a different
backup strategy.
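
If some loss is acceptable, one possible way to find out which rows are
still readable is to probe key ranges and bisect whenever a range fails.
The sketch below only shows the bisection logic. It assumes a
hypothetical callback try_read(lo, hi) that tries to read the rows whose
key is in [lo, hi) and returns True on success; that callback, the
function name readable_ranges and the integer key are all assumptions
made for illustration. Since a failed read on this server crashes the
backend, a real callback would also have to reconnect after each failure.

```python
from typing import Callable, List, Tuple


def readable_ranges(lo: int, hi: int,
                    try_read: Callable[[int, int], bool]) -> List[Tuple[int, int]]:
    """Return the half-open key ranges inside [lo, hi) that try_read accepts."""
    if lo >= hi:
        return []
    if try_read(lo, hi):
        return [(lo, hi)]
    if hi - lo == 1:
        return []  # a single unreadable key: nothing to salvage here
    mid = (lo + hi) // 2
    left = readable_ranges(lo, mid, try_read)
    right = readable_ranges(mid, hi, try_read)
    # Join the two halves when their readable ranges touch at the midpoint.
    if left and right and left[-1][1] == right[0][0]:
        return left[:-1] + [(left[-1][0], right[0][1])] + right[1:]
    return left + right


# Stand-in for a real query: pretend keys 40 to 42 sit on a damaged block.
bad_keys = {40, 41, 42}


def fake_try_read(lo: int, hi: int) -> bool:
    return not any(k in bad_keys for k in range(lo, hi))


ranges = readable_ranges(0, 100, fake_try_read)
print(ranges)  # [(0, 40), (43, 100)]
```

The ranges that come back could then be copied out into a new table one
by one. Whatever sits in the gaps is the part that would still have to
come from a backup or be written off.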

Yours,
Laurenz Albe