Continuous archiving fails

Started by David Darville about 19 years ago | 3 messages | bugs
#1 David Darville
ml@darville.vm.bytemark.co.uk

Hello everybody

While testing a continuous archiving setup using PostgreSQL 8.2.3, on Debian
Etch amd64, I found out that the slave database crashed when I did a 'DROP
DATABASE' on the master.

I was trying to stress test our setup by continuously restoring a dump of
our database, dropping the database, restoring it again, and so on.
But when I dropped the database, the slave database crashed,
leaving log messages like these:

....
LOG: restored log file "000000010000004F000000F0" from archive
LOG: restored log file "000000010000004F000000F1" from archive
LOG: restored log file "000000010000004F000000F2" from archive
LOG: restored log file "000000010000004F000000F3" from archive
LOG: could not fsync segment 0 of relation 19820534/105758957/125593540: No such file or directory
CONTEXT: xlog redo checkpoint: redo 4F/F3859B60; undo 0/0; tli 1; xid 0/84778; oid 125601021; multi 1; offset 0; online
FATAL: storage sync failed on magnetic disk: No such file or directory
CONTEXT: xlog redo checkpoint: redo 4F/F3859B60; undo 0/0; tli 1; xid 0/84778; oid 125601021; multi 1; offset 0; online
LOG: startup process (PID 16101) exited with exit code 1
LOG: aborting startup due to startup process failure

---
David Darville

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Darville (#1)
Re: Continuous archiving fails

David Darville <ml@darville.vm.bytemark.co.uk> writes:

> While testing a continuous archiving setup using PostgreSQL 8.2.3, on Debian
> Etch amd64, I found out that the slave database crashed when I did a 'DROP
> DATABASE' on the master.

Thanks for the report. I believe this will fix it:

Index: dbcommands.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/dbcommands.c,v
retrieving revision 1.187.2.1
diff -c -r1.187.2.1 dbcommands.c
*** dbcommands.c	27 Jan 2007 20:15:47 -0000	1.187.2.1
--- dbcommands.c	12 Apr 2007 14:40:40 -0000
***************
*** 1438,1443 ****
--- 1438,1446 ----
  		/* Also, clean out any entries in the shared free space map */
  		FreeSpaceMapForgetDatabase(xlrec->db_id);
+ 		/* Also, clean out any fsync requests that might be pending in md.c */
+ 		ForgetDatabaseFsyncRequests(xlrec->db_id);
+ 
  		/* Clean out the xlog relcache too */
  		XLogDropDatabase(xlrec->db_id);
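
For readers following along, the failure mode and the effect of the added call can be sketched with a toy model. This is only an illustration under stated assumptions: the `ToyStorage` type and every `toy_*` function are hypothetical and are not PostgreSQL's md.c structures. The only name taken from the patch is ForgetDatabaseFsyncRequests, which the toy "forget" function loosely imitates. The model queues fsync requests, removes a database's files, and then counts how many queued requests a later checkpoint would find pointing at missing files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQUESTS 16
#define MAX_DROPPED  16

/* Hypothetical stand-in for a pending fsync request on one relation file. */
typedef struct {
    unsigned db_id;
    unsigned rel_id;
} FsyncRequest;

/* Hypothetical toy state: queued fsync requests plus the set of dropped databases. */
typedef struct {
    FsyncRequest pending[MAX_REQUESTS];
    size_t       n_pending;
    unsigned     dropped[MAX_DROPPED];
    size_t       n_dropped;
} ToyStorage;

/* Queue an fsync request to be carried out at the next checkpoint. */
void toy_remember_fsync(ToyStorage *s, unsigned db_id, unsigned rel_id)
{
    assert(s->n_pending < MAX_REQUESTS);
    s->pending[s->n_pending].db_id = db_id;
    s->pending[s->n_pending].rel_id = rel_id;
    s->n_pending++;
}

/* Model replay of a database drop: the database's files no longer exist. */
void toy_drop_database_files(ToyStorage *s, unsigned db_id)
{
    assert(s->n_dropped < MAX_DROPPED);
    s->dropped[s->n_dropped++] = db_id;
}

/* A file "exists" in this model unless its database has been dropped. */
bool toy_file_exists(const ToyStorage *s, unsigned db_id)
{
    for (size_t i = 0; i < s->n_dropped; i++)
        if (s->dropped[i] == db_id)
            return false;
    return true;
}

/* Toy analogue of the added call: discard queued requests for one database. */
void toy_forget_database_fsync_requests(ToyStorage *s, unsigned db_id)
{
    size_t kept = 0;
    for (size_t i = 0; i < s->n_pending; i++)
        if (s->pending[i].db_id != db_id)
            s->pending[kept++] = s->pending[i];
    s->n_pending = kept;
}

/* Process the queue; return how many requests hit a missing file. */
int toy_checkpoint(ToyStorage *s)
{
    int failures = 0;
    for (size_t i = 0; i < s->n_pending; i++)
        if (!toy_file_exists(s, s->pending[i].db_id))
            failures++;
    s->n_pending = 0;
    return failures;
}

/* Drop database 1 with requests still queued, then run a checkpoint. */
int simulate_drop_then_checkpoint(bool forget_requests)
{
    ToyStorage s = {0};
    toy_remember_fsync(&s, 1, 10);
    toy_remember_fsync(&s, 1, 11);
    toy_remember_fsync(&s, 2, 20);
    toy_drop_database_files(&s, 1);
    if (forget_requests)
        toy_forget_database_fsync_requests(&s, 1);
    return toy_checkpoint(&s);
}
```

In this model, calling simulate_drop_then_checkpoint(false) leaves two stale requests for the dropped database, the toy counterpart of the "could not fsync segment ... No such file or directory" failure in the report, while simulate_drop_then_checkpoint(true) leaves none.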

regards, tom lane

#3 David Darville
ml@darville.vm.bytemark.co.uk
In reply to: Tom Lane (#2)
Re: Continuous archiving fails

On Thu, Apr 12, 2007 at 11:05:40AM -0400, Tom Lane wrote:

> David Darville <ml@darville.vm.bytemark.co.uk> writes:
>
>> While testing a continuous archiving setup using PostgreSQL 8.2.3, on Debian
>> Etch amd64, I found out that the slave database crashed when I did a 'DROP
>> DATABASE' on the master.
>
> Thanks for the report. I believe this will fix it:

The patch did indeed fix it, and my test setup has now been running for 2
days straight without any problems ;-)

---
David Darville