Power outage and funny chars in the logs

Started by Glyn Astill almost 17 years ago | 7 messages | general
#1Glyn Astill
glynastill@yahoo.co.uk

Hi chaps,

We had a power outage today when a couple of computer-controlled power strips crashed (my secondary PSUs will stay firmly in the wall sockets now though).

I'd had a lot of fun pulling plugs out under load before we went into production so I wasn't particularly worried, and the databases came back up and applied the redo logs as expected.

What did make me scratch my head was a short stream of @ symbols (well they show up as @ symbols in vi) in the log file of the main server (others are slony subscribers).

My only reasoning so far is that it's just garbage from postgres as the power died? The controllers have BBU cache and drive caches are off. The only other thing I can think of is that it's something to do with me using data=writeback on the data partition, and relying on the WAL for journaling of the data. The logs are on that same partition...

Just wondered what you chaps thought about this?

Glyn

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Glyn Astill (#1)
Re: Power outage and funny chars in the logs

Glyn Astill wrote:

> We had a power outage today when a couple of computer
> controlled power strips crashed (my secondary PSUs will stay
> firmly in the wall sockets now though).
>
> I'd had a lot of fun pulling plugs out under load before we
> went into production so I wasn't particularly worried, and
> the databases came back up and applied the redo logs as expected.
>
> What did make me scratch my head was a short stream of @
> symbols (well they show up as @ symbols in vi) in the log
> file of the main server (others are slony subscribers).
>
> My only reasoning so far is that it's just garbage from
> postgres as the power died? The controllers have BBU cache
> and drive caches are off. The only other thing I can think is
> it's something to do with me using data=writeback on the data
> partition, and relying on the WAL for journaling of the data.
> The logs are on that same partition...
>
> Just wondered what you chaps thought about this?

You mean the error log and not the transaction log, right?

I would say that the file system suffered data loss in the
system crash, and what you see is something that happened
during file system recovery.

The strange characters are towards the end of the file, right?
Can you find anything about file system recovery in the
operating system log files?

Yours,
Laurenz Albe

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Glyn Astill (#1)
Re: Power outage and funny chars in the logs

Glyn Astill <glynastill@yahoo.co.uk> writes:

> What did make me scratch my head was a short stream of @ symbols (well they show up as @ symbols in vi) in the log file of the main server (others are slony subscribers).

There isn't anything making any effort to fsync the postmaster log, so
some data corruption in the log is hardly surprising.

regards, tom lane
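
To illustrate the point about the postmaster log not being fsynced, here is a minimal Python sketch of the difference between a write that only reaches the OS page cache and one that is forced to stable storage. It is not PostgreSQL code; the function name append_line, its durable flag, and the file name server.log are made up for illustration.

```python
import os
import tempfile


def append_line(path: str, line: str, durable: bool = False) -> None:
    """Append one line to a log file; optionally force it to stable storage.

    Hypothetical helper for illustration only, not how PostgreSQL logs.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()  # moves data from Python's buffer into the OS page cache
        if durable:
            # Asks the OS to push the cached data for this file to the device.
            os.fsync(f.fileno())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "server.log")
        # Without fsync this line may still be only in memory when power dies.
        append_line(log, "LOG:  database system is ready")
        # With fsync the call returns only after the OS reports it as written.
        append_line(log, "LOG:  checkpoint complete", durable=True)
        with open(log, encoding="utf-8") as f:
            print(f.read(), end="")
```

Calling fsync on every log line costs a disk flush each time, which is a plausible reason a server would skip it for a plain text log while still doing it for the WAL.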

#4Glyn Astill
glynastill@yahoo.co.uk
In reply to: Laurenz Albe (#2)
Re: Power outage and funny chars in the logs

From: Albe Laurenz <laurenz.albe@wien.gv.at>
Subject: RE: [GENERAL] Power outage and funny chars in the logs
To: glynastill@yahoo.co.uk, pgsql-general@postgresql.org
Date: Thursday, 7 May, 2009, 2:44 PM
> Glyn Astill wrote:
>
> > We had a power outage today when a couple of computer
> > controlled power strips crashed (my secondary PSUs will stay
> > firmly in the wall sockets now though).
> >
> > I'd had a lot of fun pulling plugs out under load before we
> > went into production so I wasn't particularly worried, and
> > the databases came back up and applied the redo logs as expected.
> >
> > What did make me scratch my head was a short stream of @
> > symbols (well they show up as @ symbols in vi) in the log
> > file of the main server (others are slony subscribers).
> >
> > My only reasoning so far is that it's just garbage from
> > postgres as the power died? The controllers have BBU cache
> > and drive caches are off. The only other thing I can think is
> > it's something to do with me using data=writeback on the data
> > partition, and relying on the WAL for journaling of the data.
> > The logs are on that same partition...
> >
> > Just wondered what you chaps thought about this?

> You mean the error log and not the transaction log, right?

Yes, just the text-based server logs.

> I would say that the file system suffered data loss in the
> system crash, and what you see is something that happened
> during file system recovery.
>
> The strange characters are towards the end of the file,
> right?

Yeah, right at the end.

> Can you find anything about file system recovery in the
> operating system log files?

As Tom said in his post, I think this is just down to the OS cache of the server log etc. - it's not actually flushed to disk with fsync like the WAL.

#5Massa, Harald Armin
chef@ghum.de
In reply to: Glyn Astill (#1)
Re: Power outage and funny chars in the logs

> What did make me scratch my head was a short stream of @ symbols (well they
> show up as @ symbols in vi) in the log file of the main server (others are
> slony subscribers).

Speaking of those @@@@@ symbols ...

About 1.5 weeks ago, the problem "postgres service not starting on windows"
was reported on this list; after consulting the event log, the user reported
the message "bogus data in postmaster.pid". After deleting postmaster.pid,
the service started up fine.

Soon after, a customer of mine reported the same error, also on Windows, and
before deleting postmaster.pid I got a copy of that "bogus one". AND: there
were also a lot of @@@@ symbols in postmaster.pid (hex 0).

After reading the answers about the funny chars in the logs and the lack of
fsync on the logs: is there an fsync on postmaster.pid? Or is that file not
considered important enough?

(just digging for the reason for corrupted data in postmaster.pid)...

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
LASIK good, steroids bad?

#6Glyn Astill
glynastill@yahoo.co.uk
In reply to: Massa, Harald Armin (#5)
Re: Power outage and funny chars in the logs
--- On Thu, 7/5/09, Massa, Harald Armin <chef@ghum.de> wrote:

> Speaking of those @@@@@ symbols ...
>
> About 1.5 weeks ago, the problem "postgres service not starting on windows"
> was reported on this list; after consulting the event log, the user reported
> the message "bogus data in postmaster.pid". After deleting postmaster.pid,
> the service started up fine.
>
> Soon after, a customer of mine reported the same error, also on Windows, and
> before deleting postmaster.pid I got a copy of that "bogus one". AND: there
> were also a lot of @@@@ symbols in postmaster.pid (hex 0).
>
> After reading the answers about the funny chars in the logs and the lack of
> fsync on the logs: is there an fsync on postmaster.pid? Or is that file not
> considered important enough?
>
> (just digging for the reason for corrupted data in postmaster.pid)...

Aha, nice one Harald,

So the @ symbols are hex 0. Perhaps the @ symbols sit where text was being written to the log: since ext3 is in data=writeback mode, the file system knows that there should be some data there *but* it doesn't know what that data is, so it just ends up as 0s.

With regard to your question, if the .pid file is not fsynced, I agree that doing so would perhaps be a good idea. Is there any reason not to?
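
Since the @ symbols turned out to be hex 0 (vi displays a NUL byte as ^@), a quick way to check a log or .pid file for this kind of damage is to look for runs of NUL bytes. The following is a rough Python sketch; the function name nul_runs and the threshold of 4 bytes are arbitrary choices made for this example.

```python
import re


def nul_runs(data: bytes, min_len: int = 4) -> list:
    """Return (offset, length) for every run of at least min_len NUL bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Simulated log tail: real text followed by zero-filled space, which is
    # roughly what a file can look like when its size was recorded but the
    # data blocks never reached the disk.
    sample = b"LOG:  received fast shutdown request\n" + b"\x00" * 16
    for offset, length in nul_runs(sample):
        print(f"{length} NUL bytes at offset {offset}")
```

For a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes to nul_runs; a run at the very end of the file would match what was described in this thread.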

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Glyn Astill (#6)
Re: Power outage and funny chars in the logs
--- On Thu, 7/5/09, Massa, Harald Armin <chef@ghum.de> wrote:

> After reading the answers to the funny chars in the logs and no fsync on
> the logs: is there a fsync on postmaster.pid? Or is that file not
> considered important enough?

I think this strongly suggests that postmaster.pid should be fsync'ed.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
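
For readers wondering what fsyncing a small file such as postmaster.pid involves, here is a minimal Python sketch of a common durable-write pattern: write to a temporary file, fsync it, rename it over the target, then fsync the containing directory. This is a generic illustration under those assumptions, not what PostgreSQL does or did; write_file_durably is a hypothetical name.

```python
import os
import tempfile


def write_file_durably(path: str, content: str) -> None:
    """Replace path with content so that a crash leaves the old or new file.

    Hypothetical helper for illustration; not PostgreSQL's implementation.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())  # file data reaches the disk before the rename
        os.replace(tmp_path, path)  # atomic rename on POSIX file systems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Persist the rename itself by syncing the directory (skipped where
    # directories cannot be opened this way, e.g. on Windows).
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        pid_file = os.path.join(d, "postmaster.pid")
        write_file_durably(pid_file, f"{os.getpid()}\n")
        with open(pid_file, encoding="utf-8") as f:
            print(f.read(), end="")
```

The file is written once at startup, so the extra flushes are cheap compared with fsyncing every log line.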