Losing data
I have had a serious loss of data and wondered if anyone could shed any light
on what may have happened.
My users have been writing reports on students. No error messages have been
produced and when called back up the reports seem to be present at the time
of writing. However, next day they have disappeared, and they do not appear
in a pg_dump. They seem to have been kept in memory and never written to
disk.
We are using Zope and connecting to Postgres through psycopg on Centos 5.
I suspect a hard disk failure but any other ideas would be welcome.
Would these reports be in the WAL?
regards
garry
On Thu, Jun 19, 2008 at 9:55 AM, Garry Saddington
<garry@schoolteachers.co.uk> wrote:
I have had a serious loss of data and wondered if anyone could shed any light
on what may have happened.
My users have been writing reports on students. No error messages have been
produced and when called back up the reports seem to be present at the time
of writing. However, next day they have disappeared, and they do not appear
in a pg_dump. They seem to have been kept in memory and never written to
disk.
We are using Zope and connecting to Postgres through psycopg on Centos 5.
I suspect a hard disk failure but any other ideas would be welcome.
Would these reports be in the WAL?
regards
Just a guess, but do you have some kind of search engine pointing at
the application? Could it be hitting a "delete" link maybe?
Garry Saddington wrote:
On Thursday 19 June 2008 18:15, Alan Hodgson wrote:
On Thursday 19 June 2008, Garry Saddington <garry@schoolteachers.co.uk>
wrote:
I read in a
Postgres manual that the hard disk may report to the OS that a write has
occured when it actually has not, is this possible?Yeah. But unless the power suddenly turned off that wouldn't cause data
loss.Oh, and the problem
has been intermittant. Another thing that happened this morning is that
Postgres had today as 18/06/2008 when in fact it was 19/06/2008 and the
OS reported this correctly. Restarting postgres sorted it, could this be
the problem?I strongly suspect the problem is between the keyboard and the chair.
I'd love to agree, but I have seen this first hand as a user!
In any case, however, if PostgreSQL reported the transaction complete and
the machine didn't experience any hardware problems (like sudden power or
disk failure), I would certainly not suspect PostgreSQL as the source of
the problem.What has happened to the reports then? I have used this combination of Zope
and Postgres for 5 years with no problems like this before and we have
written one complete set of reports on this server in the past 6 weeks. The
problem seems to have started last friday, when reports started to go
missing.
What has changed prior to Friday?
--
Until later, Geoffrey
Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
- Benjamin Franklin
Import Notes
Reply to msg id not found: 200806191906.38107.garry@schoolteachers.co.uk
On Thu, 2008-06-19 at 19:06 +0100, Garry Saddington wrote:
On Thursday 19 June 2008 18:15, Alan Hodgson wrote:
On Thursday 19 June 2008, Garry Saddington <garry@schoolteachers.co.uk>
Although I appreciate that this is a funky problem, the problem doesn't
yet exist and we are operating in a diagnostic vacuum.
Can you duplicate the problem now?
Do you have any logs from PostgreSQL that describe the session that
either previously or currently (based on your tests) the problem?
Do those logs actually show ERROR?
What about your apachelog or zope logs?
Have you turned up the logging in all three components to ensure that
the problem actually does exist?
Sincerely,
Joshua D. Drake
Show quoted text
regards
Garry
Import Notes
Reply to msg id not found: 200806191906.38107.garry@schoolteachers.co.uk
On Thursday 19 June 2008 18:09, Scott Marlowe wrote:
On Thu, Jun 19, 2008 at 9:55 AM, Garry Saddington
<garry@schoolteachers.co.uk> wrote:
I have had a serious loss of data and wondered if anyone could shed any
light on what may have happened.
My users have been writing reports on students. No error messages have
been produced and when called back up the reports seem to be present at
the time of writing. However, next day they have disappeared, and they do
not appear in a pg_dump. They seem to have been kept in memory and never
written to disk.
We are using Zope and connecting to Postgres through psycopg on Centos 5.
I suspect a hard disk failure but any other ideas would be welcome.
Would these reports be in the WAL?
regardsJust a guess, but do you have some kind of search engine pointing at
the application? Could it be hitting a "delete" link maybe?
Don't think so, once a teacher has entered a report there is no way that they
can delete it.
Regards
Garry
-------------- Original message ----------------------
From: Garry Saddington <garry@schoolteachers.co.uk>
On Thursday 19 June 2008 18:52, Adrian Klaver wrote:
-------------- Original message ----------------------
From: Garry Saddington <garry@schoolteachers.co.uk>On Thursday 19 June 2008 16:55, Joshua D. Drake wrote:
On Thu, 2008-06-19 at 16:55 +0100, Garry Saddington wrote:
I have had a serious loss of data and wondered if anyone could shed
any light on what may have happened.
My users have been writing reports on students. No error messages
have been produced and when called back up the reports seem to be
present at the time of writing. However, next day they have
disappeared, and they do not appear in a pg_dump. They seem to have
been kept in memory and never written to disk.
We are using Zope and connecting to Postgres through psycopg on
Centos 5. I suspect a hard disk failure but any other ideas would be
welcome. Would these reports be in the WAL?If it was hardware related you would know, quickly. This sounds a great
deal more like an application level interaction. Perhaps your zope
application caches things for a while before committing to disk?Yes I thought of this but once the report is sent to the DB a separate
query is run to get all of that teacher's reports and these are then
displayed on a new page. They all appear here but then disappear later.
Zope has transaction machinery that rolls everything back on an error, so
Postgres must have indicated a successful write somehow. I read in a
Postgres manual that the hard disk may report to the OS that a write has
occured when it actually has not, is this possible? Oh, and the problem
has been intermittant. Another thing that happened this morning is that
Postgres had today as 18/06/2008 when in fact it was 19/06/2008 and the
OS reported this correctly. Restarting postgres sorted it, could this be
the problem?
Regards
GarrySeems like a transaction with no commit. Basically along as the session is
active the data is there but once the session is closed the data does not
persist.Makes sense but what is to blame?
Therein lies the rub. As Josh mentioned in another post you will need turn up the logging for the various components and rerun the suspect procedure. I find the Postgres logs to most informative as they show what is actually hitting the database. Once I see the actual SQL statement I work backwards to see how it got there and in that form.
Regards
Garry
--
Adrian Klaver
aklaver@comcast.net
Import Notes
Resolved by subject fallback
On Thursday 19 June 2008 19:03, Joshua D. Drake wrote:
On Thu, 2008-06-19 at 19:06 +0100, Garry Saddington wrote:
On Thursday 19 June 2008 18:15, Alan Hodgson wrote:
On Thursday 19 June 2008, Garry Saddington <garry@schoolteachers.co.uk>
Although I appreciate that this is a funky problem, the problem doesn't
yet exist and we are operating in a diagnostic vacuum.Can you duplicate the problem now?
The problem is not consistent, some reports are lsot others not.
Do you have any logs from PostgreSQL that describe the session that
either previously or currently (based on your tests) the problem?
These are at school so I will look tomorrow
Do those logs actually show ERROR?
There are no errors relating to reports
What about your apachelog or zope logs?
Again tomorrow
Have you turned up the logging in all three components to ensure that
the problem actually does exist?
Will do this tomorrow
regards and thanks for your efforts.
PS
Could it be somethiing to do with libpq.so.4/5 required by psycopg and will
the WAL hold data to be replayed?
On Thu, 2008-06-19 at 20:29 +0100, Garry Saddington wrote:
On Thursday 19 June 2008 19:03, Joshua D. Drake wrote:
On Thu, 2008-06-19 at 19:06 +0100, Garry Saddington wrote:
On Thursday 19 June 2008 18:15, Alan Hodgson wrote:
On Thursday 19 June 2008, Garry Saddington
<garry@schoolteachers.co.uk>
PS
Could it be somethiing to do with libpq.so.4/5 required by psycopg and will
I think it certainly has something to do with psycopg and the way it
handles commits. Especially based on the description of what you say
above where some work and some don't. My guess (as I mentioned
previously) is there is a bit of code in your report generator that
doesn't call commit correctly.
the WAL hold data to be replayed?
Only if you are pushing the logs off for standby operations and only if
you commit.
Sincerely,
Joshua D. Drake