Unlinked files in PGDATA/base following unclean shutdown

Started by Jack Orensteinover 17 years ago4 messagesgeneral
Jump to latest
#1Jack Orenstein
jack.orenstein@hds.com

Our application is running Postgres 7.4, (working on conversion to 8.3
right now). Our testing involves various forms of violence, including
shutting off power and kill -9 postmaster.

Occasionally we observe a form of database corruption in which one of
the files storing a table or index disappears. The logs will contain
ERRORs that look like this:

could not open relation "some_table_name": No such file or directory

When this happens, and I cross-reference the pg_class.oid with the
expected file under PGDATA, the file is missing (and does not appear
to be in lost+found).

I have fsync set to true, and wal_sync_method set to fsync.

A few questions about this:

1) Why is this happening?

2) To help investigate this problem, I've written a script to
cross-reference pg_class and the files in PGDATA/base. (I know that I
should use pg_class.relfilenode instead of pg_class.oid -- I'll fix
that.) The question is how to check for consistency in the case of
large tables, which are split into multiple segments, (e.g. 123456.1,
123456.2). I.e., how can I find out how many segments there should be?
Any chance it's as simple as (pg_class.relpages + SUITABLE_CONSTANT -
1) / SUITABLE_CONSTANT?

Jack

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jack Orenstein (#1)
Re: Unlinked files in PGDATA/base following unclean shutdown

Jack Orenstein <jack.orenstein@hds.com> writes:

Our application is running Postgres 7.4, (working on conversion to 8.3
right now). Our testing involves various forms of violence, including
shutting off power and kill -9 postmaster.

Do you have reason to trust either your kernel or your disk drives under
such abuse? In general I'd bet on the drives being at fault a lot
sooner than any other part of the food chain. Complete disappearance of
files that should be there is a fault at the filesystem level or lower,
anyway.

The question is how to check for consistency in the case of
large tables, which are split into multiple segments, (e.g. 123456.1,
123456.2). I.e., how can I find out how many segments there should be?

The kernel-defined EOF is the truth, the whole truth, and nothing but
the truth. There is no other authority.

All segments before the last one should be exactly 1GB, but the last
one can be anything up to that. Consult the comments in md.c for
more details. (I think 7.4 may treat some corner cases differently
from 8.3 anyway.)

regards, tom lane

#3Jack Orenstein
jack.orenstein@hds.com
In reply to: Tom Lane (#2)
Re: Unlinked files in PGDATA/base following unclean shutdown

Tom Lane wrote:

Jack Orenstein <jack.orenstein@hds.com> writes:

The question is how to check for consistency in the case of
large tables, which are split into multiple segments, (e.g. 123456.1,
123456.2). I.e., how can I find out how many segments there should be?

The kernel-defined EOF is the truth, the whole truth, and nothing but
the truth. There is no other authority.

All segments before the last one should be exactly 1GB, but the last
one can be anything up to that. Consult the comments in md.c for
more details. (I think 7.4 may treat some corner cases differently
from 8.3 anyway.)

regards, tom lane

Just to be clear, you mean 2**30 (1,073,741,824) bytes, not 10**9?

Jack

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jack Orenstein (#3)
Re: Unlinked files in PGDATA/base following unclean shutdown

Jack Orenstein wrote:

Tom Lane wrote:

All segments before the last one should be exactly 1GB, but the last
one can be anything up to that. Consult the comments in md.c for
more details. (I think 7.4 may treat some corner cases differently
from 8.3 anyway.)

Just to be clear, you mean 2**30 (1,073,741,824) bytes, not 10**9?

It's RELSEG_SIZE * BLCKSZ, where

#define RELSEG_SIZE 131072
#define BLCKSZ 8192

(both from pg_config.h)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support