BUG #2667: vacuuming a 70GB table causes a "crash"

Started by Stephen R. van den Bergover 19 years ago4 messagesbugs
Jump to latest

The following bug has been logged online:

Bug reference: 2667
Logged by: Stephen R. van den Berg
Email address: srb@cuci.nl
PostgreSQL version: 8.1.4
Operating system: Linux 2.6.16.20 Debian (pg: 8.1.4-6)
Description: vacuuming a 70GB table causes a "crash"
Details:

datalog=# vacuum verbose fnetsense;
INFO: vacuuming "public.fnetsense"
PANIC: right sibling is not next child in "fnetsense_utstamp_key"
server closed the connection unexpectedly

Previously, the database was fully packed, no deletetions, just insertions
at the end (kind of a logging DB).
Then I deleted around 50GB worth of data (the first 50GB).
That worked just fine, then I ran vacuum the first
time (on this table). It ran, and did 3 or 4 passes
successfully, then it bombed reporting the
error above. Then I tried running vacuum again, same
error as above again.

Restarted the server, didn't help. Error is persistent
now, DB is still working, don't know how to repair the
DB.

The table being vacuumed is inserted into in realtime
at a rate of 23 new entries per second.

This is the table definition of the table in question:
Table "public.fnetsense"
Column | Type | Modifiers
-------------------------------+---------+-----------
nsb | integer | not null
utstamp | integer | not null
ifoutoctets_1 | bigint | not null
ifoutucastpkts_1 | bigint | not null
ifinoctets_1 | bigint | not null
ifinucastpkts_1 | bigint | not null
opampsdownchannelcolor | "char" | not null
opampsdownchannelmains | "char" | not null
opampsupchannelcolor | "char" | not null
opampssigqcorrecteds | "char" | not null
opampssigquncorrectables | "char" | not null
opampssigqsignalwater | "char" | not null
opampssigqmicrobeachsand | "char" | not null
opampsfostatustxmains | "char" | not null
opampsfostatustttimeouts | "char" | not null
opampsfotsfostatusvalue | "char" | not null
opampsfotsfostatussignalwater | "char" | not null
opampsfotsfostatusrxmains | "char" | not null
fotsnloss | "char" | not null
fotsnpadj | "char" | not null
fotsrnginvminreqs | "char" | not null
Indexes:
"fnetsense_nsbutstamp_key" btree (nsb, utstamp)
"fnetsense_utstamp_key" btree (utstamp)
Foreign-key constraints:
"fnetsense_nsb_fkey" FOREIGN KEY (nsb) REFERENCES netsensebase(nsb)
ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen R. van den Berg (#1)
Re: BUG #2667: vacuuming a 70GB table causes a "crash"

"Stephen R. van den Berg" <srb@cuci.nl> writes:

datalog=# vacuum verbose fnetsense;
INFO: vacuuming "public.fnetsense"
PANIC: right sibling is not next child in "fnetsense_utstamp_key"
server closed the connection unexpectedly

Were you running with full_page_writes off during some earlier use of
this database? There is a known failure mode that can produce this
situation in 8.1.x for x < 4, see thread here:

http://archives.postgresql.org/pgsql-bugs/2006-04/msg00135.php

Restarted the server, didn't help. Error is persistent
now, DB is still working, don't know how to repair the
DB.

REINDEX that index.

regards, tom lane

In reply to: Tom Lane (#2)
Re: BUG #2667: vacuuming a 70GB table causes a "crash"

On 9/30/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Stephen R. van den Berg" <srb@cuci.nl> writes:

datalog=# vacuum verbose fnetsense;
INFO: vacuuming "public.fnetsense"
PANIC: right sibling is not next child in "fnetsense_utstamp_key"
server closed the connection unexpectedly

Were you running with full_page_writes off during some earlier use of
this database? There is a known failure mode that can produce this
situation in 8.1.x for x < 4, see thread here:

My postgres.conf file says:

full_page_writes = off

As I understand it, the 8.1.4 postgres ignores this setting?
--
Sincerely,
Stephen R. van den Berg (AKA BuGless).

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen R. van den Berg (#3)
Re: BUG #2667: vacuuming a 70GB table causes a "crash"

"Stephen R. van den Berg" <srb@cuci.nl> writes:

On 9/30/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Were you running with full_page_writes off during some earlier use of
this database?

My postgres.conf file says:
full_page_writes = off
As I understand it, the 8.1.4 postgres ignores this setting?

Indeed. But I'm speculating that the damage was done by an earlier
8.1.x release and you've only just now discovered it.

regards, tom lane