right sibling is not next child

Started by Kevin Grittnerabout 20 years ago44 messagesbugs
Jump to latest
#1Kevin Grittner
Kevin.Grittner@wicourts.gov

Apologies if this is a duplicate, but my original post stalled and I
noticed I had omitted the postgres version, which you will want.

I'm reporting this as a PostgreSQL bug because it involves an index
corruption. I can't see any other way our application should be able to
corrupt an index. I will attach the tail of the log when the corruption
was detected (and the postmaster shut itself down), as well as the
subsequent attempt to start. Fortunately we run our web site off of a
farm of four database servers, so we are taking one of the others out of
the mix, stopping postmaster, and copying its data directory over to
this machine for recovery, so we don't need advice on that aspect of
things; but, we'd like to do what we can to help track down the cause to
prevent a recurrence. We have renamed the data directory to make room
for recovery at the normal location, but otherwise the failing data
directory structure is unmodified.

For context, this is running on Windows 2003 Server. Eight Xeon box,
no HT, 6 GB RAM, 13 drive 15,000 RPM RAID5 array through battery backed
controller for everything. This database is about 180 GB with about 300
tables. We are running 8.1.3 modified with a patch we have submitted
(pending review last I saw) to implement the standard_conforming_strings
TODO. We have autovacuum running every ten seconds because of a few
very small tables with very high update rates, and we have a scheduled
VACUUM ANALYZE VERBOSE every night. It appears that last night's vacuum
found the problem, which the previous night's vacuum didn't. We had
some event which started at 14:25 yesterday which persisted until we
restarted the middle tier at 15:04. The symptom was that a fraction of
the queries which normally run in a few ms were timing out on a 20
second limit. pg_locks showed no blocking. We've been getting episodes
with these symptoms occassionally, but they have only lasted a minute or
two; this duration was unusual. We haven't identified a cause. One odd
thing is that with the number of queries per second that we run, the
number of timeouts during an episode is too small to support the notion
that _all_ similar queries are failing.

How best to proceed?

-Kevin

Attachments:

postgresql-2006-04-06_000000-tail.logapplication/octet-stream; name=postgresql-2006-04-06_000000-tail.logDownload
postgresql-2006-04-06_072250.logapplication/octet-stream; name=postgresql-2006-04-06_072250.logDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#1)
Re: right sibling is not next child

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

[2006-04-06 02:19:57.460 ] 3848 <postgres bigbird 127.0.0.1(3944)> PANIC: right sibling is not next child in "Panel_pkey"

This should be repeatable by re-attempting a VACUUM, right? Please find
out which page exactly it's unhappy about (either gdb the crash or add a
printout of the "parent" variable to the elog call in nbtpage.c), then
pg_filedump the index and look to see what the index contains.

regards, tom lane

#3Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#2)
Re: right sibling is not next child

Right now the postmaster refuses to start. What is the best way to get
past that to try what you suggest?

[2006-04-06 07:22:50.347 ] 3984 LOG: database system was interrupted
while in recovery at 2006-04-06 02:19:59 Central Daylight Time
[2006-04-06 07:22:50.347 ] 3984 HINT: This probably means that some
data is corrupted and you will have to use the last backup for
recovery.
[2006-04-06 07:22:50.347 ] 3984 LOG: checkpoint record is at
F6/50052F10
[2006-04-06 07:22:50.347 ] 3984 LOG: redo record is at F6/50052F10;
undo record is at 0/0; shutdown FALSE
[2006-04-06 07:22:50.347 ] 3984 LOG: next transaction ID: 313381941;
next OID: 2321597
[2006-04-06 07:22:50.347 ] 3984 LOG: next MultiXactId: 1; next
MultiXactOffset: 0
[2006-04-06 07:22:50.347 ] 3984 LOG: database system was not properly
shut down; automatic recovery in progress
[2006-04-06 07:22:50.347 ] 3984 LOG: redo starts at F6/50052F58
[2006-04-06 07:22:50.347 ] 4076 <bigbird bigbird 127.0.0.1(2236)>
FATAL: the database system is starting up
[2006-04-06 07:22:50.347 ] 4064 <bigbird bigbird 127.0.0.1(2235)>
FATAL: the database system is starting up
[2006-04-06 07:22:50.378 ] 3984 PANIC: heap_clean_redo: no block

-Kevin

On Thu, Apr 6, 2006 at 12:40 pm, in message

<25774.1144345220@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

[2006- 04- 06 02:19:57.460 ] 3848 <postgres bigbird 127.0.0.1(3944)>

PANIC: right

sibling is not next child in "Panel_pkey"

This should be repeatable by re- attempting a VACUUM, right? Please

find

out which page exactly it's unhappy about (either gdb the crash or

add a

printout of the "parent" variable to the elog call in nbtpage.c),

then

Show quoted text

pg_filedump the index and look to see what the index contains.

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#3)
Re: right sibling is not next child

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

Right now the postmaster refuses to start. What is the best way to get
past that to try what you suggest?

[2006-04-06 07:22:50.378 ] 3984 PANIC: heap_clean_redo: no block

Hm, did this start happening immediately after the other problem?
That would suggest that you've got worse problems than just a corrupt
index. You weren't by any chance running with full_page_writes = off
were you?

You could get past the startup failure with pg_resetxlog, but it's not
clear whether you'd have a consistent database afterward. What I'd
suggest first is saving a copy of the entire $PGDATA tree for forensic
purposes (not to mention being able to go back to that state if you need
to).

Is there any chance of letting someone else have a look at the database
contents? Otherwise you're going to have to do your own sleuthing to
figure out what went wrong ...

regards, tom lane

#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#4)
Re: right sibling is not next child

On Thu, Apr 6, 2006 at 12:57 pm, in message

<25913.1144346246@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

Right now the postmaster refuses to start. What is the best way to

get

past that to try what you suggest?

[2006- 04- 06 07:22:50.378 ] 3984 PANIC: heap_clean_redo: no block

Hm, did this start happening immediately after the other problem?

This started happening on the first attempt to start the postmaster
after the other error, which left the postmaster down, apparently after
a failed restart attempt.

That would suggest that you've got worse problems than just a

corrupt

index. You weren't by any chance running with full_page_writes =

off

were you?

Yes we were. Apparently I have misunderstood the implications of this.
Somehow I had convinced myself that this setting was relatively safe in
our environment, due to our battery-backed controllers. I'd convinced
myself, after reading carefully through the documentation of this
setting, that I would be OK as long as that functioned correctly, and
have problems regardless of this setting if it didn't. If you show me
where I went wrong, maybe I can suggest a patch to the docs to prevent
others from going down the wrong path in this regard. (Of course, maybe
it's all there and I just had a bad day when I thought this through.)

You could get past the startup failure with pg_resetxlog, but it's

not

clear whether you'd have a consistent database afterward. What I'd
suggest first is saving a copy of the entire $PGDATA tree for

forensic

purposes

We already have this forensic copy and a replacement production copy on
this box. I think we'll need to copy to another box to get a second
forensic copy, to avoid risking an out-of-space condition. That can be
done, but it'll take a few hours.

(not to mention being able to go back to that state if you need
to).

That's not an issue for production purposes.

Is there any chance of letting someone else have a look at the

database

contents?

There is a lot of data in the database which is confidential by law.
I'd have to jump through a lot of hoops to get anyone to even consider
letting me ship it off site. If you're asking whether you could access
in to our site, that might be arranged.

-Kevin

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#5)
Re: right sibling is not next child

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

You weren't by any chance running with full_page_writes = off
were you?

Yes we were. Apparently I have misunderstood the implications of this.

So had we all :-(. It just plain doesn't work in 8.1.*, and will be
disabled in 8.1.4 --- see discussion last week. It might or might not
get resurrected in 8.2, depending on how messy it seems to be to fix.

Anyway, that explains your "heap_clean_redo: no block" failure. I think
you're stuck risking a pg_resetxlog to try to get back into the
database. If that results in a hopelessly corrupt database, we can try
modifying the WAL replay code to not consider this a fatal error, and
see if that produces anything we can use for debugging. I'm glad this
isn't your only copy of the database ...

There is a lot of data in the database which is confidential by law.
I'd have to jump through a lot of hoops to get anyone to even consider
letting me ship it off site. If you're asking whether you could access
in to our site, that might be arranged.

It sounds like the DB is too big to consider shipping anywhere anyway.
As long as you're comfortable doing stuff like pg_filedump and modifying
the code to get more debug info, we can proceed without getting into the
question of remote access.

regards, tom lane

#7Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#6)
Re: right sibling is not next child

On Thu, Apr 6, 2006 at 1:26 pm, in message

<26154.1144348014@sss.pgh.pa.us>,

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

You weren't by any chance running with full_page_writes = off
were you?

Yes we were. Apparently I have misunderstood the implications of

this.

So had we all :- (. It just plain doesn't work in 8.1.*, and will

be

disabled in 8.1.4 --- see discussion last week.

Dang! I've not been able to keep up with the lists, and I missed that
thread. I'll change that setting as soon as possible.

Anyway, that explains your "heap_clean_redo: no block" failure. I

think

you're stuck risking a pg_resetxlog to try to get back into the
database. If that results in a hopelessly corrupt database, we can

try

modifying the WAL replay code to not consider this a fatal error,

and

see if that produces anything we can use for debugging.

Will do. Before I do that, though, is it worth making a copy? (The
down side is primarily the delay of copying it.)

I'm glad this isn't your only copy of the database ...

We go beyond being a belt-and-suspenders shop. Think staples and
glue-gun, too. ;-)

This is one of four copies of what is redundant data to start with, and
we have backups. So recovery is no sweat, but we got emails about
timeouts from the public during this incident, so the underlying bug is
important to us.

As long as you're comfortable doing stuff like pg_filedump and

modifying

the code to get more debug info, we can proceed without getting into

the

question of remote access.

OK. If you later think we do need to go that direction, send me an
email off-list.

-Kevin

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#7)
Re: right sibling is not next child

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Anyway, that explains your "heap_clean_redo: no block" failure. I think
you're stuck risking a pg_resetxlog to try to get back into the
database.

Will do. Before I do that, though, is it worth making a copy? (The
down side is primarily the delay of copying it.)

Yeah, I think you'd better --- pg_resetxlog is destructive, and if we
find the database unusable for debugging afterward, we'll need that
copy.

regards, tom lane

#9Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#8)
Re: right sibling is not next child

Sorry about the delay in responding. We had a bit of difficulty with
the test machine. Kevin is also on vacation this week.

The problem is repeatable with a VACUUM. I've found the offending
block. A (partial) pg_filedump of that block is pasted in below. I'm a
little lost as to what the next step is though. Any pointers would be
appreciated.

Pete

"Kevin Grittner" <Kevin ( dot ) Grittner ( at ) wicourts ( dot ) gov>
writes:

[2006-04-06 02:19:57.460 ] 3848 <postgres bigbird 127.0.0.1(3944)>

PANIC: right sibling is not next child in "Panel_pkey"

This should be repeatable by re-attempting a VACUUM, right? Please
find
out which page exactly it's unhappy about (either gdb the crash or add
a
printout of the "parent" variable to the elog call in nbtpage.c), then
pg_filedump the index and look to see what the index contains.

regards, tom lane

PANIC: right sibling is not next child in "Panel_pkey", parent is 271

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.1.1
*
* File: 180571
* Options used: -R 271 -i
*
* Dump created on: Tue Apr 11 13:38:44 2006
*******************************************************************

Block 271 ********************************************************
<Header> -----
Block Offset: 0x0021e000 Offsets: Lower 468 (0x01d4)
Block: Size 8192 Version 3 Upper 1952 (0x07a0)
LSN: logid 246 recoff 0x265d47c0 Special 8176 (0x1ff0)
Items: 112 Free Space: 1484
Length (including item array): 472

<Data> ------
Item 1 -- Length: 56 Offset: 8120 (0x1fb8) Flags: USED
Block Id: 132 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 2 -- Length: 8 Offset: 8112 (0x1fb0) Flags: USED
Block Id: 201 linp Index: 1 Size: 8
Has Nulls: 0 Has Varwidths: 0

Item 3 -- Length: 56 Offset: 8056 (0x1f78) Flags: USED
Block Id: 257 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

... snip ...

Item 109 -- Length: 56 Offset: 4920 (0x1338) Flags: USED
Block Id: 121 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 110 -- Length: 56 Offset: 4864 (0x1300) Flags: USED
Block Id: 133 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 111 -- Length: 56 Offset: 4808 (0x12c8) Flags: USED
Block Id: 134 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 112 -- Length: 56 Offset: 4752 (0x1290) Flags: USED
Block Id: 137 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

<Special Section> -----
BTree Index Section:
Flags: 0x0000 ()
Blocks: Previous (167) Next (455) Level (1)

*** End of Requested Range Encountered. Last Block Read: 271 ***

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#9)
Re: right sibling is not next child

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

PANIC: right sibling is not next child in "Panel_pkey", parent is 271

Hmm ... that's not actually enough info to tell us where to look,
is it :-(. Please add the following variables to the elog message, or
gdb for them if you can:
target
rightsib
nextoffset

regards, tom lane

#11Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#10)
Re: right sibling is not next child

Try as I might, I wasn't able to get a JIT debugger give me a memory
dump. It seems like postgres.exe is not really crashing in the
"unhandled exception" sense (see gdb log below)? Am I missing a
configure option?

(As an aside, what's the best way to get a core dump on Windows? Can
gdb read memory dumps produced by NTSD?)

I was able to attach to the running backend using gdb. That worked
(kind of, see log below).

I ended up modifying the elog again with the following results:

PANIC: right sibling is not next child in "Panel_pkey", parent is 271,
target is 635, rightsib is 629, nextoffset is 87

C:\WINDOWS\system32>C:\mingw\bin\gdb C:\pgsql\bin\postgres.exe
... snip ...
(gdb) attach 2084
Attaching to program `C:\pgsql\bin\postgres.exe', process 2084
[Switching to thread 2084.0x930]
(gdb) break nbtpage.c:983
Breakpoint 1 at 0x41f042: file nbtpage.c, line 983.
(gdb) continue
Continuing.
[Switching to thread 2084.0x81c]

Breakpoint 1, _bt_pagedel (rel=0x196bbb0, buf=1089, vacuum_full=0
'\0')
at nbtpage.c:983
983 nbtpage.c: No such file or directory.
in nbtpage.c
(gdb) print parent
$1 = 271
(gdb) print target
$2 = 635
(gdb) print rightsib
No symbol "rightsib" in current context.
(gdb) print nextoffset
$3 = 87
(gdb) print leftsib
$4 = 636
(gdb) print rightsib
No symbol "rightsib" in current context.
(gdb) continue
Continuing.

Program exited with code 03.
(gdb)

Pete

Tom Lane <tgl@sss.pgh.pa.us> 04/11/06 9:19 pm >>>

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

PANIC: right sibling is not next child in "Panel_pkey", parent is

271

Hmm ... that's not actually enough info to tell us where to look,
is it :-(. Please add the following variables to the elog message, or
gdb for them if you can:
target
rightsib
nextoffset

regards, tom lane

---------------------------(end of
broadcast)---------------------------
TIP 6: explain analyze is your friend

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#11)
Re: right sibling is not next child

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

I ended up modifying the elog again with the following results:
PANIC: right sibling is not next child in "Panel_pkey", parent is 271,
target is 635, rightsib is 629, nextoffset is 87

OK, so the part of the pg_filedump info we need to see is items 86/87
in block 271, plus a few around them for good measure.

Are the values of the keys in this index sensitive data, or are they
just serial numbers of some kind? If we could see pg_dump -i -f rather
than just -i, it might help. I'd like to see the dumps of pages 635,
636, 629 as well (mostly their "special space" contents).

regards, tom lane

#13Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
Re: right sibling is not next child

The index data isn't sensitive, but I should ask for permission
nonetheless. I'll send over the '-f' output tomorrow morning.

Pete

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.1.1
*
* File: 180571
* Options used: -i -R 271
*
* Dump created on: Tue Apr 11 18:22:24 2006
*******************************************************************

Block 271 ********************************************************
<Header> -----
Block Offset: 0x0021e000 Offsets: Lower 468 (0x01d4)
Block: Size 8192 Version 3 Upper 1952 (0x07a0)
LSN: logid 246 recoff 0x265d47c0 Special 8176 (0x1ff0)
Items: 112 Free Space: 1484
Length (including item array): 472

<Data> ------

... snip ...

Item 80 -- Length: 56 Offset: 3352 (0x0d18) Flags: USED
Block Id: 581 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 81 -- Length: 56 Offset: 3128 (0x0c38) Flags: USED
Block Id: 580 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 82 -- Length: 56 Offset: 2848 (0x0b20) Flags: USED
Block Id: 606 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 83 -- Length: 56 Offset: 2736 (0x0ab0) Flags: USED
Block Id: 608 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 84 -- Length: 56 Offset: 2792 (0x0ae8) Flags: USED
Block Id: 601 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 85 -- Length: 56 Offset: 2120 (0x0848) Flags: USED
Block Id: 640 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 86 -- Length: 56 Offset: 2176 (0x0880) Flags: USED
Block Id: 635 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 87 -- Length: 56 Offset: 2232 (0x08b8) Flags: USED
Block Id: 636 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 88 -- Length: 56 Offset: 2288 (0x08f0) Flags: USED
Block Id: 635 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 89 -- Length: 56 Offset: 2400 (0x0960) Flags: USED
Block Id: 629 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 90 -- Length: 56 Offset: 5704 (0x1648) Flags: USED
Block Id: 166 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 91 -- Length: 56 Offset: 5648 (0x1610) Flags: USED
Block Id: 339 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 92 -- Length: 56 Offset: 5592 (0x15d8) Flags: USED
Block Id: 372 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 93 -- Length: 56 Offset: 4416 (0x1140) Flags: USED
Block Id: 480 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 94 -- Length: 56 Offset: 5536 (0x15a0) Flags: USED
Block Id: 208 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

... snip ...

<Special Section> -----
BTree Index Section:
Flags: 0x0000 ()
Blocks: Previous (167) Next (455) Level (1)

*** End of Requested Range Encountered. Last Block Read: 271 ***

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.1.1
*
* File: 180571
* Options used: -i -R 635
*
* Dump created on: Tue Apr 11 18:22:43 2006
*******************************************************************

Block 635 ********************************************************
<Header> -----
Block Offset: 0x004f6000 Offsets: Lower 24 (0x0018)
Block: Size 8192 Version 3 Upper 8120 (0x1fb8)
LSN: logid 239 recoff 0xcabae788 Special 8176 (0x1ff0)
Items: 1 Free Space: 8096
Length (including item array): 28

<Data> ------
Item 1 -- Length: 56 Offset: 8120 (0x1fb8) Flags: USED
Block Id: 452 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

<Special Section> -----
BTree Index Section:
Flags: 0x0001 (LEAF)
Blocks: Previous (636) Next (629) Level (0)

*** End of Requested Range Encountered. Last Block Read: 635 ***

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.1.1
*
* File: 180571
* Options used: -i -R 636
*
* Dump created on: Tue Apr 11 18:22:50 2006
*******************************************************************

Block 636 ********************************************************
<Header> -----
Block Offset: 0x004f8000 Offsets: Lower 24 (0x0018)
Block: Size 8192 Version 3 Upper 8120 (0x1fb8)
LSN: logid 245 recoff 0x9bfa3b60 Special 8176 (0x1ff0)
Items: 1 Free Space: 8096
Length (including item array): 28

<Data> ------
Item 1 -- Length: 56 Offset: 8120 (0x1fb8) Flags: USED
Block Id: 454 linp Index: 27 Size: 56
Has Nulls: 0 Has Varwidths: 16384

<Special Section> -----
BTree Index Section:
Flags: 0x0001 (LEAF)
Blocks: Previous (640) Next (635) Level (0)

*** End of Requested Range Encountered. Last Block Read: 636 ***

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.1.1
*
* File: 180571
* Options used: -i -R 629
*
* Dump created on: Tue Apr 11 18:23:01 2006
*******************************************************************

Block 629 ********************************************************
<Header> -----
Block Offset: 0x004ea000 Offsets: Lower 244 (0x00f4)
Block: Size 8192 Version 3 Upper 5040 (0x13b0)
LSN: logid 239 recoff 0xcabae788 Special 8176 (0x1ff0)
Items: 56 Free Space: 4796
Length (including item array): 248

<Data> ------
... snip ...

<Special Section> -----
BTree Index Section:
Flags: 0x0001 (LEAF)
Blocks: Previous (635) Next (166) Level (0)

*** End of Requested Range Encountered. Last Block Read: 629 ***

Tom Lane <tgl@sss.pgh.pa.us> 04/12/06 1:06 am >>>

OK, so the part of the pg_filedump info we need to see is items 86/87
in block 271, plus a few around them for good measure.

Are the values of the keys in this index sensitive data, or are they
just serial numbers of some kind? If we could see pg_dump -i -f
rather
than just -i, it might help. I'd like to see the dumps of pages 635,
636, 629 as well (mostly their "special space" contents).

regards, tom lane

#14Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
Re: right sibling is not next child

Also, when I tried to run a database-wide VACUUM ANALYZE VERBOSE it
actually doesn't even get to Panel and errors out with:

INFO: analyzing "public.MaintCode"
INFO: "MaintCode": scanned 1 of 1 pages, containing 19 live rows and 0
dead rows; 19 rows in sample, 19 estimated total rows
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"

MaintCode is a tiny, static table. Does that indicate further
corruption or is there a more benign explanation? Could these errors
not be Postgres' fault? (e.g. hardware-related)

Pete

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#14)
Re: right sibling is not next child

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

Also, when I tried to run a database-wide VACUUM ANALYZE VERBOSE it
actually doesn't even get to Panel and errors out with:

ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"

Hm, my eyebrows just disappeared over the back of my head somewhere.
Is that repeatable? Does a REINDEX on pg_statistic make it go away?

regards, tom lane

#16Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
Re: right sibling is not next child

It is repeatable. A reindex doesn't work.

Pete

bigbird=# vacuum analyze "MaintCode";
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"

bigbird=# vacuum analyze verbose "MaintCode";
INFO: vacuuming "public.MaintCode"
INFO: index "MaintCode_pkey" now contains 19 row versions in 2 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "MaintCode": found 0 removable, 19 nonremovable row versions in
1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_90472"
INFO: index "pg_toast_90472_index" now contains 0 row versions in 1
pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_90472": found 0 removable, 0 nonremovable row versions
in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.MaintCode"
INFO: "MaintCode": scanned 1 of 1 pages, containing 19 live rows and 0
dead rows; 19 rows in sample, 19 estimated total rows
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"

bigbird=# reindex table pg_statistic;
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

bigbird=# select * from "MaintCode";
maintCode | colorAttrb | descr | isActive |
maintPriority | stopsTheClock | videoAttrb | wcisSpecStatus
-----------+------------+---------------------------+----------+---------------+---------------+------------+----------------
AA | R | Annual accounts | t |
75 | t | N |
AP | R | Appeal | t |
30 | f | N |
AW | R | Awaiting Papers | t |
80 | f | N |
BA | R | Bankruptcy | t |
20 | t | N | S05
CI | R | Case Inactive | t |
60 | f | N | S03
CS | R | Consent decree | t |
90 | t | N |
DP | R | Deferred judgt./prosecute | t |
40 | t | N | S07
EF | R | Electronic filing | t |
33 | f | N |
ES | R | Extension | t |
55 | t | N |
EX | R | Expungement | t |
5 | f | N |
FX | R | Future expungement | t |
93 | f | N |
IN | R | Interpreter | t |
53 | f | N |
JR | R | Judicial review | t |
75 | t | N |
RO | R | Reopen | t |
35 | f | N |
SL | R | Sealed Record | t |
85 | f | N |
SM | R | Sentence modification | t |
43 | f | N |
SU | R | Case suspended | t |
45 | t | N |
UA | R | Under Advisement | t |
50 | f | N | S02
WA | R | Warrant issued | t |
10 | t | F | S04
(19 rows)

bigbird=#

Tom Lane <tgl@sss.pgh.pa.us> 04/12/06 5:00 am >>>

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

Also, when I tried to run a database-wide VACUUM ANALYZE VERBOSE it
actually doesn't even get to Panel and errors out with:

ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"

Hm, my eyebrows just disappeared over the back of my head somewhere.
Is that repeatable? Does a REINDEX on pg_statistic make it go away?

regards, tom lane

---------------------------(end of
broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#16)
Re: right sibling is not next child

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

bigbird=# vacuum analyze "MaintCode";
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"

Hm, can you see any rows in pg_statistic with duplicate values of
(starelid, staattnum)? If so it'd be useful to look at their
ctid/xmin/xmax/cmin/cmax values.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#13)
Re: right sibling is not next child

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

Item 85 -- Length: 56 Offset: 2120 (0x0848) Flags: USED
Block Id: 640 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 86 -- Length: 56 Offset: 2176 (0x0880) Flags: USED
Block Id: 635 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 87 -- Length: 56 Offset: 2232 (0x08b8) Flags: USED
Block Id: 636 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 88 -- Length: 56 Offset: 2288 (0x08f0) Flags: USED
Block Id: 635 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 89 -- Length: 56 Offset: 2400 (0x0960) Flags: USED
Block Id: 629 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 90 -- Length: 56 Offset: 5704 (0x1648) Flags: USED
Block Id: 166 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Well, that's pretty dang interesting. How did block 635 get to be
listed twice? The sibling links say that the correct sequence is
640, 636, 635, 629, 166 ... so something screwed up the parent level's
keys.

What would be most useful at this point is to look at the keys in
these entries, and compare them to the "high keys" (item 1) of the
individual leaf pages. I'm wondering what key is in that extra
entry for 635 ... Did you get permission to show us the keys?

regards, tom lane

#19Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
Re: right sibling is not next child

I can't find any duplicates?!?

The query

select starelid, staattnum, ctid, xmin, xmax, cmin, cmax
from pg_statistic p1
where (select count(*) from pg_statistic p2 where
p1.starelid = p2.starelid and p1.staattnum = p2.staattnum) > 1

doesn't turn up anything. Nor does dumping

select starelid, staattnum from pg_statistic

to a file and using sort | uniq -c

Pete

Tom Lane <tgl@sss.pgh.pa.us> 04/12/06 4:37 pm >>>

Hm, can you see any rows in pg_statistic with duplicate values of
(starelid, staattnum)? If so it'd be useful to look at their
ctid/xmin/xmax/cmin/cmax values.

regards, tom lane

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#19)
Re: right sibling is not next child

"Peter Brant" <Peter.Brant@wicourts.gov> writes:

I can't find any duplicates?!?

Weirder and weirder. Maybe the table is OK but the index is corrupt?

Could it be another symptom of the same problem we're seeing in the
Panel_pkey index? I'm currently theorizing that that index might've
been corrupted during WAL replay (specifically, if the replayer somehow
failed to recognize a split completion, btree_xlog_cleanup would've
inserted an extra entry) and one could certainly imagine multiple
indexes getting corrupted in the same fashion at the same time. You
did say that this problem showed up shortly after a database crash,
right?

Could you send me a copy of pg_statistic_relid_att_index, off-list
(the actual disk file, not a pg_filedump)? There's nothing but table
OIDs and attribute numbers in it, so I can't see any reason anyone
would consider it sensitive data. I've got some really crude code
laying about for scanning an index and looking for inconsistencies ...
it's too ugly to give out, but I'd like to see if it can find anything
wrong with that index. (I'll probably ask for a copy of Panel_pkey
for the same purpose, if you get permission to show us its keys.)

regards, tom lane

#21Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#21)
#23Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#24)
#26Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#27Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#26)
#29Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#29)
#31Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#31)
#33Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#34Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#12)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#34)
#36Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#35)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#36)
#38Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#37)
#39Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#37)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#38)
#41Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#40)
#42Peter Brant
Peter.Brant@wicourts.gov
In reply to: Magnus Hagander (#41)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Brant (#42)
#44Peter Brant
Peter.Brant@wicourts.gov
In reply to: Tom Lane (#43)