BUG #4838: Database corruption after btree_gin index creation

Started by Daniele Bortoluzzi | almost 17 years ago | 7 messages | bugs
#1Daniele Bortoluzzi
bortoluz@gmail.com

The following bug has been logged online:

Bug reference: 4838
Logged by: Daniele Bortoluzzi
Email address: bortoluz@gmail.com
PostgreSQL version: 8.4beta2
Operating system: Linux amd64 2.6.24 (Debian 4.0)
Description: Database corruption after btree_gin index creation
Details:

I am testing this db
I created a multicolumn GIN index with btree_gin functionality (fulltext
column + timestamp). After creating the index the db segfaulted:

LOG: server process (PID 14195) was terminated by signal 11: Segmentation
fault
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and
possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.

The WARNING-DETAIL-HINT messages repeated 4 times, then postgres restarted:

LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2009-06-04 12:47:19
CEST
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 2/778687D0
LOG: record with zero length at 2/779392A8
LOG: redo done at 2/77938E20
LOG: last completed transaction was at log time 2009-06-04
12:47:35.55392+02
LOG: autovacuum launcher started
LOG: database system is ready to accept connections

but it segfaulted two more times.

Then I launched a VACUUM FULL ANALYZE, no segmentation faults, it completed
successfully, but now it throws this error:

ERROR: tuple offset out of range: 48090

or

ERROR: tuple offset out of range: 0

when doing fulltext queries.

I was using postgres 8.4devel (SVN revision 28901) happily...
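
For readers following the thread, the kind of index described above can be sketched roughly as below. This is only an illustration: the reporter's schema was never posted, so the table, column, and index names (docs, body_tsv, created_at, docs_fts_ts_idx) and the helper name index_sql are hypothetical. It assumes the btree_gin module is already installed in the target database, which is what lets a plain timestamp column take part in a GIN index.

```shell
#!/bin/sh
# Sketch only: all table, column, and index names are hypothetical.
# Assumes btree_gin is already installed in the database (on 8.4 via the
# contrib SQL script; later releases use CREATE EXTENSION btree_gin).

# Print the SQL for a two-column GIN index: a tsvector column for
# full-text search plus a timestamp column handled by btree_gin.
index_sql() {
    cat <<'SQL'
CREATE TABLE docs (
    id         serial PRIMARY KEY,
    body_tsv   tsvector,
    created_at timestamp
);
CREATE INDEX docs_fts_ts_idx ON docs USING gin (body_tsv, created_at);
SQL
}
```

The output could be fed to a scratch database with something like `index_sql | psql -d testdb`, where testdb is again a placeholder.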

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniele Bortoluzzi (#1)
Re: BUG #4838: Database corruption after btree_gin index creation

"Daniele Bortoluzzi" <bortoluz@gmail.com> writes:

Description: Database corruption after btree_gin index creation

Can you provide a self-contained test case to reproduce this problem?
We had a similar report yesterday but no one can reproduce it.

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniele Bortoluzzi (#1)
Re: BUG #4838: Database corruption after btree_gin index creation

"Daniele Bortoluzzi" <bortoluz@gmail.com> writes:

I created a multicolumn GIN index with btree_gin functionality (fulltext
column + timestamp). After creating the index the db segfaulted:

LOG: server process (PID 14195) was terminated by signal 11: Segmentation
fault

I cannot replicate this problem based on the little information
provided. The GIN bug we found a couple of days ago would explain
the "tuple offset out of range" errors, and if you had had Asserts
enabled it would explain Assert failures; but I don't see that it
explains a segfault. Can you still reproduce this with CVS HEAD,
and if so would you submit a test case? Or at least a stack trace
from the crash?

regards, tom lane

#4Daniele Bortoluzzi
bortoluz@gmail.com
In reply to: Tom Lane (#3)
Re: BUG #4838: Database corruption after btree_gin index creation

2009/6/10 Tom Lane <tgl@sss.pgh.pa.us>:
[...]

I cannot replicate this problem based on the little information
provided.  The GIN bug we found a couple of days ago would explain
the "tuple offset out of range" errors, and if you had had Asserts
enabled it would explain Assert failures; but I don't see that it
explains a segfault.  Can you still reproduce this with CVS HEAD,
and if so would you submit a test case?  Or at least a stack trace
from the crash?

I tried to replicate the error with a small set of data (our db
weighs ~700MB) but I could not reproduce it.
Now I'm checking out from the CVS server, and will post a new message
today or tomorrow at the latest.

If I cannot reproduce the error, what is the best way to catch the
stack trace? Do I have to recompile with --enable-debug?
I read this article:
http://wiki.postgresql.org/wiki/Developer_FAQ#What_debugging_features_are_available.3F
but I have never debugged postgresql with gdb. Can you give me some hints?

I am sorry for the long delay. Thank you for your support.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniele Bortoluzzi (#4)
Re: BUG #4838: Database corruption after btree_gin index creation

Daniele Bortoluzzi <bortoluz@gmail.com> writes:

If I cannot reproduce the error, what is the best way to catch the
stack trace? Do I have to recompile with --enable-debug?

Yes, that would be the best thing. If you are using gcc there is no
harm in using --enable-debug all the time; it just makes the executable
files a bit bigger, there's no performance change.

Make sure the postmaster is started with "ulimit -c unlimited", else
the crash might not drop a core file. The core file will normally
appear in $PGDATA, but sometimes in a system-dependent special place
such as /cores/.

Once you've got a core file, do

$ gdb /path/to/postgres-executable /path/to/core-file
gdb> bt
... stack trace ...
gdb> quit

and send the whole output of gdb.
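
The steps above could be scripted along these lines. This is a minimal sketch, not part of the original advice: the function names start_with_cores and core_backtrace are made up for illustration, and the paths passed to them are whatever applies on the machine in question. It uses gdb's batch mode so the backtrace is printed without an interactive session.

```shell
#!/bin/sh
# Sketch only: function names and argument paths are illustrative assumptions.

# Start the postmaster from a subshell whose core-size limit has been
# lifted, so a crashing backend is allowed to write a core file.
# $1 = data directory (normally $PGDATA)
start_with_cores() {
    (
        ulimit -c unlimited
        pg_ctl -D "$1" start
    )
}

# Print the stack trace from a core file non-interactively.
# $1 = postgres executable, $2 = core file
core_backtrace() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "usage: core_backtrace /path/to/postgres /path/to/core" >&2
        return 2
    fi
    # --batch exits after running the commands given with -ex
    gdb --batch -ex bt "$1" "$2"
}
```

Redirecting the output of core_backtrace to a file gives something that can be attached to a reply as-is.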

regards, tom lane

#6Daniele Bortoluzzi
bortoluz@gmail.com
In reply to: Tom Lane (#3)
Re: BUG #4838: Database corruption after btree_gin index creation

[...]

Can you still reproduce this with CVS HEAD,

with CVS HEAD the error is not occurring. Did you fix some GIN bug in
this version?

Thank you for your support

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniele Bortoluzzi (#6)
Re: BUG #4838: Database corruption after btree_gin index creation

Daniele Bortoluzzi <bortoluz@gmail.com> writes:

Can you still reproduce this with CVS HEAD,

with CVS HEAD the error is not occurring. Did you fix some GIN bug in
this version?

Yes, I told you so.
http://archives.postgresql.org/pgsql-committers/2009-06/msg00081.php

But I don't see how that bug would've led to a segfault. Bogus TIDs
in the index should be caught without that.

regards, tom lane