Crash in vacuum analyze

Started by Robert L Mathews over 24 years ago, 12 messages, general
#1 Robert L Mathews
lists@tigertech.com

I'm using PostgreSQL 7.1.2 on a Red Hat 7.1 box.

I have a database where the backend crashes when I do "vacuum analyze".
It does not happen when I do pg_dump or vacuum, and other databases on
the same box work fine with vacuum analyze. Here's the tail end of the
"vacuum verbose analyze" log:

NOTICE: Analyzing...
NOTICE: --Relation document--
NOTICE: Pages 15: Changed 0, reaped 0, Empty 0, New 0; Tup 79: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 72, MaxLen 2028; Re-using:
Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index document_pkey: Pages 2; Tuples 79. CPU 0.00s/0.00u sec.
NOTICE: Index document_unique_region_period: Pages 2; Tuples 79. CPU
0.00s/0.00u sec.
NOTICE: --Relation pg_toast_28441--
NOTICE: Pages 59: Changed 0, reaped 1, Empty 0, New 0; Tup 262: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 2, MinLen 76, MaxLen 2034; Re-using:
Free/Avail. Space 4084/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.01u sec.
NOTICE: Index pg_toast_28441_idx: Pages 2; Tuples 262: Deleted 0. CPU
0.00s/0.00u sec.
NOTICE: Analyzing...
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
connection to server was lost

It happens at the same point each time I try it. I haven't noticed any
particular problems using the database, but this (obviously) worries me.

I've already tried dropping the database, recreating it, and re-importing
it from the pg_dump; the same thing happens.

What else should I try?

--
Robert L Mathews, Tiger Technologies

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert L Mathews (#1)
Re: Crash in vacuum analyze

Robert L Mathews <lists@tigertech.com> writes:

I have a database where the backend crashes when I do "vacuum analyze".

What shows up in the postmaster log? Is a core file produced, and if so
can you provide a stack trace from it? What is the schema of the table
producing the problem ("document", apparently)?

regards, tom lane

#3 Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#2)
Re: Crash in vacuum analyze

There is a bug in the glibc library that causes this. I think there is
some documentation on the list about it.

Tom?

Dave

On Mon, 2001-09-03 at 17:55, Tom Lane wrote:

Robert L Mathews <lists@tigertech.com> writes:

I have a database where the backend crashes when I do "vacuum analyze".

What shows up in the postmaster log? Is a core file produced, and if so
can you provide a stack trace from it? What is the schema of the table
producing the problem ("document", apparently)?

regards, tom lane

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#3)
Re: Crash in vacuum analyze

Dave Cramer <Dave@micro-automation.net> writes:

There is a bug in the glibc library that causes this.

Hmm ... he *could* be suffering from that strcoll() bug, but with no
info about his platform I'm hesitant to jump to that conclusion.

regards, tom lane

#5 Robert L Mathews
lists@tigertech.com
In reply to: Tom Lane (#4)
Re: Crash in vacuum analyze

At 9/3/01 4:11 PM, Tom Lane wrote:

Hmm ... he *could* be suffering from that strcoll() bug, but with no
info about his platform I'm hesitant to jump to that conclusion.

It was indeed the strcoll bug in glibc 2.2.2. The database in question
has some long strings that triggered it.

Upgrading to a later glibc fixed the problem. Thanks for the pointer (and
your time); I appreciate it.

--
Robert L Mathews, Tiger Technologies

#6 Sean Chittenden
sean-pgsql-general@chittenden.org
In reply to: Robert L Mathews (#5)
Re: Crash in vacuum analyze

Hmm ... he *could* be suffering from that strcoll() bug, but with no
info about his platform I'm hesitant to jump to that conclusion.

It was indeed the strcoll bug in glibc 2.2.2. The database in question
has some long strings that triggered it.

Upgrading to a later glibc fixed the problem. Thanks for the pointer (and
your time); I appreciate it.

Can we test for this at configure time and spit out a warning
message to the user that they need to upgrade their version of glibc?
-sc

--
Sean Chittenden

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sean Chittenden (#6)
Re: Crash in vacuum analyze

Sean Chittenden <sean-pgsql-general@chittenden.org> writes:

Can we test for this at configure time and spit out a warning
message to the user that they need to upgrade their version of glibc?

I think most of the people who are getting bitten have installed PG from
RPMs, so configure couldn't help them anyway.

Perhaps our future RPMs should have a dependency that requires glibc
version >= first fixed version.
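
A sketch of what such a dependency might look like in an RPM spec file. It assumes the first fixed release is glibc 2.2.3 (the version named later in this thread) and that the package is simply called glibc; neither detail is confirmed for any actual PostgreSQL RPM.

```
# Hypothetical spec-file fragment; 2.2.3 is assumed to be the first fixed glibc.
Requires: glibc >= 2.2.3
```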

regards, tom lane

#8 Robert L Mathews
lists@tigertech.com
In reply to: Tom Lane (#7)
Re: Crash in vacuum analyze

At 9/3/01 6:00 PM, sean-pgsql-general@chittenden.org wrote:

Upgrading to a later glibc fixed the problem. Thanks for the pointer (and
your time); I appreciate it.

Can we test for this at configure time and spit out a warning
message to the user that they need to upgrade their version of glibc?
-sc

I was using RPMs, so that wouldn't have helped in my case (unless the RPM
script also had such a test).

--
Robert L Mathews, Tiger Technologies

#9 Sean Chittenden
sean-pgsql-general@chittenden.org
In reply to: Tom Lane (#7)
Re: Crash in vacuum analyze

Can we test for this at configure time and spit out a warning
message to the user that they need to upgrade their version of glibc?

I think most of the people who are getting bitten have installed PG from
RPMs, so configure couldn't help them anyway.

::grrr:: I have nothing nice to say about RPMs and the headaches they
have caused me in the past. Is it too late to lobby Linux and/or Red Hat
and ask them for a new package format similar to the ports tree?
::grin:: In any case, Tom, you're right again as always.

Perhaps our future RPMs should have a dependency that requires glibc
version >= first fixed version.

Far be it from me to disagree... was this bug only in Linux's
glibc?

-sc

--
Sean Chittenden

#10 Jeff Boes
jboes@nexcerpt.com
In reply to: Dave Cramer (#3)
Re: Crash in vacuum analyze

In article <999558317.8648.1.camel@inspiron.cramers>, "Dave Cramer"
<Dave@micro-automation.net> wrote:

There is a bug in the glibc library that causes this. I think there is
some documentation on the list about it.

Anybody have a pointer to more info about this? How do I determine if
this affects my system? (I'm having problems similar to this with VACUUM
ANALYZE on one particular, long-row table.)

--
Jeff Boes vox 616.226.9550
Database Engineer fax 616.349.9076
Nexcerpt, Inc. jboes@nexcerpt.com

#11 Robert L Mathews
lists@tigertech.com
In reply to: Jeff Boes (#10)
Re: Crash in vacuum analyze

At 9/6/01 6:34 PM, Jeff Boes <jboes@nexcerpt.com> wrote:

In article <999558317.8648.1.camel@inspiron.cramers>, "Dave Cramer"
<Dave@micro-automation.net> wrote:

There is a bug in the glibc library that causes this. I think there is
some documentation on the list about it.

Anybody have a pointer to more info about this? How do I determine if
this affects my system? (I'm having problems similar to this with VACUUM
ANALYZE on one particular, long-row table.)

That sounds like it's probably it (especially if you can do a normal
vacuum with no trouble). If you're using glibc version 2.2.2 or earlier,
your machine is vulnerable.

The solution is to upgrade to glibc 2.2.3 or later.
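
The version check described above can be sketched as follows. This is an illustration only: it assumes Python is available on the machine, that the C library reports itself through os.confstr("CS_GNU_LIBC_VERSION") (true on glibc systems, not elsewhere), and the helper names parse_glibc_version, is_affected and running_glibc are made up for this example. The 2.2.3 threshold is the one given in this thread.

```python
import os
import re

# Per this thread: strcoll() is broken in glibc versions before 2.2.3.
FIRST_FIXED = (2, 2, 3)


def parse_glibc_version(text):
    """Extract the numeric version from a string such as 'glibc 2.2.2'."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(text):
    """Return True if the reported version is older than the first fixed one."""
    return parse_glibc_version(text) < FIRST_FIXED


def running_glibc():
    """Return the running glibc version string, or None if it is unavailable."""
    try:
        return os.confstr("CS_GNU_LIBC_VERSION")
    except (ValueError, OSError, AttributeError):
        # Not a glibc system, or confstr is not supported on this platform.
        return None


if __name__ == "__main__":
    version = running_glibc()
    if version is None:
        print("Could not determine the glibc version on this system.")
    elif is_affected(version):
        print(f"{version}: older than 2.2.3, upgrade recommended.")
    else:
        print(f"{version}: not in the affected range.")
```

On an RPM-based system, rpm -q glibc reports the installed package version as well. A version outside the affected range only rules out this particular bug, not other causes of a crash.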

--
Robert L Mathews, Tiger Technologies

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Boes (#10)
Re: Crash in vacuum analyze

"Jeff Boes" <jboes@nexcerpt.com> writes:

In article <999558317.8648.1.camel@inspiron.cramers>, "Dave Cramer"
<Dave@micro-automation.net> wrote:

There is a bug in the glibc library that causes this. I think there is
some documentation on the list about it.

Anybody have a pointer to more info about this?

See the thread starting at

http://fts.postgresql.org/db/mw/msg.html?mid=1021209

Bottom line is that strcoll() is broken in glibc versions before 2.2.3.
If you are running a Postgres installation with locale support compiled
in, then you are vulnerable to this bug.

This may or may not explain your particular problem, of course, but
it's a good thing to check before digging further.
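
For anyone who wants to see the code path involved, here is a rough sketch that calls the C library's strcoll() directly on two long strings under the current locale. It is a smoke test under stated assumptions, not a reproduction of the bug: the thread only says that long strings triggered it, so the string length and contents here are guesses, the function name collate_long_strings is made up, and ctypes.CDLL(None) is assumed to expose libc (true on typical Linux systems). On an affected glibc the call might crash the process; a clean run does not prove a system is safe.

```python
import ctypes
import locale


def collate_long_strings(length=5000):
    """Compare two long byte strings with libc's strcoll() and return the result."""
    try:
        # Use the collation rules from the environment, as a locale-enabled
        # server would.
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Fall back to the plain "C" locale if the environment's locale
        # is not installed.
        locale.setlocale(locale.LC_COLLATE, "C")

    libc = ctypes.CDLL(None)  # assumption: resolves to the process's libc
    libc.strcoll.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    libc.strcoll.restype = ctypes.c_int

    # Two strings that share a long common prefix and differ in the last byte.
    left = b"a" * length + b"a"
    right = b"a" * length + b"b"
    return libc.strcoll(left, right)


if __name__ == "__main__":
    result = collate_long_strings()
    print("strcoll result:", result)
```

A negative result means the first string sorts before the second, which is the expected ordering here.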

regards, tom lane