postgres dies while doing vacuum analyze

Started by Manuel Sugawara over 24 years ago, 11 messages
#1Manuel Sugawara
masm@fciencias.unam.mx

Guys,

I just installed a new database on my server, and while running vacuum
analyze, postgres dies with the following message:

[...]
NOTICE: Index pg_rewrite_oid_index: Pages 2; Tuples 16. CPU 0.00s/0.00u sec.
NOTICE: Index pg_rewrite_rulename_index: Pages 2; Tuples 16. CPU 0.00s/0.00u sec.
NOTICE: --Relation pg_toast_17058--
NOTICE: Pages 4: Changed 0, reaped 0, Empty 0, New 0; Tup 17: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 219, MaxLen 2034; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_17058_idx: Pages 2; Tuples 17. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#

The postgres version is 7.1.2 and the database was initialized with:

$ LANG=es_MX /usr/bin/initdb -D /var/lib/pgsql/data -E latin1

It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb

(gdb) bt
#0 strcoll () at strcoll.c:229
#1 0x081348e7 in varstr_cmp () at eval.c:41
#2 0x0813493f in varstr_cmp () at eval.c:41
#3 0x08134b7c in text_gt () at eval.c:41
#4 0x08148ca2 in FunctionCall2 () at eval.c:41
#5 0x080b3b09 in analyze_rel () at eval.c:41
#6 0x080b3795 in analyze_rel () at eval.c:41
#7 0x080afa76 in vacuum () at eval.c:41
#8 0x080af9c7 in vacuum () at eval.c:41
#9 0x0810a3ca in ProcessUtility () at eval.c:41
#10 0x0810808b in pg_exec_query_string () at eval.c:41
#11 0x081091ce in PostgresMain () at eval.c:41
#12 0x080f208b in PostmasterMain () at eval.c:41
#13 0x080f1c45 in PostmasterMain () at eval.c:41
#14 0x080f0d0c in PostmasterMain () at eval.c:41
#15 0x080f0684 in PostmasterMain () at eval.c:41
#16 0x080cf3c8 in main () at eval.c:41
#17 0x401e2177 in __libc_start_main (main=0x80cf260 <main>, argc=3, ubp_av=0xbffffa7c, init=0x8065c20 <_init>,
fini=0x8154bb0 <_fini>, rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffffa6c) at ../sysdeps/generic/libc-start.c:129
(gdb)

Seems like a problem with my locale settings. The
strange thing is that postgres dies while analyzing a system
table; however I'm able to vacuum my tables individually:

$ for t in `psql dep dep -c '\dt' -t -A | cut -d\| -f1`; do psql dep -c "vacuum analyze $t"; done

Any ideas?

best regards,
Manuel.

#2mordicus
mordicus@free.fr
In reply to: Manuel Sugawara (#1)
Re: postgres dies while doing vacuum analyze

Manuel Sugawara wrote:

Guys,

I just installed a new database on my server, and while running vacuum
analyze, postgres dies with the following message:

[...]
NOTICE: Index pg_rewrite_oid_index: Pages 2; Tuples 16. CPU 0.00s/0.00u
sec.
NOTICE: Index pg_rewrite_rulename_index: Pages 2; Tuples 16. CPU
0.00s/0.00u sec.
NOTICE: --Relation pg_toast_17058--
NOTICE: Pages 4: Changed 0, reaped 0, Empty 0, New 0; Tup 17: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 219, MaxLen 2034; Re-using:
Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_17058_idx: Pages 2; Tuples 17. CPU 0.00s/0.00u
sec.
NOTICE: Analyzing...
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#

The postgres version is 7.1.2 and the database was initialized with:

$ LANG=es_MX /usr/bin/initdb -D /var/lib/pgsql/data -E latin1

It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb

Try the 2.4.5 kernel. I had the same problem on SuSE 7.1 with the 2.4.2
kernel; since the update, no more problems.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Manuel Sugawara (#1)
Re: postgres dies while doing vacuum analyze

Manuel Sugawara <masm@fciencias.unam.mx> writes:

[ vacuum analyze dies ]
It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb

(gdb) bt
#0 strcoll () at strcoll.c:229

We've heard reports before of strcoll() crashing on apparently valid
input. It seems to be a Red Hat-specific problem; the three reports
I have in my notes are from people running RH 7.0 (check the archives
from 1/1/01, 1/24/01, 3/1/01 if you want to see the prior reports).

It's possible that Postgres is doing something that confuses RH's
locale library, but I dunno what. Since no other platform is reporting
it, it could also be a plain old bug in that locale library.

We need some RH-er to burrow in with a debugger and figure out what's
going wrong. The previous reporters don't seem to have done anything;
are you the man to fix it?

regards, tom lane

#4Noname
teg@redhat.com
In reply to: Tom Lane (#3)
Re: postgres dies while doing vacuum analyze

Tom Lane <tgl@sss.pgh.pa.us> writes:

Manuel Sugawara <masm@fciencias.unam.mx> writes:

[ vacuum analyze dies ]
It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb

(gdb) bt
#0 strcoll () at strcoll.c:229

We've heard reports before of strcoll() crashing on apparently valid
input.

We haven't AFAIK, but would be very interested if it can be reproduced.

--
Trond Eivind Glomsrød
Red Hat, Inc.

#5Manuel Sugawara
masm@fciencias.unam.mx
In reply to: Manuel Sugawara (#1)
1 attachment(s)
Re: postgres dies while doing vacuum analyze

Tom Lane <tgl@sss.pgh.pa.us> writes:

Manuel Sugawara <masm@fciencias.unam.mx> writes:

[ vacuum analyze dies ]
It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb

(gdb) bt
#0 strcoll () at strcoll.c:229

We've heard reports before of strcoll() crashing on apparently valid
input. It seems to be a Red Hat-specific problem; the three reports
I have in my notes are from people running RH 7.0 (check the archives
from 1/1/01, 1/24/01, 3/1/01 if you want to see the prior reports).

It's possible that Postgres is doing something that confuses RH's
locale library, but I dunno what. Since no other platform is reporting
it, it could also be a plain old bug in that locale library.

After a look into strcoll I found the bug. Attached is a tarball
including a patch for strcoll, glibc.spec and a small program that
shows the bug. Hopefully Trond can forward this to the glibc and rpm
experts.

best regards,
Manuel.


We need some RH-er to burrow in with a debugger and figure out what's
going wrong. The previous reporters don't seem to have done anything;
are you the man to fix it?

regards, tom lane

Attachments:

strcoll.tar.gz (application/octet-stream)

#6Noname
teg@redhat.com
In reply to: Manuel Sugawara (#5)
Re: postgres dies while doing vacuum analyze

Manuel Sugawara <masm@fciencias.unam.mx> writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

Manuel Sugawara <masm@fciencias.unam.mx> writes:

[ vacuum analyze dies ]
It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb

(gdb) bt
#0 strcoll () at strcoll.c:229

We've heard reports before of strcoll() crashing on apparently valid
input. It seems to be a Red Hat-specific problem; the three reports
I have in my notes are from people running RH 7.0 (check the archives
from 1/1/01, 1/24/01, 3/1/01 if you want to see the prior reports).

It's possible that Postgres is doing something that confuses RH's
locale library, but I dunno what. Since no other platform is reporting
it, it could also be a plain old bug in that locale library.

After a look into strcoll I found the bug. Attached is a tarball
including a patch for strcoll, glibc.spec and a small program that
shows the bug.

Will do... what is the expected result of the testcase? It seems to
work alright for me, but I'm running a slightly newer version than we
have released yet... (glibc-2.2.3-11, look in rawhide).

--
Trond Eivind Glomsrød
Red Hat, Inc.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#6)
Re: postgres dies while doing vacuum analyze

teg@redhat.com (Trond Eivind Glomsrød) writes:

Will do... what is the expected result of the testcase?

Given a sufficiently large discrepancy between the string lengths,
a core dump is the likely result. Try increasing the "16k" numbers
if it doesn't crash for you.

Good work, Manuel! I'm surprised this hasn't been found before, because
you'd think it'd be biting lots of people ...

regards, tom lane

#8Manuel Sugawara
masm@fciencias.unam.mx
In reply to: Manuel Sugawara (#1)
Re: postgres dies while doing vacuum analyze

teg@redhat.com (Trond Eivind Glomsrød) writes:

Will do... what is the expected result of the testcase? It seems to
work alright for me, but I'm running a slightly newer version than we
have released yet... (glibc-2.2.3-11, look in rawhide).

A core dump, at least on glibc-2.2.2-10. Try it with a locale
other than C or POSIX.

masm@dep1$ LC_COLLATE=es_MX ./strcoll-bug
es_MX
zsh: 25041 segmentation fault (core dumped) LC_COLLATE=es_MX ./strcoll-bug
masm@dep1$ LC_COLLATE=C ./strcoll-bug
C
strcoll returned -1
masm@dep1$

regards,
Manuel.


--
Trond Eivind Glomsrød
Red Hat, Inc.

#9Noname
teg@redhat.com
In reply to: Manuel Sugawara (#8)
Re: postgres dies while doing vacuum analyze

Manuel Sugawara <masm@fciencias.unam.mx> writes:

teg@redhat.com (Trond Eivind Glomsrød) writes:

Will do... what is the expected result of the testcase? It seems to
work alright for me, but I'm running a slightly newer version than we
have released yet... (glibc-2.2.3-11, look in rawhide).

A core dump, at least on glibc-2.2.2-10. Try it with a locale
other than C or POSIX.

masm@dep1$ LC_COLLATE=es_MX ./strcoll-bug
es_MX
zsh: 25041 segmentation fault (core dumped) LC_COLLATE=es_MX ./strcoll-bug
masm@dep1$ LC_COLLATE=C ./strcoll-bug
C
strcoll returned -1
masm@dep1$

OK, this works with my system - no coredump, correct results. I'll
take a look at the glibc sources to verify that, but it looks like
this was fixed by drepper@redhat.com and included in glibc 2.2.3:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=36539

--
Trond Eivind Glomsrød
Red Hat, Inc.

#10Manuel Sugawara
masm@fciencias.unam.mx
In reply to: Manuel Sugawara (#1)
Re: postgres dies while doing vacuum analyze

teg@redhat.com (Trond Eivind Glomsrød) writes:

[...]

OK, this works with my system - no coredump, correct results. I'll
take a look at the glibc sources to verify that, but it looks like
this was fixed by drepper@redhat.com and included in glibc 2.2.3:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=36539

Yes, it is already fixed in glibc-2.2.3. Is it safe to install this
version on my 7.1 systems, or should I use my rpms?

regards,
Manuel.


--
Trond Eivind Glomsrød
Red Hat, Inc.

#11Trond Eivind Glomsrød
In reply to: Manuel Sugawara (#10)
Re: postgres dies while doing vacuum analyze

On 16 Jun 2001, Manuel Sugawara wrote:

teg@redhat.com (Trond Eivind Glomsrød) writes:

[...]

OK, this works with my system - no coredump, correct results. I'll
take a look at the glibc sources to verify that, but it looks like
this was fixed by drepper@redhat.com and included in glibc 2.2.3:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=36539

Yes, it is already fixed in glibc-2.2.3. Is it safe to install this
version on my 7.1 systems

The 2.2.3-11 should be safe; we would be very interested to hear
otherwise.

--
Trond Eivind Glomsrød
Red Hat, Inc.