postgres dies while doing vacuum analyze
Guys,
I just installed a new database on my server, and while running vacuum
analyze postgres dies with the following message:
[...]
NOTICE: Index pg_rewrite_oid_index: Pages 2; Tuples 16. CPU 0.00s/0.00u sec.
NOTICE: Index pg_rewrite_rulename_index: Pages 2; Tuples 16. CPU 0.00s/0.00u sec.
NOTICE: --Relation pg_toast_17058--
NOTICE: Pages 4: Changed 0, reaped 0, Empty 0, New 0; Tup 17: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 219, MaxLen 2034; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_17058_idx: Pages 2; Tuples 17. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#
The postgres version is 7.1.2 and the database was initialized with
$ LANG=es_MX /usr/bin/initdb -D /var/lib/pgsql/data -E latin1
It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Here is the back trace from gdb
(gdb) bt
#0 strcoll () at strcoll.c:229
#1 0x081348e7 in varstr_cmp () at eval.c:41
#2 0x0813493f in varstr_cmp () at eval.c:41
#3 0x08134b7c in text_gt () at eval.c:41
#4 0x08148ca2 in FunctionCall2 () at eval.c:41
#5 0x080b3b09 in analyze_rel () at eval.c:41
#6 0x080b3795 in analyze_rel () at eval.c:41
#7 0x080afa76 in vacuum () at eval.c:41
#8 0x080af9c7 in vacuum () at eval.c:41
#9 0x0810a3ca in ProcessUtility () at eval.c:41
#10 0x0810808b in pg_exec_query_string () at eval.c:41
#11 0x081091ce in PostgresMain () at eval.c:41
#12 0x080f208b in PostmasterMain () at eval.c:41
#13 0x080f1c45 in PostmasterMain () at eval.c:41
#14 0x080f0d0c in PostmasterMain () at eval.c:41
#15 0x080f0684 in PostmasterMain () at eval.c:41
#16 0x080cf3c8 in main () at eval.c:41
#17 0x401e2177 in __libc_start_main (main=0x80cf260 <main>, argc=3, ubp_av=0xbffffa7c, init=0x8065c20 <_init>,
fini=0x8154bb0 <_fini>, rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffffa6c) at ../sysdeps/generic/libc-start.c:129
(gdb)
Seems like a problem with my locale settings. The
strange thing is that postgres dies while analyzing a system
table; however I'm able to vacuum my tables individually:
$ for t in `psql dep dep -c '\dt' -t -A | cut -d\| -f1`; do psql dep -c "vacuum analyze $t"; done
Any ideas?
best regards,
Manuel.
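The trace above shows text_gt calling varstr_cmp, which in turn calls strcoll(). A minimal sketch of that kind of call path follows, assuming length-counted (not NUL-terminated) inputs that are copied into terminated buffers before the locale-aware comparison. The names counted_strcoll and counted_gt are hypothetical, and this is not the PostgreSQL source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative stand-in for a locale-aware text comparison (hypothetical
 * name; not the PostgreSQL source). The inputs are length-counted and not
 * NUL-terminated, so each is copied into a terminated buffer before
 * strcoll() is called.
 */
static int counted_strcoll(const char *a, size_t alen,
                           const char *b, size_t blen)
{
    char *a0 = malloc(alen + 1);
    char *b0 = malloc(blen + 1);
    int result;

    /* Sketch only: no recovery path on allocation failure. */
    assert(a0 != NULL && b0 != NULL);

    memcpy(a0, a, alen);
    a0[alen] = '\0';
    memcpy(b0, b, blen);
    b0[blen] = '\0';

    /* Collation order comes from the current LC_COLLATE setting. */
    result = strcoll(a0, b0);

    free(a0);
    free(b0);
    return result;
}

/* "Greater than" built on the comparison, in the role text_gt has in the trace. */
static int counted_gt(const char *a, size_t alen,
                      const char *b, size_t blen)
{
    return counted_strcoll(a, alen, b, blen) > 0;
}
```

With this shape, any crash inside strcoll() surfaces in whatever statement happens to compare text values, which would be consistent with the failure appearing during the analyze step.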
Manuel Sugawara wrote:
> Just installed a new data base in my server and while running vacuum
> analyze postgres dies with the following message:
> [...]
> NOTICE: Analyzing...
> pqReadData() -- backend closed the channel unexpectedly.
> [...]
> The postgres version is 7.1.2 and the data base was initialized with
> $ LANG=es_MX /usr/bin/initdb -D /var/lib/pgsql/data -E latin1
> It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
Try the 2.4.5 kernel. I had the same problem on SuSE 7.1 with a 2.4.2 kernel;
since updating, no more problems.
Manuel Sugawara <masm@fciencias.unam.mx> writes:
> [ vacuum analyze dies ]
> It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
> Here is the back trace from gdb
> (gdb) bt
> #0 strcoll () at strcoll.c:229
We've heard reports before of strcoll() crashing on apparently valid
input. It seems to be a Red Hat-specific problem; the three reports
I have in my notes are from people running RH 7.0 (check the archives
from 1/1/01, 1/24/01, 3/1/01 if you want to see the prior reports).
It's possible that Postgres is doing something that confuses RH's
locale library, but I dunno what. Since no other platform is reporting
it, it could also be a plain old bug in that locale library.
We need some RH-er to burrow in with a debugger and figure out what's
going wrong. The previous reporters don't seem to have done anything;
are you the man to fix it?
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Manuel Sugawara <masm@fciencias.unam.mx> writes:
>> [ vacuum analyze dies ]
>> It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
>> Here is the back trace from gdb
>> (gdb) bt
>> #0 strcoll () at strcoll.c:229
> We've heard reports before of strcoll() crashing on apparently valid
> input.
We haven't AFAIK, but would be very interested if it can be reproduced.
--
Trond Eivind Glomsrød
Red Hat, Inc.
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Manuel Sugawara <masm@fciencias.unam.mx> writes:
>> [ vacuum analyze dies ]
>> It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
>> Here is the back trace from gdb
>> (gdb) bt
>> #0 strcoll () at strcoll.c:229
> We've heard reports before of strcoll() crashing on apparently valid
> input. It seems to be a Red Hat-specific problem; the three reports
> I have in my notes are from people running RH 7.0 (check the archives
> from 1/1/01, 1/24/01, 3/1/01 if you want to see the prior reports).
> It's possible that Postgres is doing something that confuses RH's
> locale library, but I dunno what. Since no other platform is reporting
> it, it could also be a plain old bug in that locale library.
After a look into strcoll I found the bug. Attached is a tarball
including a patch for strcoll, glibc.spec and a small program that
shows the bug. Hopefully Trond can pass this on to the glibc and rpm
experts.
best regards,
Manuel.
Manuel Sugawara <masm@fciencias.unam.mx> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Manuel Sugawara <masm@fciencias.unam.mx> writes:
>>> [ vacuum analyze dies ]
>>> It is running on Redhat Linux 7.1 i686 with 2.4.2-2 kernel.
>>> Here is the back trace from gdb
>>> (gdb) bt
>>> #0 strcoll () at strcoll.c:229
>> We've heard reports before of strcoll() crashing on apparently valid
>> input. It seems to be a Red Hat-specific problem; the three reports
>> I have in my notes are from people running RH 7.0 (check the archives
>> from 1/1/01, 1/24/01, 3/1/01 if you want to see the prior reports).
>> It's possible that Postgres is doing something that confuses RH's
>> locale library, but I dunno what. Since no other platform is reporting
>> it, it could also be a plain old bug in that locale library.
> After a look into strcoll I found the bug. Attached is a tarball
> including a patch for strcoll, glibc.spec and an small program that
> shows the bug.
Will do... what is the expected result of the testcase? It seems to
work alright for me, but I'm running a slightly newer version than we
have released yet... (glibc-2.2.3-11, look in rawhide).
--
Trond Eivind Glomsrød
Red Hat, Inc.
teg@redhat.com (Trond Eivind Glomsrød) writes:
> Will do... what is the expected result of the testcase?
Given a sufficiently large discrepancy between the string lengths,
a core dump is the likely result. Try increasing the "16k" numbers
if it doesn't crash for you.
Good work, Manuel! I'm surprised this hasn't been found before, because
you'd think it'd be biting lots of people ...
regards, tom lane
teg@redhat.com (Trond Eivind Glomsrød) writes:
> Will do... what is the expected result of the testcase? It seems to
> work alright for me, but I'm running a slightly newer version than we
> have released yet... (glibc-2.2.3-11, look in rawhide).
A core dump, at least on glibc-2.2.2-10. Try with some locale
other than C or POSIX.
masm@dep1$ LC_COLLATE=es_MX ./strcoll-bug
es_MX
zsh: 25041 segmentation fault (core dumped) LC_COLLATE=es_MX ./strcoll-bug
masm@dep1$ LC_COLLATE=C ./strcoll-bug
C
strcoll returned -1
masm@dep1$
regards,
Manuel.
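The test program itself was an attachment and is not reproduced in the thread. What follows is a minimal sketch of the kind of test case described above, assuming two runs of the same character with very different lengths compared under the collation taken from the environment. The function names and the lengths are illustrative, not taken from the attached strcoll-bug source.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a NUL-terminated string of `len` copies of `c`. */
static char *make_run(size_t len, char c)
{
    char *s = malloc(len + 1);

    if (s == NULL)
        return NULL;
    memset(s, c, len);
    s[len] = '\0';
    return s;
}

/*
 * Select collation from the environment (LC_ALL, LC_COLLATE, LANG).
 * Returns the locale name now in effect, or a marker if setlocale() failed.
 */
static const char *use_env_collation(void)
{
    const char *name = setlocale(LC_COLLATE, "");

    return name != NULL ? name : "(setlocale failed)";
}

/*
 * Compare a short string against a much longer one with strcoll().
 * Returns the sign of the result (-1, 0 or 1), or 2 if allocation failed.
 */
static int compare_lengths(size_t short_len, size_t long_len)
{
    char *a;
    char *b;
    int r;

    assert(short_len <= long_len);

    a = make_run(short_len, 'a');
    b = make_run(long_len, 'a');
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return 2;
    }

    r = strcoll(a, b);

    free(a);
    free(b);
    return (r > 0) - (r < 0);
}
```

Calling use_env_collation() and then compare_lengths(16, 16 * 1024) under LC_COLLATE=es_MX would exercise the same path as the run above. On a glibc that contains the fix, the call should simply return a negative value, since the shorter string is a prefix of the longer one.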
Manuel Sugawara <masm@fciencias.unam.mx> writes:
> teg@redhat.com (Trond Eivind Glomsrød) writes:
>> Will do... what is the expected result of the testcase? It seems to
>> work alright for me, but I'm running a slightly newer version than we
>> have released yet... (glibc-2.2.3-11, look in rawhide).
> a core dump, at least on glibc-2.2.2-10. Try with some locale
> different than C or POSIX.
> masm@dep1$ LC_COLLATE=es_MX ./strcoll-bug
> es_MX
> zsh: 25041 segmentation fault (core dumped) LC_COLLATE=es_MX ./strcoll-bug
> masm@dep1$ LC_COLLATE=C ./strcoll-bug
> C
> strcoll returned -1
> masm@dep1$
OK, this works with my system - no coredump, correct results. I'll
take a look at the glibc sources to verify that, but it looks like
this was fixed by drepper@redhat.com and included in glibc 2.2.3:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=36539
--
Trond Eivind Glomsrød
Red Hat, Inc.
teg@redhat.com (Trond Eivind Glomsrød) writes:
> [...]
> OK, this works with my system - no coredump, correct results. I'll
> take a look at the glibc sources to verify that, but it looks like
> this was fixed by drepper@redhat.com and included in glibc 2.2.3:
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=36539
Yes, it is already fixed in glibc-2.2.3. Is it safe to install this
version on my 7.1 systems, or should I use my own rpms?
regards,
Manuel.
On 16 Jun 2001, Manuel Sugawara wrote:
> teg@redhat.com (Trond Eivind Glomsrød) writes:
>> [...]
>> OK, this works with my system - no coredump, correct results. I'll
>> take a look at the glibc sources to verify that, but it looks like
>> this was fixed by drepper@redhat.com and included in glibc 2.2.3:
>> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=36539
> yes, is already fixed on glibc-2.2.3. It's safe to install this
> version on my 7.1 systems
The 2.2.3-11 should be safe; we would be very interested to hear
otherwise.
--
Trond Eivind Glomsrød
Red Hat, Inc.