Crash during elog.c...

Started by Jim C. Nasby about 20 years ago, 8 messages
#1 Jim C. Nasby
jnasby@pervasive.com

My client (same one with the slru.c issue) has had 3 of these in the
past day...

The backtrace:
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0 0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
#1 0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
#2 0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
#3 0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125
#4 0x00000000004ff746 in appendStringInfo (str=0x7fbfffde30, fmt=0x65f59e "%s") at stringinfo.c:75
#5 0x00000000005d3a26 in log_line_prefix (buf=0x7fbfffde30) at elog.c:1425
#6 0x00000000005d4beb in EmitErrorReport () at elog.c:1465
#7 0x00000000005d4345 in errfinish (dummy=Variable "dummy" is not available.
) at elog.c:382
#8 0x000000000056625f in exec_simple_query (
query_string=0x89e760 "update summary_clicks set clicks = t.clicks, impressions = t.impressions, dollars = t.dollars from pending_summary_clicks_2005_11_02 t where summary_clicks.listingindex = t.listingindex and summary_cl"...) at postgres.c:1030
#9 0x0000000000567bb3 in PostgresMain (argc=4, argv=0x846380, username=0x846350 "iacm") at postgres.c:3007
#10 0x000000000053acf0 in ServerLoop () at postmaster.c:2836
#11 0x000000000053c3f4 in PostmasterMain (argc=5, argv=0x843530) at postmaster.c:918
#12 0x000000000050806f in main (argc=5, argv=0x843530) at main.c:268
(gdb) f 3
#3 0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50)
at stringinfo.c:125
125 nprinted = vsnprintf(str->data + str->len, avail, fmt, args);
(gdb) print *str
$39 = {data = 0x848030 "2005-11-04 00:01:02 EST|2005-11-04 00:00:08 EST|216.187.113.78(39476)|didit|", len = 76,
maxlen = 256, cursor = 0}
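
A backtrace that ends in strlen() called from vsnprintf() for a "%s" conversion usually means the string argument was NULL, pointed at invalid memory, or was not NUL-terminated. The trace above does not show which of these applied. As a general illustration only, here is a small C sketch of two defensive patterns; the helper names safe_str and append_bounded are hypothetical and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not PostgreSQL code): substitute a visible
 * placeholder for NULL so that "%s" never receives a NULL pointer. */
static const char *
safe_str(const char *s)
{
    return s ? s : "[unknown]";
}

/* Hypothetical helper: append at most `len` bytes of `s` to the
 * NUL-terminated string already in `buf`.  With "%.*s" the library reads
 * no more than `len` bytes, so `s` does not need a terminating NUL.
 * Returns snprintf's result, or -1 if the arguments are unusable. */
static int
append_bounded(char *buf, size_t bufsize, const char *s, int len)
{
    size_t used;

    assert(buf != NULL);
    used = strlen(buf);
    if (s == NULL || len < 0 || used >= bufsize)
        return -1;
    return snprintf(buf + used, bufsize - used, "%.*s", len, s);
}
```

Passing an explicit length is the more robust of the two when the source buffer is managed elsewhere and may not be terminated; the NULL guard only covers the unset-pointer case.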

Asserts are on, but for performance reasons the memory checking stuff is
commented out.

The good news is there's been no slru.c asserts...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim C. Nasby (#1)
Re: Crash during elog.c...

"Jim C. Nasby" <jnasby@pervasive.com> writes:

> The backtrace:
> Program terminated with signal 11, Segmentation fault.
> (gdb) bt
> #0 0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
> #1 0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
> #2 0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
> #3 0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125

Hrm ... what's the platform again?

regards, tom lane

#3 Jim C. Nasby
jnasby@pervasive.com
In reply to: Tom Lane (#2)
Re: Crash during elog.c...

On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:

> "Jim C. Nasby" <jnasby@pervasive.com> writes:

>> The backtrace:
>> Program terminated with signal 11, Segmentation fault.
>> (gdb) bt
>> #0 0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
>> #1 0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
>> #2 0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
>> #3 0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125

> Hrm ... what's the platform again?

8-way opteron, RHEL4.

BTW, should I be opening bugs for things like this? I guess I probably
should...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#4 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Jim C. Nasby (#3)
Re: Crash during elog.c...

Jim C. Nasby wrote:

> On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:

>> "Jim C. Nasby" <jnasby@pervasive.com> writes:

>>> The backtrace:
>>> Program terminated with signal 11, Segmentation fault.
>>> (gdb) bt
>>> #0 0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
>>> #1 0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
>>> #2 0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
>>> #3 0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125

>> Hrm ... what's the platform again?

> 8-way opteron, RHEL4.

> BTW, should I be opening bugs for things like this? I guess I probably
> should...

Nope, reporting it here is fine.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#5 Jim C. Nasby
jnasby@pervasive.com
In reply to: Bruce Momjian (#4)
Re: Crash during elog.c...

On Fri, Nov 04, 2005 at 04:34:35PM -0500, Bruce Momjian wrote:

> Jim C. Nasby wrote:

>> On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:

>>> "Jim C. Nasby" <jnasby@pervasive.com> writes:

>>>> The backtrace:
>>>> Program terminated with signal 11, Segmentation fault.
>>>> (gdb) bt
>>>> #0 0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
>>>> #1 0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
>>>> #2 0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
>>>> #3 0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125

>>> Hrm ... what's the platform again?

>> 8-way opteron, RHEL4.

>> BTW, should I be opening bugs for things like this? I guess I probably
>> should...

> Nope, reporting it here is fine.

I'm soon to be AFK all weekend... is there any more info anyone wanted
about this?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim C. Nasby (#1)
Re: Crash during elog.c...

"Jim C. Nasby" <jnasby@pervasive.com> writes:

> My client (same one with the slru.c issue) has had 3 of these in the
> past day...

> (gdb) print *str
> $39 = {data = 0x848030 "2005-11-04 00:01:02 EST|2005-11-04 00:00:08 EST|216.187.113.78(39476)|didit|", len = 76,
> maxlen = 256, cursor = 0}

Um, what's your log_line_prefix setting, and is the next format code
%i by any chance? I've just noticed an utterly brain-dead assumption
somebody stuck into ps_status.c awhile back.

regards, tom lane

#7 Jim C. Nasby
jnasby@pervasive.com
In reply to: Tom Lane (#6)
Re: Crash during elog.c...

On Fri, Nov 04, 2005 at 08:06:39PM -0500, Tom Lane wrote:

> "Jim C. Nasby" <jnasby@pervasive.com> writes:

>> My client (same one with the slru.c issue) has had 3 of these in the
>> past day...

>> (gdb) print *str
>> $39 = {data = 0x848030 "2005-11-04 00:01:02 EST|2005-11-04 00:00:08 EST|216.187.113.78(39476)|didit|", len = 76,
>> maxlen = 256, cursor = 0}

> Um, what's your log_line_prefix setting, and is the next format code
> %i by any chance? I've just noticed an utterly brain-dead assumption
> somebody stuck into ps_status.c awhile back.

log_line_prefix = '%t|%s|%r|%d|%i|%p'

So yeah, looks like %i is next. I recall seeing something about %i in
the backtrace or something else related to this, but I can't find it
now.
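
To make the reasoning concrete: the buffer in the gdb output stops at "didit|", which is the %d field plus its separator, so whatever was being appended when the crash happened belongs to the next escape, %i. The toy expander below is only a sketch of that idea. ToySession and toy_prefix are hypothetical names, the escape set is reduced to %d, %i and %p, and none of this is PostgreSQL's actual log_line_prefix() code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical session state for this toy; string fields may be NULL. */
typedef struct
{
    const char *database;       /* %d */
    const char *command_tag;    /* %i, may be unset */
    int         pid;            /* %p */
} ToySession;

/* Expand a log_line_prefix-style format into out (capacity outsz).
 * Unset string fields are skipped rather than handed to "%s". */
static void
toy_prefix(char *out, size_t outsz, const char *fmt, const ToySession *s)
{
    size_t used = 0;

    if (outsz == 0)
        return;
    out[0] = '\0';

    for (const char *p = fmt; *p != '\0' && used + 1 < outsz; p++)
    {
        int n = 0;

        if (*p != '%')
        {
            /* literal character, e.g. the '|' separators */
            out[used++] = *p;
            out[used] = '\0';
            continue;
        }
        p++;                    /* move to the escape letter */
        if (*p == '\0')
            break;              /* trailing lone '%' */

        switch (*p)
        {
            case 'd':
                if (s->database)
                    n = snprintf(out + used, outsz - used, "%s", s->database);
                break;
            case 'i':
                if (s->command_tag)     /* the guard this sketch is about */
                    n = snprintf(out + used, outsz - used, "%s", s->command_tag);
                break;
            case 'p':
                n = snprintf(out + used, outsz - used, "%d", s->pid);
                break;
            default:
                break;          /* unknown escapes are ignored here */
        }
        if (n > 0)
        {
            used += (size_t) n;
            if (used >= outsz)
                used = outsz - 1;   /* output was truncated */
        }
    }
}
```

With a format of "%d|%i|%p" and no command tag set, this sketch produces an empty field between the separators instead of dereferencing a bad pointer.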
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim C. Nasby (#7)
Re: Crash during elog.c...

"Jim C. Nasby" <jnasby@pervasive.com> writes:

> On Fri, Nov 04, 2005 at 08:06:39PM -0500, Tom Lane wrote:

>> Um, what's your log_line_prefix setting, and is the next format code
>> %i by any chance? I've just noticed an utterly brain-dead assumption
>> somebody stuck into ps_status.c awhile back.

> log_line_prefix = '%t|%s|%r|%d|%i|%p'

> So yeah, looks like %i is next.

The quickest way to get rid of the crash will be to remove %i, then.
If you don't want to do that, see the patch I committed to CVS.
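
Applied to the setting quoted above, the first option would presumably look like the following in postgresql.conf. This is a sketch that drops %i together with its '|' separator; the rest of the prefix is unchanged.

```
# before: log_line_prefix = '%t|%s|%r|%d|%i|%p'
log_line_prefix = '%t|%s|%r|%d|%p'
```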

regards, tom lane