Thread: Crash during elog.c...

From
"Jim C. Nasby"
Date:
My client (same one with the slru.c issue) has had 3 of these in the
past day...

The backtrace:
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0  0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
#1  0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
#2  0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
#3  0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125
#4  0x00000000004ff746 in appendStringInfo (str=0x7fbfffde30, fmt=0x65f59e "%s") at stringinfo.c:75
#5  0x00000000005d3a26 in log_line_prefix (buf=0x7fbfffde30) at elog.c:1425
#6  0x00000000005d4beb in EmitErrorReport () at elog.c:1465
#7  0x00000000005d4345 in errfinish (dummy=Variable "dummy" is not available.
) at elog.c:382
#8  0x000000000056625f in exec_simple_query (query_string=0x89e760 "update summary_clicks set clicks = t.clicks, impressions= t.impressions, dollars = t.dollars from pending_summary_clicks_2005_11_02 t where summary_clicks.listingindex= t.listingindex and summary_cl"...) at postgres.c:1030
#9  0x0000000000567bb3 in PostgresMain (argc=4, argv=0x846380, username=0x846350 "iacm") at postgres.c:3007
#10 0x000000000053acf0 in ServerLoop () at postmaster.c:2836
#11 0x000000000053c3f4 in PostmasterMain (argc=5, argv=0x843530) at postmaster.c:918
#12 0x000000000050806f in main (argc=5, argv=0x843530) at main.c:268
(gdb) f 3
#3  0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125
125             nprinted = vsnprintf(str->data + str->len, avail, fmt, args);
(gdb) print *str
$39 = {data = 0x848030 "2005-11-04 00:01:02 EST|2005-11-04 00:00:08 EST|216.187.113.78(39476)|didit|", len = 76, maxlen = 256, cursor = 0}

Assertions are enabled, but the memory-checking code is commented out
for performance reasons.

The good news is there have been no slru.c assertion failures...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Crash during elog.c...

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> The backtrace:
> Program terminated with signal 11, Segmentation fault.
> (gdb) bt
> #0  0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
> #1  0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
> #2  0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
> #3  0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125

Hrm ... what's the platform again?
        regards, tom lane


Re: Crash during elog.c...

From
"Jim C. Nasby"
Date:
On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > The backtrace:
> > Program terminated with signal 11, Segmentation fault.
> > (gdb) bt
> > #0  0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
> > #1  0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
> > #2  0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
> > #3  0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125
> 
> Hrm ... what's the platform again?

8-way Opteron, RHEL4.

BTW, should I be opening bugs for things like this? I guess I probably
should...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Crash during elog.c...

From
Bruce Momjian
Date:
Jim C. Nasby wrote:
> On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:
> > "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > > The backtrace:
> > > Program terminated with signal 11, Segmentation fault.
> > > (gdb) bt
> > > #0  0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
> > > #1  0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
> > > #2  0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
> > > #3  0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125
> > 
> > Hrm ... what's the platform again?
> 
> 8-way Opteron, RHEL4.
> 
> BTW, should I be opening bugs for things like this? I guess I probably
> should...

Nope, reporting it here is fine.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Crash during elog.c...

From
"Jim C. Nasby"
Date:
On Fri, Nov 04, 2005 at 04:34:35PM -0500, Bruce Momjian wrote:
> Jim C. Nasby wrote:
> > On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:
> > > "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > > > The backtrace:
> > > > Program terminated with signal 11, Segmentation fault.
> > > > (gdb) bt
> > > > #0  0x0000003b8946fb20 in strlen () from /lib64/tls/libc.so.6
> > > > #1  0x0000003b894428dc in vfprintf () from /lib64/tls/libc.so.6
> > > > #2  0x0000003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
> > > > #3  0x00000000004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e "%s", args=0x7fbfffdb50) at stringinfo.c:125
> > > 
> > > Hrm ... what's the platform again?
> > 
> > 8-way Opteron, RHEL4.
> > 
> > BTW, should I be opening bugs for things like this? I guess I probably
> > should...
> 
> Nope, reporting it here is fine.

I'm soon to be AFK all weekend... is there any more info anyone wanted
about this?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Crash during elog.c...

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> My client (same one with the slru.c issue) has had 3 of these in the
> past day...

> (gdb) print *str
> $39 = {data = 0x848030 "2005-11-04 00:01:02 EST|2005-11-04 00:00:08 EST|216.187.113.78(39476)|didit|", len = 76,
>   maxlen = 256, cursor = 0}

Um, what's your log_line_prefix setting, and is the next format code
%i by any chance?  I've just noticed an utterly brain-dead assumption
somebody stuck into ps_status.c awhile back.
        regards, tom lane
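
The backtrace in this thread dies in strlen() while vsnprintf() is
expanding a "%s" argument, which is the usual symptom of a pointer that is
NULL, stale, or not NUL-terminated. As a minimal sketch of one defensive
pattern (this is not the patch mentioned in the thread and not PostgreSQL
code), the hypothetical helper format_status_field() below takes an
explicit length and formats with "%.*s", so the formatter reads at most
that many bytes instead of scanning for a terminator. The helper name, its
parameters, and the trailing '|' separator are assumptions made for
illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not PostgreSQL code).
 *
 * Copies a status field into 'out', followed by a '|' separator.
 * 'src' may be NULL and does not need to be NUL-terminated; only the
 * first 'src_len' bytes are ever read.
 *
 * Returns the number of bytes stored in 'out', excluding the terminator.
 */
static size_t
format_status_field(char *out, size_t out_size,
                    const char *src, size_t src_len)
{
    int         n;

    assert(out != NULL && out_size > 0);

    if (src == NULL)
        src_len = 0;            /* treat a missing string as empty */
    if (src_len > INT_MAX)
        src_len = INT_MAX;      /* the precision argument is an int */

    /*
     * With an explicit precision, "%.*s" writes no more than src_len
     * bytes, so the source needs no terminating NUL within that range.
     * Plain "%s" would keep reading until it happened to find one.
     */
    n = snprintf(out, out_size, "%.*s|", (int) src_len, src ? src : "");
    if (n < 0)
    {
        out[0] = '\0';          /* formatting error: return empty string */
        return 0;
    }

    /* snprintf reports the untruncated length; clamp to what was stored */
    return ((size_t) n < out_size) ? (size_t) n : out_size - 1;
}
```

The design choice here is to carry the length alongside the pointer rather
than trusting the buffer to be terminated; a caller that only has a
pointer would still need some other way to learn a safe length.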


Re: Crash during elog.c...

From
"Jim C. Nasby"
Date:
On Fri, Nov 04, 2005 at 08:06:39PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > My client (same one with the slru.c issue) has had 3 of these in the
> > past day...
> 
> > (gdb) print *str
> > $39 = {data = 0x848030 "2005-11-04 00:01:02 EST|2005-11-04 00:00:08 EST|216.187.113.78(39476)|didit|", len = 76,
> >   maxlen = 256, cursor = 0}
> 
> Um, what's your log_line_prefix setting, and is the next format code
> %i by any chance?  I've just noticed an utterly brain-dead assumption
> somebody stuck into ps_status.c awhile back.

log_line_prefix = '%t|%s|%r|%d|%i|%p'

So yeah, looks like %i is next. I recall seeing something about %i in
the backtrace or something else related to this, but I can't find it
now.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Crash during elog.c...

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Fri, Nov 04, 2005 at 08:06:39PM -0500, Tom Lane wrote:
>> Um, what's your log_line_prefix setting, and is the next format code
>> %i by any chance?  I've just noticed an utterly brain-dead assumption
>> somebody stuck into ps_status.c awhile back.

> log_line_prefix = '%t|%s|%r|%d|%i|%p'

> So yeah, looks like %i is next.

The quickest way to get rid of the crash will be to remove %i, then.
If you don't want to do that, see the patch I committed to CVS.
        regards, tom lane
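
For the setting quoted in this thread, the workaround amounts to dropping
the %i escape and its separator from log_line_prefix. A sketch of the
resulting postgresql.conf line, assuming the remaining fields should stay
in their original order:

```
# before: log_line_prefix = '%t|%s|%r|%d|%i|%p'
log_line_prefix = '%t|%s|%r|%d|%p'
```

A configuration reload should be enough for the new prefix to take effect;
log lines will then omit the command tag that %i used to supply.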