Thread: BUG #3901: Received error

BUG #3901: Received error

From
"Chris Hoover"
Date:
The following bug has been logged online:

Bug reference:      3901
Logged by:          Chris Hoover
Email address:      revoohc@gmail.com
PostgreSQL version: 8.3.rc2
Operating system:   RHES 4.0
Description:        Received error
Details:

I just received this error on my command line.  I think this came from
postgres as it is the only app running on this server.  I am not sure what
caused this since the server was idle.

TRAP: BadArgument("!(((header->context) != ((void *)0) &&
(((((Node*)((header->context)))->type) == T_AllocSetContext))))",
File: "mcxt.c", Line: 589)

Re: BUG #3901: Received error

From
Tom Lane
Date:
"Chris Hoover" <revoohc@gmail.com> writes:
> I just received this error on my command line.  I think this came from
> postgres as it is the only app running on this server.  I am not sure what
> caused this since the server was idle.

> TRAP: BadArgument("!(((header->context) != ((void *)0) &&
> (((((Node*)((header->context)))->type) == T_AllocSetContext))))",
> File: "mcxt.c", Line: 589)

This is an Assert failure, but there's not enough info here to guess
what caused it.  Did it produce a core file in $PGDATA?

            regards, tom lane
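
[Editor's sketch] The condition in the TRAP text tests two things about a
chunk header: that its context pointer is non-null, and that the thing it
points at is tagged as an AllocSetContext.  The backtrace later in the thread
shows the check firing inside pfree.  The toy model below illustrates that
kind of check under simplifying assumptions.  It is not PostgreSQL's code:
every toy_* name and the header layout are hypothetical, and a failed check
returns -1 where an assert-enabled server would abort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's real definitions. */
enum toy_node_tag { TOY_INVALID = 0, TOY_ALLOC_SET_CONTEXT = 1 };

typedef struct toy_context {
    enum toy_node_tag type;     /* tags the struct as a memory context */
} toy_context;

typedef struct toy_chunk_header {
    toy_context *context;       /* owning context, stored before the payload */
    size_t       size;          /* payload size in bytes */
} toy_chunk_header;

/* Allocate a payload with a header in front of it that names its context. */
void *toy_alloc(toy_context *cxt, size_t size)
{
    toy_chunk_header *h = malloc(sizeof(*h) + size);

    if (h == NULL)
        return NULL;
    h->context = cxt;
    h->size = size;
    return h + 1;               /* caller sees only the payload */
}

/* Same shape as the failed condition: non-null context with the right tag. */
int toy_chunk_is_valid(const void *pointer)
{
    const toy_chunk_header *h = (const toy_chunk_header *) pointer - 1;

    return h->context != NULL && h->context->type == TOY_ALLOC_SET_CONTEXT;
}

/* Free a payload; returns 0 on success, -1 if the header check fails. */
int toy_free(void *pointer)
{
    assert(pointer != NULL);
    if (!toy_chunk_is_valid(pointer))
        return -1;              /* an assert-enabled build would abort here */
    free((toy_chunk_header *) pointer - 1);
    return 0;
}
```

In this model, a pointer that did not come from toy_alloc, or whose header no
longer names a valid context, is rejected before anything is released.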

Re: BUG #3901: Received error

From
"Gurjeet Singh"
Date:
It sure looks like a Postgres Assert failure, and mcxt.c is Postgres's memory
management module.

    Can you please figure out, and try to reproduce, what you (or your app)
were doing when this happened?

Best regards,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB      http://www.enterprisedb.com

17° 29' 34.37"N,   78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N,   73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *

http://gurjeet.frihost.net

Mail sent from my BlackLaptop device

Re: BUG #3901: Received error

From
"Chris Hoover"
Date:
On Jan 24, 2008 9:39 PM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:

> It sure looks like a Postgres Assert failure, and mcxt.c is Postgres's
> memory management module.
>
>     Can you please figure out, and try to reproduce, what you (or your app)
> were doing when this happened?


Unfortunately, I can't find anything more for this error.  There are no core
files, and no unexpected error entries in the postgres log files (running
syslog and csvlog).  I really wish I could provide more, but I just went to
that terminal, and that line was there.

If you have any other ideas on where/what to look for, I'd be glad to try
and help.

Thanks

--
Come see how to SAVE money on fuel, decrease harmful emissions, and even
make MONEY.  Visit http://colafuelguy.mybpi.com and join the revolution!

Re: BUG #3901: Received error

From
Alvaro Herrera
Date:
Chris Hoover wrote:

> Unfortunately, I can't find anything more for this error.  There are no core
> files, and no unexpected error entries in the postgres log files (running
> syslog and csvlog).  I really wish I could provide more, but I just went to
> that terminal, and that line was there.
>
> If you have any other ideas on where/what to look for, I'd be glad to try
> and help.

Please set up your postmaster environment so that it will dump core the next
time this happens.  Usually this is a matter of adding a "ulimit -c
unlimited" line somewhere in the init script that starts it.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #3901: Received error

From
Tom Lane
Date:
"Chris Hoover" <revoohc@gmail.com> writes:
> In the noon log, I have this entry:
> 2008-01-25 12:00:00.063 EST,,,3040,,479a01f1.be0,2,,2008-01-25 10:36:17
> EST,,0,LOG,00000,"system logger process (PID 3041) was terminated by signal
> 6: Aborted",,,,,,,,

> In the 1400 log, I have this entry:
> 2008-01-25 14:00:00.898 EST,,,3040,,479a01f1.be0,3,,2008-01-25 10:36:17
> EST,,0,LOG,00000,"system logger process (PID 3391) was terminated by signal
> 6: Aborted",,,,,,,,

System logger, eh?  What are all your non-default logging parameter
settings?

The corefiles are not going to be useful to anyone else with a different
setup.  Please use gdb to get a backtrace from them yourself.

            regards, tom lane

Re: BUG #3901: Received error

From
"Chris Hoover"
Date:
Ok, I have not done much debugging.  How do I go about getting the backtrace
for you?


Here are the logging settings from my config:
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - Where to Log -

log_destination = 'syslog,csvlog'       # Valid values are combinations of
logging_collector = on                  # Enable capturing of stderr and csvlog
log_directory = 'pg_log'                # directory where log files are written,
log_truncate_on_rotation = on           # If on, an existing log file of the
log_rotation_age = 1h                   # Automatic rotation of logfiles will
log_rotation_size = 0                   # Automatic rotation of logfiles will
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
log_min_duration_statement = 0  # -1 is disabled, 0 logs all statements
log_checkpoints = on
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%d,%p,%u,%r,%p,%m,%c,%l,%s,%v,%x,%i: '
log_lock_waits = on                     # log lock waits >= deadlock_timeout
log_statement = 'all'                   # none, ddl, mod, all

Thanks,

Chris


Re: BUG #3901: Received error

From
"Chris Hoover"
Date:
Ok, I think I figured out how to get the backtrace (praise google!)

core.3391

#0  0x006497a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x006497a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x006897f5 in raise () from /lib/tls/libc.so.6
#2  0x0068b199 in abort () from /lib/tls/libc.so.6
#3  0x082c8fa6 in ExceptionalCondition (
    conditionName=0x83bc514 "!(((header->context) != ((void *)0) && (((((Node*)((header->context)))->type) == T_AllocSetContext))))",
    errorType=0x82f89d2 "BadArgument", fileName=0xd3f "", lineNumber=0)
    at assert.c:57
#4  0x082e58ce in pfree (pointer=0x89cde14) at mcxt.c:589
#5  0x081f9620 in SysLogger_Start () at syslogger.c:1172
#6  0x081f6bd1 in reaper (postgres_signal_arg=17) at postmaster.c:2283
#7  <signal handler called>
#8  0x006497a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#9  0x007223dd in ___newselect_nocancel () from /lib/tls/libc.so.6
#10 0x081f5477 in ServerLoop () at postmaster.c:1234
#11 0x081f79d3 in PostmasterMain (argc=3, argv=0x89b7508) at postmaster.c:1029
#12 0x081aa10c in main (argc=3, argv=0x89b7508) at main.c:188


core.3041

#0  0x006497a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x006497a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x006897f5 in raise () from /lib/tls/libc.so.6
#2  0x0068b199 in abort () from /lib/tls/libc.so.6
#3  0x082c8fa6 in ExceptionalCondition (
    conditionName=0x83bc514 "!(((header->context) != ((void *)0) && (((((Node*)((header->context)))->type) == T_AllocSetContext))))",
    errorType=0x82f89d2 "BadArgument", fileName=0xbe1 "", lineNumber=0)
    at assert.c:57
#4  0x082e58ce in pfree (pointer=0x89cde14) at mcxt.c:589
#5  0x081f9620 in SysLogger_Start () at syslogger.c:1172
#6  0x081f796c in PostmasterMain (argc=3, argv=0x89b7508) at postmaster.c:993
#7  0x081aa10c in main (argc=3, argv=0x89b7508) at main.c:188



Let me know if you need more.

HTH,

Chris




Re: BUG #3901: Received error

From
Tom Lane
Date:
"Chris Hoover" <revoohc@gmail.com> writes:
> #3  0x082c8fa6 in ExceptionalCondition (
>     conditionName=0x83bc514 "!(((header->context) != ((void *)0) &&
> (((((Node*)((header->context)))->type) == T_AllocSetContext))))",
> errorType=0x82f89d2 "BadArgument", fileName=0xd3f "", lineNumber=0) at
> assert.c:57
> #4  0x082e58ce in pfree (pointer=0x89cde14) at mcxt.c:589
> #5  0x081f9620 in SysLogger_Start () at syslogger.c:1172

Ah-hah: copy-and-paste-o.  This chunk of code is assigning the
wrong thing to last_csvfile_name.

        /* instead of pfree'ing filename, remember it for next time */
        if (last_csvfile_name != NULL)
            pfree(last_csvfile_name);
        last_csvfile_name = filename;
    }

This is a bit distressing in terms of the apparent lack of developer
testing on the CSV code; this should have been caught by anyone who'd
used CSV logging with --enable-cassert for any length of time.

            regards, tom lane
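
[Editor's sketch] The message above says only that the wrong value is
assigned to last_csvfile_name.  A plausible reading, and an assumption here,
is that the CSV file's own name was meant, so that the same pointer ends up
remembered twice and is freed twice on the next rotation.  That would fit the
logger dying at rotation times in the quoted log entries.  The toy model
below illustrates that reading.  The names csvfilename, last_file_name, and
all toy_* helpers are hypothetical, and the allocator counts double frees
instead of aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_SLOTS    32
#define TOY_NAME_LEN 64

/* Toy allocator: each slot records whether it is still live, so a second
 * free of the same pointer can be counted safely. Slots are never reused. */
typedef struct { int live; char name[TOY_NAME_LEN]; } toy_slot;

static toy_slot toy_pool[TOY_SLOTS];
static int toy_next_slot;
static int toy_double_frees;

/* Names remembered from the previous rotation (assumed variable names). */
static char *last_file_name;
static char *last_csvfile_name;

/* Build a log file name in a fresh slot. */
char *toy_make_name(int seq, const char *suffix)
{
    assert(toy_next_slot < TOY_SLOTS);
    toy_slot *slot = &toy_pool[toy_next_slot++];

    slot->live = 1;
    snprintf(slot->name, TOY_NAME_LEN, "postgresql-%d.%s", seq, suffix);
    return slot->name;
}

/* Release a name; count it if that slot was already released. */
void toy_name_free(char *p)
{
    for (int i = 0; i < toy_next_slot; i++) {
        if (toy_pool[i].name == p) {
            if (toy_pool[i].live)
                toy_pool[i].live = 0;
            else
                toy_double_frees++;
            return;
        }
    }
}

/* One rotation; buggy != 0 reproduces the assignment quoted in the thread. */
void toy_rotate(int seq, int buggy)
{
    char *filename = toy_make_name(seq, "log");
    char *csvfilename = toy_make_name(seq, "csv");

    if (last_file_name != NULL)
        toy_name_free(last_file_name);
    last_file_name = filename;

    if (last_csvfile_name != NULL)
        toy_name_free(last_csvfile_name);
    /* buggy: remembers filename a second time; assumed fix: csvfilename */
    last_csvfile_name = buggy ? filename : csvfilename;
}

/* Run several rotations from a clean state; returns the double-free count. */
int toy_run(int rotations, int buggy)
{
    assert(rotations * 2 <= TOY_SLOTS);
    toy_next_slot = 0;
    toy_double_frees = 0;
    last_file_name = last_csvfile_name = NULL;
    for (int i = 1; i <= rotations; i++)
        toy_rotate(i, buggy);
    return toy_double_frees;
}
```

In this model toy_run(2, 1) reports one double free, on the second rotation,
while toy_run(2, 0) reports none.  The first rotation passes either way
because nothing has been remembered yet.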