Thread: How to debug this crash?

How to debug this crash?

From: Jorge Godoy

Hi!


I've updated my system and after importing some old data I started getting
this message:

================================================================================
*** glibc detected *** postgres: godoy neo localhost(34476) SELECT: double free or corruption (out): 0x08494440 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7b4f6e1]
/lib/libc.so.6(cfree+0x89)[0xb7b50d79]
/usr/lib/postgresql/plpython.so[0xb7ee3944]
/usr/lib/libpython2.5.so.1.0[0xb6b8d85a]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalFrameEx+0x55dc)[0xb6be80ac]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalCodeEx+0x7c4)[0xb6be9734]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalCode+0x63)[0xb6be97b3]
/usr/lib/postgresql/plpython.so[0xb7ee4dfe]
/usr/lib/postgresql/plpython.so[0xb7ee5c97]
/usr/lib/postgresql/plpython.so(plpython_call_handler+0xf6)[0xb7ee7016]
postgres: godoy neo localhost(34476) SELECT(ExecMakeFunctionResult+0xec)[0x813813c]
postgres: godoy neo localhost(34476) SELECT(ExecProject+0x1c6)[0x81364e6]
postgres: godoy neo localhost(34476) SELECT(ExecNestLoop+0x127)[0x8143f37]
postgres: godoy neo localhost(34476) SELECT(ExecProcNode+0x130)[0x8135c40]
postgres: godoy neo localhost(34476) SELECT(ExecutorRun+0x30b)[0x8134f4b]
postgres: godoy neo localhost(34476) SELECT[0x81b8e50]
postgres: godoy neo localhost(34476) SELECT(PortalRun+0x198)[0x81b9aa8]
postgres: godoy neo localhost(34476) SELECT[0x81b58cc]
postgres: godoy neo localhost(34476) SELECT(PostgresMain+0x1481)[0x81b7491]
postgres: godoy neo localhost(34476) SELECT[0x818f92a]
postgres: godoy neo localhost(34476) SELECT(PostmasterMain+0xc5b)[0x8190aeb]
postgres: godoy neo localhost(34476) SELECT(main+0x249)[0x8153f89]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb7b00f9c]
postgres: godoy neo localhost(34476) SELECT[0x8078ce1]
======= Memory map: ========
08048000-082d6000 r-xp 00000000 03:05 1037143    /usr/bin/postgres
082d6000-082e0000 rw-p 0028d000 03:05 1037143    /usr/bin/postgres
082e0000-08590000 rw-p 082e0000 00:00 0          [heap]
b6a00000-b6a21000 rw-p b6a00000 00:00 0
b6a21000-b6b00000 ---p b6a21000 00:00 0
b6b4e000-b6c3e000 r-xp 00000000 03:05 945257     /usr/lib/libpython2.5.so.1.0
b6c3e000-b6c64000 rw-p 000ef000 03:05 945257     /usr/lib/libpython2.5.so.1.0
b6c64000-b6cec000 rw-p b6c64000 00:00 0
b6cec000-b6d09000 r-xp 00000000 03:05 1404258    /usr/lib/postgresql/plpgsql.so
b6d09000-b6d0b000 rw-p 0001d000 03:05 1404258    /usr/lib/postgresql/plpgsql.so
b6d4c000-b6dcd000 rw-p b6d4c000 00:00 0
b6dcd000-b6dd7000 r-xp 00000000 03:05 651576     /lib/libgcc_s.so.1
b6dd7000-b6dd9000 rw-p 00009000 03:05 651576     /lib/libgcc_s.so.1
b6dd9000-b6dee000 r--p 00000000 03:05 1030633    /usr/share/locale/pt_BR/LC_MESSAGES/libc.mo
b6dee000-b6e50000 rw-p b6dee000 00:00 0
b6e50000-b784c000 rw-s 00000000 00:08 9666560    /SYSV0052e2c1 (deleted)
b784c000-b7892000 r--p 00000000 03:05 1037190    /usr/share/locale/pt_BR/LC_MESSAGES/postgres.mo
b7892000-b78c7000 r--s 00000000 03:05 392441     /var/run/nscd/db0rBNlN (deleted)
b78c7000-b7902000 r--p 00000000 03:05 1062284    /usr/lib/locale/pt_BR.utf8/LC_CTYPE
b7902000-b79d9000 r--p 00000000 03:05 1062285    /usr/lib/locale/pt_BR.utf8/LC_COLLATE
b79d9000-b7a0e000 r--s 00000000 03:05 392440     /var/run/nscd/group
b7a0e000-b7a43000 r--s 00000000 03:05 392433     /var/run/nscd/passwd
b7a43000-b7a45000 rw-p b7a43000 00:00 0
b7a45000-b7a59000 r-xp 00000000 03:05 651558     /lib/libpthread-2.5.so
b7a59000-b7a5b000 rw-p 00013000 03:05 651558     /lib/libpthread-2.5.so
b7a5b000-b7a5d000 rw-p b7a5b000 00:00 0
b7a5d000-b7a98000 r-xp 00000000 03:05 651584     /lib/libncurses.so.5.5
b7a98000-b7a9f000 r--p 0003a000 03:05 651584     /lib/libncurses.so.5.5
b7a9f000-b7aa4000 rw-p 00041000 03:05 651584     /lib/libncurses.so.5.5
b7aa4000-b7aab000 r-xp 00000000 03:05 1032787    /usr/lib/libkrb5support.so.0.1
b7aab000-b7aad000 rw-p 00006000 03:05 1032787    /usr/lib/libkrb5support.so.0.1
b7aad000-b7aae000 rw-p b7aad000 00:00 0
b7aae000-b7ad2000 r-xp 00000000 03:05 1032775    /usr/lib/libk5crypto.so.3.0
b7ad2000-b7ad4000 rw-p 00023000 03:05 1032775    /usr/lib/libk5crypto.so.3.0
b7ad4000-b7ae5000 r-xp 00000000 03:05 651569     /lib/libaudit.so.0.0.0
b7ae5000-b7ae7000 rw-p 00010000 03
================================================================================

(The last line is truncated like that in the log itself; I believe some
 information is missing.)
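
One way to narrow down a trace like the one above is to list which shared
libraries show up in the frames around the failing free(). A minimal sketch in
Python, assuming the glibc backtrace has been saved as text; the helper names
`libraries_in_backtrace` and `python_version_from_trace` are made up for
illustration and are not part of any tool:

```python
import re

# Matches frames such as
#   /usr/lib/libpython2.5.so.1.0(PyEval_EvalCode+0x63)[0xb6be97b3]
# i.e. an absolute path, an optional (symbol+offset), and an [0x...] address.
FRAME_RE = re.compile(r"^(/\S+?)(?:\([^)]*\))?\[0x[0-9a-fA-F]+\]$")


def libraries_in_backtrace(text):
    """Return shared-object paths in first-seen order (hypothetical helper)."""
    seen = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def python_version_from_trace(text):
    """Guess the libpython version from the library file name, or None."""
    match = re.search(r"libpython(\d+\.\d+)\.so", text)
    return match.group(1) if match else None


# First frames of the backtrace quoted above.
sample = """\
/lib/libc.so.6[0xb7b4f6e1]
/lib/libc.so.6(cfree+0x89)[0xb7b50d79]
/usr/lib/postgresql/plpython.so[0xb7ee3944]
/usr/lib/libpython2.5.so.1.0[0xb6b8d85a]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalFrameEx+0x55dc)[0xb6be80ac]
"""

print(libraries_in_backtrace(sample))
print(python_version_from_trace(sample))
```

Frames that belong to the postgres executable itself do not start with a path,
so this sketch skips them; it only shows which loaded libraries are involved.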



This happens when running this view:

================================================================================
CREATE OR REPLACE VIEW neolab.v_resultado_amostra_analise AS
    SELECT vrr.amostra_analise_id AS id, vrr.id AS amostra_id,
        ar.liberado AS is_liberado, ar.alterado_em,
        an.metodologia, an.ibmp, an.valores_referencia, an.nome AS analise,
        neolab.f_v_formata_valor_resultado(
            (SELECT o_resultado
            FROM neolab.f_v_resultado_amostra_analise(vrr.amostra_analise_id)),
        an.id) AS resultado,
        (SELECT o_is_limite_quantificacao
         FROM neolab.f_v_resultado_amostra_analise(vrr.amostra_analise_id))
        AS is_limite_quantificacao
    FROM neolab.v_resultados_resumo vrr
    LEFT JOIN neolab.amostras_resultados ar ON ar.amostra_analise_id = vrr.amostra_analise_id
    JOIN neolab.amostras_analises aa ON aa.id = vrr.amostra_analise_id
    JOIN neolab.analises an ON an.id = aa.analise_id;
================================================================================

where the functions called are:

================================================================================
CREATE OR REPLACE FUNCTION neolab.f_v_formata_valor_resultado(
    p_resultado FLOAT, p_analise_id INTEGER,
    OUT o_resultado TEXT)
AS $_$
    p_resultado = args[0]
    p_analise_id = args[1]

    precisao=plpy.execute('''
        SELECT precisao FROM neolab.analises
        WHERE id=%s''' % p_analise_id)
    unidade=plpy.execute('''
        SELECT u.simbolo FROM neolab.unidades u
        JOIN neolab.analises a ON a.unidade_relatorio_id = u.id
        WHERE a.id = %s''' % p_analise_id)
    formato='%%.%sf %s' % (precisao[0]['precisao'], unidade[0]['simbolo'])
    return ((formato % p_resultado).replace('.', ','))
$_$ LANGUAGE plpythonu STABLE STRICT;

-- -----------------------------------------------------------------------------
-- Sample output:
-- -----------------------------------------------------------------------------
--
-- neo=# select * from neolab.f_v_formata_valor_resultado(3.23456, 53);
--  o_resultado
-- ----------------
--  3,23 mg/g crea
-- (1 registro)
--
-- neo=# select * from neolab.f_v_formata_valor_resultado(3.23456, 59);
--  o_resultado
-- -------------
--  3,23 g/L
-- (1 registro)


CREATE OR REPLACE FUNCTION neolab.f_v_resultado_amostra_analise(
    p_amostra_analise_id INTEGER,
    OUT o_resultado FLOAT, OUT o_is_limite_quantificacao BOOL) AS $_$
DECLARE
    w_resultado FLOAT;
    w_analise_id INTEGER;
    w_limite_quantificacao FLOAT;
BEGIN
    o_is_limite_quantificacao:=FALSE;
    w_resultado:=resultado_calculado FROM neolab.amostras_resultados
        WHERE amostra_analise_id=p_amostra_analise_id;
    w_analise_id:=analise_id FROM neolab.amostras_analises
        WHERE id=p_amostra_analise_id;
    w_limite_quantificacao:=limite_quantificacao FROM neolab.analises
        WHERE id=w_analise_id;
    IF w_resultado < w_limite_quantificacao THEN
        o_is_limite_quantificacao:=TRUE;
        w_resultado:=w_limite_quantificacao;
    END IF;
    o_resultado:=w_resultado;
END;
$_$ LANGUAGE plpgsql STRICT STABLE;

================================================================================
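
The number formatting done by f_v_formata_valor_resultado can be tried outside
the database. A minimal sketch in plain Python, assuming the precision and the
unit symbol have already been fetched; the function name `formata_valor` and
its parameters are made up for illustration, and the precision of 2 is only
inferred from the sample output above. Unlike the original, it applies the
decimal comma to the number alone, so a unit symbol containing '.' or '%'
passes through untouched:

```python
def formata_valor(resultado, precisao, simbolo):
    """Format a float with `precisao` decimals, a decimal comma, and a unit.

    Hypothetical stand-in for the body of f_v_formata_valor_resultado,
    without the plpy.execute() lookups.
    """
    # "%.*f" takes the precision from the argument tuple.
    numero = "%.*f" % (precisao, resultado)
    # Swap the decimal point for a comma on the number only.
    return numero.replace(".", ",") + " " + simbolo


# Values taken from the sample output; precision 2 is an assumption.
print(formata_valor(3.23456, 2, "mg/g crea"))
print(formata_valor(3.23456, 2, "g/L"))
```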



Sometimes I can query the offending view repeatedly without problems, but once
it crashes I have to restart the machine before I can run it again.

On psql I get:

================================================================================
neo=# select * from neolab.v_resultado_amostra_analise ;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q
================================================================================



This is with:

OpenSuSE 10.2
postgresql-server-8.1.5-13
postgresql-libs-8.1.5-13
postgresql-docs-8.1.5-13
postgresql-devel-8.1.5-13
postgresql-8.1.5-13
postgresql-pl-8.1.5-15

jupiter:/var/lib/pgsql/data/pg_log # psql -V
psql (PostgreSQL) 8.1.5
contains support for command-line editing


Linux jupiter 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 i686 i686 i386 GNU/Linux


processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Mobile Intel(R) Pentium(R) 4 CPU 3.06GHz
stepping        : 1
cpu MHz         : 1867.000
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc up pni monitor ds_cpl est tm2 cid xtpr
bogomips        : 6137.03


(HT is disabled)



So, my question is: where do I start debugging this? :-)



Thanks for any help,
--
Jorge Godoy      <jgodoy@gmail.com>

Re: How to debug this crash?

From: Tom Lane

Jorge Godoy <jgodoy@gmail.com> writes:
> This is with:

> OpenSuSE 10.2
> postgresql-server-8.1.5-13
> postgresql-libs-8.1.5-13
> postgresql-docs-8.1.5-13
> postgresql-devel-8.1.5-13
> postgresql-8.1.5-13
> postgresql-pl-8.1.5-15

What python version?  (Hint: pre-8.2 plpython is known not to work
with python 2.5)

            regards, tom lane
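
The hint above can be written down as a small check. A minimal sketch in
Python that encodes only what the hint states (plpython before 8.2 together
with Python 2.5); the function name `known_bad_combination` is made up for
illustration, and the version tuples have to be supplied by hand. Inside a
plpythonu function body, the same `sys.version_info` lookup should report the
interpreter the server was linked against:

```python
import sys


def known_bad_combination(pg_version, py_version):
    """True for PostgreSQL before 8.2 paired with Python 2.5.

    Hypothetical helper; it encodes only the hint from this thread and
    says nothing about any other version pair.
    """
    return tuple(pg_version[:2]) < (8, 2) and tuple(py_version[:2]) == (2, 5)


# The setup reported in the first message, then an 8.2 server.
print(known_bad_combination((8, 1, 5), (2, 5)))
print(known_bad_combination((8, 2, 0), (2, 5)))

# Version of the interpreter running this script.
print("%d.%d" % sys.version_info[:2])
```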

Re: How to debug this crash?

From: Jorge Godoy

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Jorge Godoy <jgodoy@gmail.com> writes:
>> This is with:
>
>> OpenSuSE 10.2
>> postgresql-server-8.1.5-13
>> postgresql-libs-8.1.5-13
>> postgresql-docs-8.1.5-13
>> postgresql-devel-8.1.5-13
>> postgresql-8.1.5-13
>> postgresql-pl-8.1.5-15
>
> What python version?  (Hint: pre-8.2 plpython is known not to work
> with python 2.5)

Bingo! ;-)

I'll upgrade to 8.2.


Be seeing you,
--
Jorge Godoy      <jgodoy@gmail.com>

Re: How to debug this crash?

From: Jorge Godoy

Tom Lane <tgl@sss.pgh.pa.us> writes:

> What python version?  (Hint: pre-8.2 plpython is known not to work
> with python 2.5)

This is more to confirm what I've found in practice and couldn't find in the
online docs for 8.2: is it possible to use output variables to write stored
procedures in plpythonu, or am I restricted to functions only?

For a function where I declared an output parameter I'm getting the following
message:



================================================================================
neo=# select neolab.f_v_formata_valor_resultado(3.23456, 53);
ERRO:  proargnames must have the same number of elements as the function has arguments
neo=#
================================================================================



The code for this function was provided in my first message; its signature
is:




================================================================================
neo=# \df neolab.f_v_formata_valor_resultado
                                                         List of functions
 Schema |            Name             | Result data type |                           Argument data types
--------+-----------------------------+------------------+--------------------------------------------------------------------------
 neolab | f_v_formata_valor_resultado | text             | p_resultado double precision, p_analise_id integer, OUT o_resultado text
(1 row)

neo=#
================================================================================


This worked with Python 2.4 and PG 8.1.4 / 8.1.5.



I'm getting this error with PostgreSQL 8.2.3 and Python 2.5.  Is it a bug, or
is it correct behavior?  (If this is correct behavior, it would be very nice
to get the 'out' parameter working as it did before, and also to be able to
reference variables directly without resorting to the args array...)


TIA,
--
Jorge Godoy      <jgodoy@gmail.com>