How to debug this crash? - Mailing list pgsql-general

From Jorge Godoy
Subject How to debug this crash?
Msg-id 87odnfr7od.fsf@gmail.com
Responses Re: How to debug this crash?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi!


I've updated my system and after importing some old data I started getting
this message:

================================================================================
*** glibc detected *** postgres: godoy neo localhost(34476) SELECT: double free or corruption (out): 0x08494440 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7b4f6e1]
/lib/libc.so.6(cfree+0x89)[0xb7b50d79]
/usr/lib/postgresql/plpython.so[0xb7ee3944]
/usr/lib/libpython2.5.so.1.0[0xb6b8d85a]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalFrameEx+0x55dc)[0xb6be80ac]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalCodeEx+0x7c4)[0xb6be9734]
/usr/lib/libpython2.5.so.1.0(PyEval_EvalCode+0x63)[0xb6be97b3]
/usr/lib/postgresql/plpython.so[0xb7ee4dfe]
/usr/lib/postgresql/plpython.so[0xb7ee5c97]
/usr/lib/postgresql/plpython.so(plpython_call_handler+0xf6)[0xb7ee7016]
postgres: godoy neo localhost(34476) SELECT(ExecMakeFunctionResult+0xec)[0x813813c]
postgres: godoy neo localhost(34476) SELECT(ExecProject+0x1c6)[0x81364e6]
postgres: godoy neo localhost(34476) SELECT(ExecNestLoop+0x127)[0x8143f37]
postgres: godoy neo localhost(34476) SELECT(ExecProcNode+0x130)[0x8135c40]
postgres: godoy neo localhost(34476) SELECT(ExecutorRun+0x30b)[0x8134f4b]
postgres: godoy neo localhost(34476) SELECT[0x81b8e50]
postgres: godoy neo localhost(34476) SELECT(PortalRun+0x198)[0x81b9aa8]
postgres: godoy neo localhost(34476) SELECT[0x81b58cc]
postgres: godoy neo localhost(34476) SELECT(PostgresMain+0x1481)[0x81b7491]
postgres: godoy neo localhost(34476) SELECT[0x818f92a]
postgres: godoy neo localhost(34476) SELECT(PostmasterMain+0xc5b)[0x8190aeb]
postgres: godoy neo localhost(34476) SELECT(main+0x249)[0x8153f89]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb7b00f9c]
postgres: godoy neo localhost(34476) SELECT[0x8078ce1]
======= Memory map: ========
08048000-082d6000 r-xp 00000000 03:05 1037143    /usr/bin/postgres
082d6000-082e0000 rw-p 0028d000 03:05 1037143    /usr/bin/postgres
082e0000-08590000 rw-p 082e0000 00:00 0          [heap]
b6a00000-b6a21000 rw-p b6a00000 00:00 0
b6a21000-b6b00000 ---p b6a21000 00:00 0
b6b4e000-b6c3e000 r-xp 00000000 03:05 945257     /usr/lib/libpython2.5.so.1.0
b6c3e000-b6c64000 rw-p 000ef000 03:05 945257     /usr/lib/libpython2.5.so.1.0
b6c64000-b6cec000 rw-p b6c64000 00:00 0
b6cec000-b6d09000 r-xp 00000000 03:05 1404258    /usr/lib/postgresql/plpgsql.so
b6d09000-b6d0b000 rw-p 0001d000 03:05 1404258    /usr/lib/postgresql/plpgsql.so
b6d4c000-b6dcd000 rw-p b6d4c000 00:00 0
b6dcd000-b6dd7000 r-xp 00000000 03:05 651576     /lib/libgcc_s.so.1
b6dd7000-b6dd9000 rw-p 00009000 03:05 651576     /lib/libgcc_s.so.1
b6dd9000-b6dee000 r--p 00000000 03:05 1030633    /usr/share/locale/pt_BR/LC_MESSAGES/libc.mo
b6dee000-b6e50000 rw-p b6dee000 00:00 0
b6e50000-b784c000 rw-s 00000000 00:08 9666560    /SYSV0052e2c1 (deleted)
b784c000-b7892000 r--p 00000000 03:05 1037190    /usr/share/locale/pt_BR/LC_MESSAGES/postgres.mo
b7892000-b78c7000 r--s 00000000 03:05 392441     /var/run/nscd/db0rBNlN (deleted)
b78c7000-b7902000 r--p 00000000 03:05 1062284    /usr/lib/locale/pt_BR.utf8/LC_CTYPE
b7902000-b79d9000 r--p 00000000 03:05 1062285    /usr/lib/locale/pt_BR.utf8/LC_COLLATE
b79d9000-b7a0e000 r--s 00000000 03:05 392440     /var/run/nscd/group
b7a0e000-b7a43000 r--s 00000000 03:05 392433     /var/run/nscd/passwd
b7a43000-b7a45000 rw-p b7a43000 00:00 0
b7a45000-b7a59000 r-xp 00000000 03:05 651558     /lib/libpthread-2.5.so
b7a59000-b7a5b000 rw-p 00013000 03:05 651558     /lib/libpthread-2.5.so
b7a5b000-b7a5d000 rw-p b7a5b000 00:00 0
b7a5d000-b7a98000 r-xp 00000000 03:05 651584     /lib/libncurses.so.5.5
b7a98000-b7a9f000 r--p 0003a000 03:05 651584     /lib/libncurses.so.5.5
b7a9f000-b7aa4000 rw-p 00041000 03:05 651584     /lib/libncurses.so.5.5
b7aa4000-b7aab000 r-xp 00000000 03:05 1032787    /usr/lib/libkrb5support.so.0.1
b7aab000-b7aad000 rw-p 00006000 03:05 1032787    /usr/lib/libkrb5support.so.0.1
b7aad000-b7aae000 rw-p b7aad000 00:00 0
b7aae000-b7ad2000 r-xp 00000000 03:05 1032775    /usr/lib/libk5crypto.so.3.0
b7ad2000-b7ad4000 rw-p 00023000 03:05 1032775    /usr/lib/libk5crypto.so.3.0
b7ad4000-b7ae5000 r-xp 00000000 03:05 651569     /lib/libaudit.so.0.0.0
b7ae5000-b7ae7000 rw-p 00010000 03
================================================================================

(The last line really does end like that in the log; I believe the rest of
 the output was truncated.)



This happens when running this view:

================================================================================
CREATE OR REPLACE VIEW neolab.v_resultado_amostra_analise AS
    SELECT vrr.amostra_analise_id AS id, vrr.id AS amostra_id,
        ar.liberado AS is_liberado, ar.alterado_em,
        an.metodologia, an.ibmp, an.valores_referencia, an.nome AS analise,
        neolab.f_v_formata_valor_resultado(
            (SELECT o_resultado
            FROM neolab.f_v_resultado_amostra_analise(vrr.amostra_analise_id)),
        an.id) AS resultado,
        (SELECT o_is_limite_quantificacao
         FROM neolab.f_v_resultado_amostra_analise(vrr.amostra_analise_id))
        AS is_limite_quantificacao
    FROM neolab.v_resultados_resumo vrr
    LEFT JOIN neolab.amostras_resultados ar ON ar.amostra_analise_id = vrr.amostra_analise_id
    JOIN neolab.amostras_analises aa ON aa.id = vrr.amostra_analise_id
    JOIN neolab.analises an ON an.id = aa.analise_id;
================================================================================

where the functions called are:

================================================================================
CREATE OR REPLACE FUNCTION neolab.f_v_formata_valor_resultado(
    p_resultado FLOAT, p_analise_id INTEGER,
    OUT o_resultado TEXT)
AS $_$
    p_resultado = args[0]
    p_analise_id = args[1]

    precisao=plpy.execute('''
        SELECT precisao FROM neolab.analises
        WHERE id=%s''' % p_analise_id)
    unidade=plpy.execute('''
        SELECT u.simbolo FROM neolab.unidades u
        JOIN neolab.analises a ON a.unidade_relatorio_id = u.id
        WHERE a.id = %s''' % p_analise_id)
    formato='%%.%sf %s' % (precisao[0]['precisao'], unidade[0]['simbolo'])
    return ((formato % p_resultado).replace('.', ','))
$_$ LANGUAGE plpythonu STABLE STRICT;

-- -----------------------------------------------------------------------------
-- Sample output:
-- -----------------------------------------------------------------------------
--
-- neo=# select * from neolab.f_v_formata_valor_resultado(3.23456, 53);
--  o_resultado
-- ----------------
--  3,23 mg/g crea
-- (1 registro)
--
-- neo=# select * from neolab.f_v_formata_valor_resultado(3.23456, 59);
--  o_resultado
-- -------------
--  3,23 g/L
-- (1 registro)


CREATE OR REPLACE FUNCTION neolab.f_v_resultado_amostra_analise(
    p_amostra_analise_id INTEGER,
    OUT o_resultado FLOAT, OUT o_is_limite_quantificacao BOOL) AS $_$
DECLARE
    w_resultado FLOAT;
    w_analise_id INTEGER;
    w_limite_quantificacao FLOAT;
BEGIN
    o_is_limite_quantificacao:=FALSE;
    w_resultado:=resultado_calculado FROM neolab.amostras_resultados
        WHERE amostra_analise_id=p_amostra_analise_id;
    w_analise_id:=analise_id FROM neolab.amostras_analises
        WHERE id=p_amostra_analise_id;
    w_limite_quantificacao:=limite_quantificacao FROM neolab.analises
        WHERE id=w_analise_id;
    IF w_resultado < w_limite_quantificacao THEN
        o_is_limite_quantificacao:=TRUE;
        w_resultado:=w_limite_quantificacao;
    END IF;
    o_resultado:=w_resultado;
END;
$_$ LANGUAGE plpgsql STRICT STABLE;

================================================================================
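
To make it easier to rule out the logic itself, here is a minimal standalone
Python sketch of what the two functions compute, with the database lookups
taken out. The names format_result and clamp_to_limit, and passing the
precision, unit symbol and limit as plain arguments, are assumptions made for
illustration only; they are not part of the schema above. The precision of 2
in the usage note is likewise inferred from the sample output, not read from
the table.

```python
def clamp_to_limit(result, limit):
    """Mirror the plpgsql logic: a result below the quantification limit
    is replaced by the limit and flagged. A missing value (None, standing
    in for SQL NULL) never triggers the clamp."""
    if result is not None and limit is not None and result < limit:
        return limit, True
    return result, False


def format_result(value, precision, unit_symbol):
    """Mirror the PL/Python logic: build a format such as '%.2f g/L',
    apply it to the value, then swap the decimal point for a comma."""
    fmt = '%%.%sf %s' % (precision, unit_symbol)
    return (fmt % value).replace('.', ',')


if __name__ == '__main__':
    value, below_limit = clamp_to_limit(3.23456, 1.0)
    print(format_result(value, 2, 'g/L'), below_limit)
```

Run outside the server, this only exercises the string handling and the
comparison; it does not touch plpy.execute or the PL/Python call handler,
which is where the backtrace above points.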



Sometimes I can run the offending view repeatedly without problems, but once
it crashes I have to restart the machine before I can run it again.

On psql I get:

================================================================================
neo=# select * from neolab.v_resultado_amostra_analise ;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q
================================================================================



This is with:

OpenSuSE 10.2
postgresql-server-8.1.5-13
postgresql-libs-8.1.5-13
postgresql-docs-8.1.5-13
postgresql-devel-8.1.5-13
postgresql-8.1.5-13
postgresql-pl-8.1.5-15

jupiter:/var/lib/pgsql/data/pg_log # psql -V
psql (PostgreSQL) 8.1.5
contains support for command-line editing


Linux jupiter 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 i686 i686 i386 GNU/Linux


processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Mobile Intel(R) Pentium(R) 4 CPU 3.06GHz
stepping        : 1
cpu MHz         : 1867.000
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc up pni monitor ds_cpl est tm2 cid xtpr
bogomips        : 6137.03


(HT is disabled)



So, my question is: where do I start debugging this? :-)



Thanks for any help,
--
Jorge Godoy      <jgodoy@gmail.com>
