BUG #5269: postgres backend terminates with SIGSEGV - Mailing list pgsql-bugs

From Justin Pitts
Subject BUG #5269: postgres backend terminates with SIGSEGV
Msg-id 201001090350.o093octq014172@wwwmaster.postgresql.org
Responses Re: BUG #5269: postgres backend terminates with SIGSEGV  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged online:

Bug reference:      5269
Logged by:          Justin Pitts
Email address:      justinpitts@gmail.com
PostgreSQL version: 8.4.2
Operating system:   Debian Lenny 2.6.30-bpo.1-amd-64 kernel
Description:        postgres backend terminates with SIGSEGV
Details:

I originally experienced this on 8.4.1.

The postgres backend terminates with SIGSEGV. I can reproduce this only
about 30% of the time, by driving my application. The application performs
a steady stream of inserts, and the action that provokes the crash is
running a rather complex report.

I am including a backtrace. Producing a sanitized, simple, clear repro case
will take us some time, and it may be easier for us to host a duplicate
machine for someone to ssh into. Please let me know if that is reasonable,
or how else I can assist.





Core was generated by `postgres: ssjiddwdbusr ssjiddwdb 192.168.20.35(46628) S'.
Program terminated with signal 11, Segmentation fault.
[New process 9564]
#0  0x00007f1c43e0eee5 in memcpy () from /lib/libc.so.6
(gdb) bt
#0  0x00007f1c43e0eee5 in memcpy () from /lib/libc.so.6
#1  0x00000000006bf259 in CopySnapshot (snapshot=0x167db40) at snapmgr.c:231
#2  0x00000000006bf36d in PushActiveSnapshot (snap=0x167db40) at snapmgr.c:276
#3  0x0000000000560443 in _SPI_execute_plan (plan=0x16bfe20, paramLI=0x16507a0, snapshot=0x0,
    crosscheck_snapshot=0x0, read_only=0 '\0', fire_triggers=1 '\001', tcount=2) at spi.c:1797
#4  0x0000000000560a96 in SPI_execute_plan (plan=0x16bfe20, Values=0x16cfec0, Nulls=0x16cfed8 " ",
    read_only=0 '\0', tcount=2) at spi.c:392
#5  0x00007f1c01dcdcda in exec_run_select (estate=0x7fffe98830e0, expr=0x1674708, maxtuples=2, portalP=0x0)
    at pl_exec.c:4149
#6  0x00007f1c01dcdfd2 in exec_eval_expr (estate=0x7fffe98830e0, expr=0x1674708, isNull=0x7fffe9882e6f "",
    rettype=0x7fffe9882e68) at pl_exec.c:4067
#7  0x00007f1c01dd0223 in exec_assign_expr (estate=0x16920f0, target=0x16cfda0, expr=0x0) at pl_exec.c:3428
#8  0x00007f1c01dd1eb2 in exec_stmts (estate=0x7fffe98830e0, stmts=<value optimized out>) at pl_exec.c:1345
#9  0x00007f1c01dd3216 in exec_stmt_block (estate=0x7fffe98830e0, block=0x1675f60) at pl_exec.c:1137
#10 0x00007f1c01dd47fc in plpgsql_exec_function (func=0x16c2b60, fcinfo=0x7fffe9883350) at pl_exec.c:315
#11 0x00007f1c01dca43e in plpgsql_call_handler (fcinfo=0x7fffe9883350) at pl_handler.c:95
#12 0x00000000005493e6 in ExecMakeFunctionResult (fcache=0x167dd00, econtext=0x167dbf0, isNull=0x167e9d8 "",
    isDone=0x167eaf0) at execQual.c:1685
#13 0x000000000054484e in ExecProject (projInfo=<value optimized out>, isDone=0x7fffe988383c)
    at execQual.c:5007
#14 0x0000000000556e49 in ExecResult (node=0x167dae0) at nodeResult.c:155
#15 0x0000000000543c5d in ExecProcNode (node=0x167dae0) at execProcnode.c:344
#16 0x00000000005418a2 in standard_ExecutorRun (queryDesc=0x16bed10, direction=ForwardScanDirection, count=0)
    at execMain.c:1504
#17 0x00000000005ed677 in PortalRunSelect (portal=0x164a780, forward=<value optimized out>, count=0,
    dest=0x1602cf0) at pquery.c:953
#18 0x00000000005eea29 in PortalRun (portal=0x164a780, count=9223372036854775807, isTopLevel=1 '\001',
    dest=0x1602cf0, altdest=0x1602cf0, completionTag=0x7fffe9883cd0 "") at pquery.c:779
#19 0x00000000005eb790 in PostgresMain (argc=4, argv=<value optimized out>,
    username=0x1563c40 "ssjiddwdbusr") at postgres.c:1928
#20 0x00000000005bfe08 in ServerLoop () at postmaster.c:3449
#21 0x00000000005c0b97 in PostmasterMain (argc=3, argv=0x1561240) at postmaster.c:1040
#22 0x000000000056a558 in main (argc=3, argv=0x1561240) at main.c:188


I noticed bug #5238, but I have no idea whether it is related. I'll take a
hint from it, however:

(gdb) p (char *) debug_query_string
$1 = 0x166e190 "SELECT \"datawarehouse\".spr_update_device($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)"

This is somewhat unexpected. I expected the 'complex report' to be here,
but this is a writer process which manages a data warehouse schema.

I know more information is needed. I am hoping this is helpful in the
meantime.
