BUG #7656: PL/Perl SPI_freetuptable() segfault

From: pgmail@joh.to
Date:
The following bug has been logged on the website:

Bug reference:      7656
Logged by:          Marko Tiikkaja
Email address:      pgmail@joh.to
PostgreSQL version: 9.1.6
Operating system:   OS X
Description:


Hi,

I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
down the test case to something sensible, but I do have a backtrace:

219        while (context->firstchild != NULL)
(gdb) bt
#0  0x0000000104e90782 in MemoryContextDeleteChildren (context=0x1000002bd)
at mcxt.c:219
#1  0x0000000104e906a8 in MemoryContextDelete (context=0x1000002bd) at
mcxt.c:174
#2  0x0000000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
spi.c:1003
#3  0x000000011ec9928b in plperl_spi_execute_fetch_result
(tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
#4  0x000000011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
"0x7f9ae3e3fe50", limit=-439796840) at plperl.c:2821
#5  0x000000011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
cv=0x7f9ae4148e90) at SPI.c:69
#6  0x000000011ed19abd in Perl_pp_entersub ()
#7  0x000000011ed11ee1 in Perl_runops_standard ()
#8  0x000000011ecc36a3 in Perl_call_sv ()
#9  0x000000011ec949c7 in plperl_call_perl_func (desc=0x7f9ae42c7400,
fcinfo=0x7fff5b266790) at plperl.c:2066
#10 0x000000011ec96cda in plperl_func_handler (fcinfo=0x7fff5b266790) at
plperl.c:2199
#11 0x000000011ec91f62 in plperl_call_handler (fcinfo=0x7fff5b266790) at
plperl.c:1710
#12 0x000000011ec92fd8 in plperlu_call_handler (fcinfo=0x7fff5b266790) at
plperl.c:1911
#13 0x0000000104e6671a in fmgr_security_definer (fcinfo=0x7fff5b266790) at
fmgr.c:975
#14 0x0000000104b8bd8b in ExecMakeTableFunctionResult
(funcexpr=0x7f9ae421f8b0, econtext=0x7f9ae421f450,
expectedDesc=0x7f9ae421f770, randomAccess=0 '\0') at execQual.c:2146
#15 0x0000000104baf39c in FunctionNext (node=0x7f9ae421f340) at
nodeFunctionscan.c:65
#16 0x0000000104b94e4d in ExecScanFetch (node=0x7f9ae421f340,
accessMtd=0x104baf320 <FunctionNext>, recheckMtd=0x104baf420
<FunctionRecheck>) at execScan.c:82
#17 0x0000000104b94b73 in ExecScan (node=0x7f9ae421f340,
accessMtd=0x104baf320 <FunctionNext>, recheckMtd=0x104baf420
<FunctionRecheck>) at execScan.c:132
#18 0x0000000104baf479 in ExecFunctionScan (node=0x7f9ae421f340) at
nodeFunctionscan.c:105
#19 0x0000000104b87235 in ExecProcNode (node=0x7f9ae421f340) at
execProcnode.c:416
#20 0x0000000104b8481e in ExecutePlan (estate=0x7f9ae421f230,
planstate=0x7f9ae421f340, operation=CMD_SELECT, sendTuples=1 '\001',
numberTuples=0, direction=ForwardScanDirection, dest=0x7f9ae401f640) at
execMain.c:1440
#21 0x0000000104b82827 in standard_ExecutorRun (queryDesc=0x7f9ae4334d90,
direction=ForwardScanDirection, count=0) at execMain.c:314
#22 0x000000010530f6d4 in explain_ExecutorRun ()
#23 0x0000000104b826d1 in ExecutorRun (queryDesc=0x7f9ae4334d90,
direction=ForwardScanDirection, count=0) at execMain.c:260
#24 0x0000000104cf863b in PortalRunSelect (portal=0x7f9ae403f030, forward=1
'\001', count=0, dest=0x7f9ae401f640) at pquery.c:943
#25 0x0000000104cf820b in PortalRun (portal=0x7f9ae403f030,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x7f9ae401f640,
altdest=0x7f9ae401f640, completionTag=0x7fff5b26724f "") at pquery.c:787
#26 0x0000000104cf251a in exec_execute_message (portal_name=0x7f9ae401f230
"", max_rows=9223372036854775807) at postgres.c:1965
#27 0x0000000104cf6005 in PostgresMain (argc=2, argv=0x7f9ae401ba90,
username=0x7f9ae401ba60 "marko") at postgres.c:4025
#28 0x0000000104c8b0ff in BackendRun (port=0x7f9ae3c06050) at
postmaster.c:3617
#29 0x0000000104c8a444 in BackendStartup (port=0x7f9ae3c06050) at
postmaster.c:3302
#30 0x0000000104c869ac in ServerLoop () at postmaster.c:1466
#31 0x0000000104c85bd0 in PostmasterMain (argc=3, argv=0x7f9ae3c03ce0) at
postmaster.c:1127
#32 0x0000000104bd807b in main (argc=3, argv=0x7f9ae3c03ce0) at main.c:199

I'm not running exactly 9.1.6; this is commit ff8f7103b559d8f19731157aca38
from REL9_1_STABLE.

While trying to narrow down the test case I noticed what the problem was: I
was calling spi_execute_query() instead of spi_execute_prepared().  (That is
consistent with frame #4 above, where the "query" string handed to
plperl_spi_exec is the stringified plan handle "0x7f9ae3e3fe50".)  I can't
reproduce it on a smaller test case (I get the expected "syntax error"), but
I still have the old function available if that seems necessary.

Re: BUG #7656: PL/Perl SPI_freetuptable() segfault

From: Tom Lane
Date: 11/13/2012 12:17 PM
pgmail@joh.to writes:
> I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
> down the test case to something sensible, but I do have a backtrace:

> 219        while (context->firstchild != NULL)
> (gdb) bt
> #0  0x0000000104e90782 in MemoryContextDeleteChildren (context=0x1000002bd)
> at mcxt.c:219
> #1  0x0000000104e906a8 in MemoryContextDelete (context=0x1000002bd) at
> mcxt.c:174
> #2  0x0000000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
> spi.c:1003
> #3  0x000000011ec9928b in plperl_spi_execute_fetch_result
> (tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
> #4  0x000000011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
> "0x7f9ae3e3fe50", limit=-439796840) at plperl.c:2821
> #5  0x000000011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
> cv=0x7f9ae4148e90) at SPI.c:69

> While trying to narrow down the test case I noticed what the problem was: I
> was calling spi_execute_query() instead of spi_execute_prepared().

Hm.  It looks like SPI_execute failed as expected (note the status
passed to plperl_spi_execute_fetch_result is -6 which is
SPI_ERROR_ARGUMENT), but it did not reset SPI_tuptable, which led to
plperl_spi_execute_fetch_result trying to call SPI_freetuptable on what
was probably an already-deleted tuple table.
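
To make that concrete, here is a minimal standalone C sketch of the hazard.
The fake_* names are toy stand-ins for SPI_execute, SPI_freetuptable, and
the global SPI_tuptable; this is not the real spi.c code, just the pattern:

#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for SPITupleTable. */
typedef struct TupTable
{
    int         nrows;
} TupTable;

static TupTable *g_tuptable = NULL;     /* plays the role of SPI_tuptable */

/* Like SPI_execute: on a bad argument it bails out early *without*
 * resetting the global, which is the behavior at issue. */
static int
fake_spi_execute(const char *query)
{
    if (query == NULL)
        return -6;                      /* stand-in for SPI_ERROR_ARGUMENT */
    g_tuptable = malloc(sizeof(TupTable));
    g_tuptable->nrows = 1;
    return 1;                           /* success */
}

static void
fake_spi_freetuptable(TupTable *t)
{
    free(t);
}

int
main(void)
{
    /* A successful call whose result is consumed and freed. */
    fake_spi_execute("SELECT 1");
    fake_spi_freetuptable(g_tuptable);

    /* A failing call: status goes negative, but g_tuptable still holds
     * the pointer freed above. */
    int         status = fake_spi_execute(NULL);

    /* A caller that frees unconditionally, as the plperl code
     * effectively did, would now free a dangling pointer: undefined
     * behavior, and a crash under the memory-context machinery. */
    if (status <= 0)
        printf("status %d: g_tuptable is stale; freeing it would crash\n",
               status);
    else
        fake_spi_freetuptable(g_tuptable);
    return 0;
}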

One theory we could adopt on this is that this is
plperl_spi_execute_fetch_result's fault and it shouldn't be trying to
free a tuple table unless status > 0.

Another theory we could adopt is that SPI functions that are capable of
setting SPI_tuptable ought to clear it at start, to ensure that they
return it as null on failure.

The latter seems like a "nicer" fix but I'm afraid it might have
unexpected side-effects.  It would certainly be a lot more invasive.
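
In terms of that toy sketch, the two approaches would look roughly like
this (again illustrative only, reusing the fake_* stand-ins from above
rather than quoting spi.c or plperl.c):

/* Theory 1: fix the caller; only touch the tuple table on success. */
static void
fetch_result_guarded(TupTable *t, int status)
{
    if (status > 0 && t != NULL)
        fake_spi_freetuptable(t);       /* fresh result, safe to free */
    /* on failure, leave t alone: it may be stale */
}

/* Theory 2: fix the callee; clear the global on entry, so a failure
 * can never hand a stale pointer back to the caller. */
static int
fake_spi_execute_clearing(const char *query)
{
    g_tuptable = NULL;                  /* reset before anything can fail */
    if (query == NULL)
        return -6;
    g_tuptable = malloc(sizeof(TupTable));
    g_tuptable->nrows = 1;
    return 1;
}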

Thoughts?
        regards, tom lane



Re: [HACKERS] BUG #7656: PL/Perl SPI_freetuptable() segfault

From: Andrew Dunstan
Date:
On 11/13/2012 12:17 PM, Tom Lane wrote:
> pgmail@joh.to writes:
>> I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
>> down the test case to something sensible, but I do have a backtrace:
>> 219        while (context->firstchild != NULL)
>> (gdb) bt
>> #0  0x0000000104e90782 in MemoryContextDeleteChildren (context=0x1000002bd)
>> at mcxt.c:219
>> #1  0x0000000104e906a8 in MemoryContextDelete (context=0x1000002bd) at
>> mcxt.c:174
>> #2  0x0000000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
>> spi.c:1003
>> #3  0x000000011ec9928b in plperl_spi_execute_fetch_result
>> (tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
>> #4  0x000000011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
>> "0x7f9ae3e3fe50", limit=-439796840) at plperl.c:2821
>> #5  0x000000011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
>> cv=0x7f9ae4148e90) at SPI.c:69
>> While trying to narrow down the test case I noticed what the problem was: I
>> was calling spi_execute_query() instead of spi_execute_prepared().
> Hm.  It looks like SPI_execute failed as expected (note the status
> passed to plperl_spi_execute_fetch_result is -6 which is
> SPI_ERROR_ARGUMENT), but it did not reset SPI_tuptable, which led to
> plperl_spi_execute_fetch_result trying to call SPI_freetuptable on what
> was probably an already-deleted tuple table.
>
> One theory we could adopt on this is that this is
> plperl_spi_execute_fetch_result's fault and it shouldn't be trying to
> free a tuple table unless status > 0.
>
> Another theory we could adopt is that SPI functions that are capable of
> setting SPI_tuptable ought to clear it at start, to ensure that they
> return it as null on failure.
>
> The latter seems like a "nicer" fix but I'm afraid it might have
> unexpected side-effects.  It would certainly be a lot more invasive.


These aren't mutually exclusive, though, are they? It seems reasonable 
to do the minimal fix for the stable branches (looks like it's just a 
matter of moving the call up a couple of lines in plperl.c) and make the 
nicer fix just for the development branch.
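
For the stable branches, that reordering would have roughly this shape
inside plperl_spi_execute_fetch_result (a sketch of the pattern only, not
the committed patch; the hash-building and row-conversion details are
elided):

static HV *
fetch_result_sketch(SPITupleTable *tuptable, int processed, int status)
{
    HV         *result = newHV();

    /* ... store "status" and "processed" into the result hash ... */

    if (status > 0 && tuptable != NULL)
    {
        /* ... convert tuptable->vals rows into Perl values ... */
        SPI_freetuptable(tuptable);     /* moved inside the success branch */
    }
    /* on failure, tuptable is not touched: it may already be gone */

    return result;
}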

cheers

andrew