Thread: Recovering from an exception

Recovering from an exception

From
Marko Tiikkaja
Date: Tue, 01 Jan 2013 15:56:50 +0100
Hi,

I'm trying to recover from an exception in an fmgr hook.  It seems to
work relatively well most of the time, but in some cases the backend
segfaults.  Attached is a self-contained test case demonstrating the
problem.  The problem I'm hitting can be reproduced with:

create function f4() returns int as $$ select 2 $$ language sql;
create function f3() returns int as $$
declare f int;
begin
  select f4() into f;
  return 1;
end $$ language plpgsql;

select f3();

The backend dies at:

Program received signal SIGSEGV, Segmentation fault.
0x08406d49 in fmgr_security_definer (fcinfo=0x89b133c) at fmgr.c:975
975                     result = FunctionCallInvoke(fcinfo);

First 11 frames of the backtrace:

#0  0x08406d49 in fmgr_security_definer (fcinfo=0x89b133c) at fmgr.c:975
#1  0x082094b5 in ExecMakeFunctionResult (fcache=0x89b1300,
     econtext=0x89b11ec, isNull=0x89b185c '\177' <repeats 200 times>...,
     isDone=0x89b1908) at execQual.c:1917
#2  0x08209e32 in ExecEvalFunc (fcache=0x89b1300, econtext=0x89b11ec,
     isNull=0x89b185c '\177' <repeats 200 times>..., isDone=0x89b1908)
     at execQual.c:2356
#3  0x0820f6d9 in ExecTargetList (targetlist=0x89b18ec, econtext=0x89b11ec,
     values=0x89b1848, isnull=0x89b185c '\177' <repeats 200 times>...,
     itemIsDone=0x89b1908, isDone=0xbffc2118) at execQual.c:5210
#4  0x0820fc03 in ExecProject (projInfo=0x89b1870, isDone=0xbffc2118)
     at execQual.c:5425
#5  0x08224972 in ExecResult (node=0x89b1160) at nodeResult.c:155
#6  0x08206337 in ExecProcNode (node=0x89b1160) at execProcnode.c:367
#7  0x082041dc in ExecutePlan (estate=0x89b10d4, planstate=0x89b1160,
     operation=CMD_SELECT, sendTuples=1 '\001', numberTuples=1,
     direction=ForwardScanDirection, dest=0x862170c) at execMain.c:1440
#8  0x08202887 in standard_ExecutorRun (queryDesc=0x89adcac,
     direction=ForwardScanDirection, count=1) at execMain.c:314
#9  0x082026fb in ExecutorRun (queryDesc=0x89adcac,
     direction=ForwardScanDirection, count=1) at execMain.c:262
#10 0x08232131 in _SPI_pquery (queryDesc=0x89adcac, fire_triggers=1 '\001',
     tcount=1) at spi.c:2110

It looks like fcinfo (amongst other things) is allocated in a child of
the SPI context.  My speculation is that the SPI context gets reset by
AtEOSubXact_SPI(), thus resetting the memory fcinfo points to, leading
to SIGSEGV.

Any thoughts on how to avoid this crash?  Or generally, how to correctly
clean up after an exception?  The attached code tries to imitate what
the PLs are doing, but it's not working. :-(


Regards,
Marko Tiikkaja

Attachment

Re: Recovering from an exception

From
"Marko Tiikkaja"
Date: Wed, 02 Jan 2013 01:16:11 +0100
Hi,

On Tue, 01 Jan 2013 15:56:50 +0100, I wrote:
> It looks like fcinfo (amongst other things) is allocated in a child of
> the SPI context.  My speculation is that the SPI context gets reset by
> AtEOSubXact_SPI(), thus resetting the memory fcinfo points to, leading
> to SIGSEGV.

Indeed, that looks to be the case.  If I change the bottom part of
AtEOSubXact_SPI() a bit:

-       if (_SPI_current && !isCommit)
+       if (_SPI_current && _SPI_current->connectSubid == mySubid && !isCommit)

the problem goes away.
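
To make the effect of that one-line change concrete, here is a toy model in
plain C.  It is not PostgreSQL code: ToyConnection, toy_reset() and
toy_at_eosubxact() are made-up names, and the 0x7f fill is only an assumption
chosen to resemble the '\177' bytes visible in the backtrace above.  It only
illustrates the condition in the diff: with the extra check, the abort-time
cleanup skips the reset when the current connection's connectSubid differs
from mySubid.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for an SPI connection; NOT PostgreSQL's real struct. */
typedef struct ToyConnection
{
    int  connectSubid;  /* subtransaction that opened the connection */
    char scratch[16];   /* stands in for memory owned by the connection */
} ToyConnection;

/* Stand-in for a memory context reset: everything stored there is wiped. */
static void
toy_reset(ToyConnection *conn)
{
    assert(conn != NULL);
    memset(conn->scratch, 0x7f, sizeof(conn->scratch));
}

/*
 * Toy version of the abort-time cleanup.  With guarded == false it mirrors
 * the original condition (current != NULL && !isCommit); with guarded == true
 * it also requires current->connectSubid == mySubid, as in the diff.
 */
static void
toy_at_eosubxact(ToyConnection *current, bool isCommit, int mySubid,
                 bool guarded)
{
    if (current == NULL || isCommit)
        return;
    if (guarded && current->connectSubid != mySubid)
        return;
    toy_reset(current);
}

/*
 * An outer connection opened in subtransaction 1 holds some data, then an
 * inner subtransaction 2 aborts.  Returns true if the outer data survived.
 */
static bool
toy_survives_inner_abort(bool guarded)
{
    ToyConnection outer = { .connectSubid = 1 };

    memcpy(outer.scratch, "fcinfo", 7);
    toy_at_eosubxact(&outer, false, 2, guarded);
    return memcmp(outer.scratch, "fcinfo", 7) == 0;
}
```

In this model, toy_survives_inner_abort(false) reports that the outer data
was wiped, which corresponds to the dangling fcinfo, while
toy_survives_inner_abort(true) reports that it survived.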

I'm puzzled as to why AtEOSubXact_SPI() needs to unconditionally clear the
surrounding SPI context, or why it assumes that's a good idea.



Regards,
Marko Tiikkaja


Re: Recovering from an exception

From
"Marko Tiikkaja"
Date:
On Wed, 02 Jan 2013 01:16:11 +0100, I wrote:
> I'm puzzled as to why AtEOSubXact_SPI() needs to unconditionally clear
> the surrounding SPI context, or why it assumes that's a good idea.

I managed to fix this by creating my own SPI context outside the
subtransaction for AtEOSubXact_SPI() to destroy, and everything appears to
be working correctly.
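
For anyone who wants to see why the extra connection helps, here is a toy
model in plain C.  It is only a sketch under assumptions: ToyConn, toy_stack
and the toy_*() functions are invented for illustration and are not
PostgreSQL APIs.  The model assumes that the abort-time cleanup first drops
connections opened inside the aborting subtransaction and then resets
whichever connection is left on top of the stack.  If the hook opens its own
connection before starting the subtransaction, that connection is the one
that gets reset, and the caller's memory is left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_STACK_MAX 8

/* Toy stand-in for one SPI connection; NOT PostgreSQL's real struct. */
typedef struct ToyConn
{
    int  connectSubid;  /* subtransaction that opened the connection */
    char scratch[8];    /* stands in for memory owned by the connection */
} ToyConn;

static ToyConn toy_stack[TOY_STACK_MAX];
static int     toy_depth = 0;   /* number of open connections */

/* Open a connection on behalf of subtransaction 'subid'. */
static ToyConn *
toy_connect(int subid)
{
    ToyConn *conn;

    assert(toy_depth < TOY_STACK_MAX);
    conn = &toy_stack[toy_depth++];
    conn->connectSubid = subid;
    memset(conn->scratch, 0, sizeof(conn->scratch));
    return conn;
}

/* Close the topmost connection. */
static void
toy_finish(void)
{
    assert(toy_depth > 0);
    toy_depth--;
}

/*
 * Assumed model of the unpatched abort cleanup: pop connections opened in
 * the aborting subtransaction, then reset the connection left on top.
 */
static void
toy_abort_subxact(int mySubid)
{
    while (toy_depth > 0 && toy_stack[toy_depth - 1].connectSubid == mySubid)
        toy_depth--;
    if (toy_depth > 0)
        memset(toy_stack[toy_depth - 1].scratch, 0x7f,
               sizeof(toy_stack[toy_depth - 1].scratch));
}

/* Returns true if the caller's data survives the abort of subxact 2. */
static bool
toy_caller_survives(bool use_own_connection)
{
    ToyConn *caller;

    toy_depth = 0;
    caller = toy_connect(1);            /* the surrounding PL's connection */
    memcpy(caller->scratch, "fcinfo", 7);

    if (use_own_connection)
        toy_connect(1);                 /* hook's own, opened before subxact 2 */

    toy_abort_subxact(2);               /* the exception aborts subxact 2 */

    if (use_own_connection)
        toy_finish();                   /* hook closes its own connection */

    return memcmp(caller->scratch, "fcinfo", 7) == 0;
}
```

In this model, toy_caller_survives(false) reports that the caller's data was
wiped, and toy_caller_survives(true) reports that it survived because the
hook's sacrificial connection absorbed the reset.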


Regards,
Marko Tiikkaja