Re: Fwd: Core dump with nested CREATE TEMP TABLE - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Fwd: Core dump with nested CREATE TEMP TABLE
Date
Msg-id 55E7A66D.6000404@BlueTreble.com
Whole thread Raw
In response to Re: Fwd: Core dump with nested CREATE TEMP TABLE  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: Fwd: Core dump with nested CREATE TEMP TABLE  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 9/2/15 2:56 PM, Jim Nasby wrote:
> On 9/2/15 2:17 PM, Alvaro Herrera wrote:
>> Michael Paquier wrote:
>>
>>> I haven't written yet a test case but I think that we could reproduce
>>> that simply by having a relation referenced in the exception block of
>>> a first function, calling a second function that itself raises an
>>> exception, causing the referencing error.
>>
>> Hm, so function 2 is called in the exception block of function 1?
>>
>> Are you going to produce the test case for this?  Jim?
>
> I had attempted one but the issue is that it's more than just an
> exception block thing. If it was that simple then we'd still get the
> crash without involving pgTap. So I suspect you need to have a named
> cursor in the mix as well.
>
> Let me make another attempt at something simpler.

After some time messing around with this I've been able to remove pgTap
from the equation using the attached crash.sql (relevant snippet below).

This only works if you pass a refcursor into the function. It doesn't
work if you do

OPEN have FOR EXECUTE $$SELECT * FROM tf.get( NULL::invoice, 'base' )$$;

inside the function instead. This code also produces different results;
it outputs the error message before crashing and the stack trace shows
the assert is now failing while trying to abort the top-level
transaction. So it looks like the scenario is:

BEGIN;
DECLARE (open?) cursor that calls function with exception handler that
calls function with exception handler that calls something that fails

The double set of exceptions seems to be critical; if instead of calling
tf.get(invoice) (which recursively does tf.get(customer)) I define the
cursor to call tf.get(customer) directly there's no assert.

I can poke at this more tomorrow, but it'd be very helpful if someone
could explain this failure mode, as I'm basically stumbling in the dark
on a test case right now.

CREATE OR REPLACE FUNCTION a(
     have refcursor
) RETURNS void LANGUAGE plpgsql AS $body$
DECLARE
     have_rec record;
BEGIN
     FETCH have INTO have_rec;
END
$body$;
DECLARE h CURSOR FOR SELECT * FROM tf.get( NULL::invoice, 'base' );
SELECT a(
   'h'::refcursor
);

> (lldb) bt
> * thread #1: tid = 0x722ed8, 0x00007fff92a5a866 libsystem_kernel.dylib`__pthread_kill + 10, queue =
'com.apple.main-thread',stop reason = signal SIGABRT 
>   * frame #0: 0x00007fff92a5a866 libsystem_kernel.dylib`__pthread_kill + 10
>     frame #1: 0x00007fff9001b35c libsystem_pthread.dylib`pthread_kill + 92
>     frame #2: 0x00007fff8cf89b1a libsystem_c.dylib`abort + 125
>     frame #3: 0x000000010cdb4039 postgres`ExceptionalCondition(conditionName=0x000000010cf59cfd,
errorType=0x000000010cec392d,fileName=0x000000010cf59045, lineNumber=1972) + 137 at assert.c:54 
>     frame #4: 0x000000010cd9b332 postgres`RelationClearRelation(relation=0x000000011658f110, rebuild='\0') + 162 at
relcache.c:1970
>     frame #5: 0x000000010cd9cc0f postgres`AtEOSubXact_cleanup(relation=0x000000011658f110, isCommit='\0', mySubid=15,
parentSubid=1)+ 79 at relcache.c:2665 
>     frame #6: 0x000000010cd9cb92 postgres`AtEOSubXact_RelationCache(isCommit='\0', mySubid=15, parentSubid=1) + 242
atrelcache.c:2633 
>     frame #7: 0x000000010c9b6e88 postgres`AbortSubTransaction + 440 at xact.c:4373
>     frame #8: 0x000000010c9b7208 postgres`AbortCurrentTransaction + 312 at xact.c:2947
>     frame #9: 0x000000010cc249f3 postgres`PostgresMain(argc=1, argv=0x00007fa3f4800378, dbname=0x00007fa3f4800200,
username=0x00007fa3f30031f8)+ 1875 at postgres.c:3902 
>     frame #10: 0x000000010cb9da48 postgres`PostmasterMain [inlined] BackendRun + 8024 at postmaster.c:4155
>     frame #11: 0x000000010cb9da22 postgres`PostmasterMain [inlined] BackendStartup at postmaster.c:3829
>     frame #12: 0x000000010cb9da22 postgres`PostmasterMain [inlined] ServerLoop at postmaster.c:1597
>     frame #13: 0x000000010cb9da22 postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 7986 at
postmaster.c:1244
>     frame #14: 0x000000010cb218cd postgres`main(argc=<unavailable>, argv=<unavailable>) + 1325 at main.c:228
>     frame #15: 0x00007fff8e9a35fd libdyld.dylib`start + 1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachment

pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: Allow a per-tablespace effective_io_concurrency setting
Next
From: Michael Paquier
Date:
Subject: Re: Allow replication roles to use file access functions