Re: [BUGS] pg_logical_slot_peek_changes crashes postgres when calledfrom inside pl/pgsql - Mailing list pgsql-bugs

From Andres Freund
Subject Re: [BUGS] pg_logical_slot_peek_changes crashes postgres when calledfrom inside pl/pgsql
Date
Msg-id 20171005211058.z5obk2c2e7zvauk6@alap3.anarazel.de
Whole thread Raw
In response to Re: [BUGS] pg_logical_slot_peek_changes crashes postgres when called from inside pl/pgsql  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [BUGS] pg_logical_slot_peek_changes crashes postgres when called from inside pl/pgsql  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On 2017-10-05 11:38:40 -0400, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
> > This looks like a legit bug to me. Andres, any opinions?
> 
> I wonder why ReorderBufferCommit does this:
> 
>         if (using_subtxn)
>             BeginInternalSubTransaction("replay");
>         else
>             StartTransactionCommand();
> 
> and then tries to clean that up with this brain-dead-looking sequence:
> 
>         AbortCurrentTransaction();
> 
>         /* make sure there's no cache pollution */
>         ReorderBufferExecuteInvalidations(rb, txn);
> 
>         if (using_subtxn)
>             RollbackAndReleaseCurrentSubTransaction();
> 
> Shouldn't that be something like
> 
>         if (using_subtxn)
>             RollbackAndReleaseCurrentSubTransaction();
>         else
>             AbortCurrentTransaction();
> 
> ?  Although by this theory, the using_subtxn path has never worked,
> not even a little bit, which seems somewhat unlikely.

I'm not sure what the problem is that you're seeing? The separation of
AbortCurrentTransaction() and RollbackAndReleaseCurrentSubTransaction()
is so that invalidations get executed outside an xact. The AbortCurrent
will move from TBLOCK_SUBINPROGRESS to TBLOCK_SUBABORT, the release then
from TBLOCK_SUBABORT to the containing transaction's state?

Afaict the exactly same crashing codepath would be reached if
RollbackAndReleaseCurrentSubTransaction() were immediately called,
without a preceding AbortCurrentTransaction().

I don't think that's really the problem. I don't know SPI very well, but
from a quick look it looks to me that the problem is that
AtEOSubXact_SPI() enters the/* * If we are aborting a subtransaction and there is an open SPI context * surrounding the
subxact,clean up to prevent memory leakage. */
 
block even if 'found' is false.  I need to look more at this, but just
adding a 'found &&' to that if makes things pass.

What we have here is that plpgsql uses SPI to execute
pg_logical_slot_peek_changes(), which internally does *not* use SPI. For
every replayed transaction it starts / finishes an internal
subtransaction, and when called from SPI the SPI subxabort handler acts
on the surrounding spi context.

Greetings,

Andres Freund


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: [BUGS] pg_logical_slot_peek_changes crashes postgres when called from inside pl/pgsql
Next
From: Sean Chittenden
Date:
Subject: Re: [BUGS] Re: [PATCH] BUG #13416: Postgres>= 9.3 doesn't use optimized shared memory on Solarisanymore