Re: Hot Standy introduced problem with query cancel behavior - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Hot Standy introduced problem with query cancel behavior
Msg-id 16887.1262897267@sss.pgh.pa.us
In response to Re: Hot Standy introduced problem with query cancel behavior  (Andres Freund <andres@anarazel.de>)
Responses Re: Hot Standy introduced problem with query cancel behavior  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> The reason I suggested adding CHECK_FOR_INTERRUPTS into the recv code path was 
> that it should allow a relatively "natural" handling of canceling "IDLE IN 
> TRANSACTION" queries without doing anything in the interrupt handler.

> I think it shouldn't be too hard to make that code path safe for
> CHECK_FOR_INTERRUPTS().

Idle in transaction isn't the problem (except for what it does to the
FE/BE protocol state).  The problem is what happens inside a non-idle
transaction.

Since apparently I'm still not being clear enough about this, let me
spell it out:

1. Outer transaction calls, say, a plperl function.

2. plperl function executes some query via SPI, thereby starting
   a subtransaction.

3. We receive an HS query-cancel interrupt.  Since
   !ImmediateInterruptOK, this just sets QueryCancelPending.

4. At the next occurrence of CHECK_FOR_INTERRUPTS, ProcessInterrupts
   is entered.

5. According to both Simon's committed patch and his recent variant,
   ProcessInterrupts executes AbortOutOfAnyTransaction and then throws
   elog(ERROR).

6. plperl.c catches the elog longjmp and tries to abort its
   subtransaction (loss #1), then return to the Perl interpreter
   which is under no obligation to abort processing its perl script
   (loss #2), and whenever it does exit, or else call SPI to try to
   process another query, we're screwed because the outer transaction
   is already dead (loss #3).

The situation with Perl or Python or some other PL is pretty much
the worst case, since we have no control whatever over that code
layer --- but in reality this type of scenario can play out even
without any third-party code involved.  Anyplace that catches an
elog longjmp will be broken by AbortOutOfAnyTransaction inside
ProcessInterrupts, because things aren't supposed to happen in that
order.
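
To make the ordering problem concrete, here is a minimal toy model of
steps 1 through 6.  It is a sketch, not backend code: every name in it
(xact_depth, process_interrupts, pl_run_query, pl_function, simulate)
is invented for illustration, a plain setjmp/longjmp stands in for the
elog machinery, and a depth counter stands in for the transaction
state.  It counts only the two inconsistencies that can be observed
in such a model (losses #1 and #3); loss #2 shows up only as the fact
that the script keeps running.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy model only; none of these names are PostgreSQL's. */
static jmp_buf *error_stack;      /* innermost error catcher */
static int xact_depth;            /* 0 = none, 1 = outer xact, 2 = subxact */
static int query_cancel_pending;  /* stands in for QueryCancelPending */
static int inconsistencies;       /* observable "losses" */

/* Stands in for elog(ERROR): jump to the innermost catcher. */
static void throw_error(void)
{
    assert(error_stack != NULL);
    longjmp(*error_stack, 1);
}

/* Stands in for ProcessInterrupts, reached via CHECK_FOR_INTERRUPTS. */
static void process_interrupts(int abort_first)
{
    if (!query_cancel_pending)
        return;
    query_cancel_pending = 0;
    if (abort_first)
        xact_depth = 0;           /* models AbortOutOfAnyTransaction */
    throw_error();
}

/* Models a PL running one query inside its own subtransaction. */
static void pl_run_query(int abort_first)
{
    jmp_buf local;
    jmp_buf *saved = error_stack;

    xact_depth++;                 /* start the subtransaction */
    error_stack = &local;
    if (setjmp(local) == 0)
    {
        process_interrupts(abort_first);
        error_stack = saved;
        xact_depth--;             /* normal subtransaction release */
        return;
    }

    /* The PL caught the error and now cleans up after itself. */
    error_stack = saved;
    if (xact_depth == 2)
        xact_depth = 1;           /* roll back our own subtransaction */
    else
        inconsistencies++;        /* loss #1: it is already gone */
}

/* Models the PL function: the script keeps going after the error. */
static void pl_function(int abort_first)
{
    pl_run_query(abort_first);    /* this query gets cancelled */
    /* Loss #2: nothing forces the script to stop here. */
    if (xact_depth < 1)
        inconsistencies++;        /* loss #3: outer xact already dead */
    else
        pl_run_query(abort_first);
}

/* Runs the scenario and returns the number of inconsistencies seen. */
int simulate(int abort_first)
{
    error_stack = NULL;
    inconsistencies = 0;
    xact_depth = 1;               /* step 1: outer transaction running */
    query_cancel_pending = 1;     /* step 3: only the flag gets set */
    pl_function(abort_first);
    return inconsistencies;
}
```

In this model simulate(1), which aborts everything before throwing,
reports two inconsistencies, while simulate(0), which only throws and
leaves cleanup to whoever catches the error, reports none.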
        regards, tom lane

