Re: "cancelling statement due to user request error" occurs but the transaction has committed. - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: "cancelling statement due to user request error" occurs but the transaction has committed.
Date
Msg-id 20150320005013.GE20462@momjian.us
Whole thread Raw
In response to Re: "cancelling statement due to user request error" occurs but the transaction has committed.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Mar 19, 2015 at 07:19:02PM -0400, Tom Lane wrote:
> The core issue here, I think, is that xact.c only holds off interrupts
> during what it considers to be the commit critical section --- which is
> okay from the standpoint of transactional consistency.  But the complaint
> has to do with what the client perceives to have happened if a SIGINT
> arrives somewhere between where xact.c has committed and where postgres.c
> has reported the commit to the client.  If we want to address that, I
> think postgres.c needs to hold off interrupts for the entire duration from
> just before CommitTransactionCommand() to just after ReadyForQuery().
> That's likely to be rather messy, because there are so many code paths
> there, especially when you consider error cases.
> 
> A possible way to do this without incurring unacceptably high risk of
> breakage (in particular, ending up with interrupts still held off when
> they shouldn't be any longer) is to assume that there should never be a
> case where we reach ReadCommand() with interrupts still held off.  Then
> we could invent an additional interrupt primitive "RESET_INTERRUPTS()"
> that forces InterruptHoldoffCount to zero (and, presumably, then does
> a CHECK_FOR_INTERRUPTS()); then putting a HOLD_INTERRUPTS() before calling
> CommitTransactionCommand() and a RESET_INTERRUPTS() before waiting for
> client input would presumably be pretty safe.  On the other hand, that
> approach could easily mask interrupt holdoff mismatch bugs elsewhere in
> the code base.

Well, right now we allow interrupts for as long as possible, i.e. to the
middle of CommitTransaction().  Your approach would block interrupts for
a larger span, which might be worse than the bug we are fixing.  It also
feels like it would be unmodular as functions would change the blocking
state when they are called.

Tom is right that my cancel5.diff version has an area between the first
cancel erase and the second cancel erase where a cancel might arrive.  I
assumed there were no checks in that area, but I might be wrong, and
there could be checks there someday.

The fundamental problem is that the place we need to block cancels
starts several levels down in a function, and the place we need to clear
it is higher.  Maybe the entire HOLD/RESUME block API we have for this
is wrong and we just need a global variable to ignore client cancels to
be read inside the signal handler, and we just set and clear it.  I can
work on a patch if people like that idea.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: configure can't detect proper pthread flags
Next
From: Kouhei Kaigai
Date:
Subject: Re: GSoC - Idea Discussion