Thread: "cancelling statement due to user request error" occurs but the transaction has committed.

Hi All,

When log_duration is true ( or log_min_duration_statement>=0 ),
If a transaction has internally been commited receives a SIGINT signal
then a query cancellation error is output.

For example,
1. A query like a TRUNCATE is removing bigger table files.
2. The session receives SIGINT signal.
3. Query cancellation error occurs.
4. But the query has commited.

e.g.)
---
naoya=# \d
       List of relations
 Schema | Name | Type  | Owner
--------+------+-------+-------
 public | hoge | table | naoya
(1 row)

naoya=# set log_duration=on;
SET
naoya=# select count(*) from hoge;
 count
--------
 100000
(1 row)

naoya=# truncate hoge;
Cancel request sent
ERROR:  canceling statement due to user request
naoya=# select count(*) from hoge;
 count
-------
     0
(1 row)
---

This is because  ProcessInterrupts function is called by errfinish ( in query-duration ereport).

I think this cancellation request must not interrupt the internal commited transaction.

This is because clients may misunderstand "the transaction has rollbacked".

Now,
I tried to fix the problem.

--- postgresql-fe7337f/src/backend/utils/error/elog.c    2014-06-06 11:57:44.000000000 +0900
+++ postgresql-fe7337f.new/src/backend/utils/error/elog.c    2014-06-06 13:10:51.000000000 +0900
@@ -580,7 +580,8 @@
      * can stop a query emitting tons of notice or warning messages, even if
      * it's in a loop that otherwise fails to check for interrupts.
      */
-    CHECK_FOR_INTERRUPTS();
+    if (IsTransactionState())
+        CHECK_FOR_INTERRUPTS();
 }

Thereby,
When ereport(non error level) calls and not in-transaction state,
PostgreSQL never calls ProcessInterrupts function by errfinish.

But I have a anxiety to fix errfinish function because
errfinish is called in many many situations..

Could you please confirm it?

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: anzai-naoya@mxu.nes.nec.co.jp
---




Attachment
On Fri, Jun 6, 2014 at 2:11 PM, Naoya Anzai <anzai-naoya@mxu.nes.nec.co.jp> wrote:
>
> Hi All,
>
> When log_duration is true ( or log_min_duration_statement>=0 ),
> If a transaction has internally been commited receives a SIGINT signal
> then a query cancellation error is output.
>
> For example,
> 1. A query like a TRUNCATE is removing bigger table files.
> 2. The session receives SIGINT signal.
> 3. Query cancellation error occurs.
> 4. But the query has commited.
>
>
> naoya=# truncate hoge;
> Cancel request sent
> ERROR:  canceling statement due to user request
> naoya=# select count(*) from hoge;
>  count
> -------
>      0
> (1 row)
> ---
>
> This is because  ProcessInterrupts function is called by errfinish ( in query-duration ereport).
>
> I think this cancellation request must not interrupt the internal commited transaction.
>
> This is because clients may misunderstand "the transaction has rollbacked".

There can be similar observation if the server goes off (power
outage or anything like) after committing transaction, client will
receive connection broken, so he can misunderstand that as well.
I think for such corner cases, client needs to reconfirm his action
results with database before concluding anything.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hi Amit,
Thank you for your response.

> There can be similar observation if the server goes off (power
> outage or anything like) after committing transaction, client will
> receive connection broken, so he can misunderstand that as well.
> I think for such corner cases, client needs to reconfirm his action
> results with database before concluding anything.

I see.
Now, I understand that ProcessInterrupts Error (ProcDie, QueryCancel, ClientLost..) does not mean "That query has been
RollBacked".

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: anzai-naoya@mxu.nes.nec.co.jp
---



On Sun, Jun 8, 2014 at 2:52 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I think this cancellation request must not interrupt the internal commited
>> transaction.
>>
>> This is because clients may misunderstand "the transaction has
>> rollbacked".
>
> There can be similar observation if the server goes off (power
> outage or anything like) after committing transaction, client will
> receive connection broken, so he can misunderstand that as well.
> I think for such corner cases, client needs to reconfirm his action
> results with database before concluding anything.

I don't agree with this analysis.  If the connection is closed after
the client sends a COMMIT and before it gets a response, then the
client must indeed be smart enough to figure out whether or not the
commit happened.  But if the server sends a response, the client
should be able to rely on that response being correct.  In this case,
an ERROR is getting sent but the transaction is getting committed;
yuck.  I'm not sure whether the fix is right, but this definitely
seems like a bug.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> I don't agree with this analysis.  If the connection is closed after
> the client sends a COMMIT and before it gets a response, then the
> client must indeed be smart enough to figure out whether or not the
> commit happened.  But if the server sends a response, the client
> should be able to rely on that response being correct.  In this case,
> an ERROR is getting sent but the transaction is getting committed;
> yuck.  I'm not sure whether the fix is right, but this definitely
> seems like a bug.

In general, the only way to avoid that sort of behavior for a post-commit
error would be to PANIC ... and even then, the transaction got committed,
which might not be the expectation of a client that got an error message,
even if it said PANIC.  So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit.  We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

So from this standpoint, redefining SIGINT as not throwing an error when
we're in post-commit seems like a good idea.  I'm not endorsing any
details of the patch here, but the 20000-foot view seems generally sound.
        regards, tom lane



Robert Haas <robertmhaas@gmail.com> wrote:

> If the connection is closed after the client sends a COMMIT and
> before it gets a response, then the client must indeed be smart
> enough to figure out whether or not the commit happened.  But if
> the server sends a response, the client should be able to rely on
> that response being correct.  In this case, an ERROR is getting
> sent but the transaction is getting committed; yuck.  I'm not
> sure whether the fix is right, but this definitely seems like a
> bug.

+1

It is one thing to send a request and experience a crash or loss of
connection before a response is delivered.  You have to consider
that the state of the transaction is indeterminate and needs to be
checked.  But if the client receives a response saying that the
commit was successful, or that the transaction was rolled back,
that had better reflect reality; otherwise it is a clear bug.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I don't agree with this analysis.  If the connection is closed after
>> the client sends a COMMIT and before it gets a response, then the
>> client must indeed be smart enough to figure out whether or not the
>> commit happened.  But if the server sends a response, the client
>> should be able to rely on that response being correct.  In this case,
>> an ERROR is getting sent but the transaction is getting committed;
>> yuck.  I'm not sure whether the fix is right, but this definitely
>> seems like a bug.
>
> In general, the only way to avoid that sort of behavior for a post-commit
> error would be to PANIC ... and even then, the transaction got committed,
> which might not be the expectation of a client that got an error message,
> even if it said PANIC.  So this whole area is a minefield, and the only
> attractive thing we can do is to try to reduce the number of errors that
> can get thrown post-commit.  We already, for example, do not treat
> post-commit file unlink failures as ERROR, though we surely would prefer
> to do that.

We could treated it as a lost-communication scenario.  The appropriate
recovery actions from the client's point of view are identical.

> So from this standpoint, redefining SIGINT as not throwing an error when
> we're in post-commit seems like a good idea.  I'm not endorsing any
> details of the patch here, but the 20000-foot view seems generally sound.

Cool, that makes sense to me also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ...  So this whole area is a minefield, and the only
>> attractive thing we can do is to try to reduce the number of errors that
>> can get thrown post-commit.  We already, for example, do not treat
>> post-commit file unlink failures as ERROR, though we surely would prefer
>> to do that.

> We could treated it as a lost-communication scenario.  The appropriate
> recovery actions from the client's point of view are identical.

I'd hardly rate that as an attractive option.
        regards, tom lane



On Tue, Jun 10, 2014 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> ...  So this whole area is a minefield, and the only
>>> attractive thing we can do is to try to reduce the number of errors that
>>> can get thrown post-commit.  We already, for example, do not treat
>>> post-commit file unlink failures as ERROR, though we surely would prefer
>>> to do that.
>
>> We could treated it as a lost-communication scenario.  The appropriate
>> recovery actions from the client's point of view are identical.
>
> I'd hardly rate that as an attractive option.

Well, the only other principled fix I can see is to add a new reponse
along the lines of ERRORBUTITCOMMITTED, which does not seem attractive
either, since all clients will have to be taught to understand it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Hi,

> Well, the only other principled fix I can see is to add a new reponse
> along the lines of ERRORBUTITCOMMITTED, which does not seem attractive
> either, since all clients will have to be taught to understand it.

+1

I think current specification hard to understand for many users.
It is really good if PostgreSQL gave us a message such as a replication abort warning:
###
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been replicated to the standby.
###

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: anzai-naoya@mxu.nes.nec.co.jp
---



On Tue, Jun 10, 2014 at 10:30:24AM -0400, Robert Haas wrote:
> On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> I don't agree with this analysis.  If the connection is closed after
> >> the client sends a COMMIT and before it gets a response, then the
> >> client must indeed be smart enough to figure out whether or not the
> >> commit happened.  But if the server sends a response, the client
> >> should be able to rely on that response being correct.  In this case,
> >> an ERROR is getting sent but the transaction is getting committed;
> >> yuck.  I'm not sure whether the fix is right, but this definitely
> >> seems like a bug.
> >
> > In general, the only way to avoid that sort of behavior for a post-commit
> > error would be to PANIC ... and even then, the transaction got committed,
> > which might not be the expectation of a client that got an error message,
> > even if it said PANIC.  So this whole area is a minefield, and the only
> > attractive thing we can do is to try to reduce the number of errors that
> > can get thrown post-commit.  We already, for example, do not treat
> > post-commit file unlink failures as ERROR, though we surely would prefer
> > to do that.
> 
> We could treated it as a lost-communication scenario.  The appropriate
> recovery actions from the client's point of view are identical.
> 
> > So from this standpoint, redefining SIGINT as not throwing an error when
> > we're in post-commit seems like a good idea.  I'm not endorsing any
> > details of the patch here, but the 20000-foot view seems generally sound.
> 
> Cool, that makes sense to me also.

Did we ever do anything about this?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



On Wed, Sep 10, 2014 at 08:10:45PM -0400, Bruce Momjian wrote:
> On Tue, Jun 10, 2014 at 10:30:24AM -0400, Robert Haas wrote:
> > On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Robert Haas <robertmhaas@gmail.com> writes:
> > >> I don't agree with this analysis.  If the connection is closed after
> > >> the client sends a COMMIT and before it gets a response, then the
> > >> client must indeed be smart enough to figure out whether or not the
> > >> commit happened.  But if the server sends a response, the client
> > >> should be able to rely on that response being correct.  In this case,
> > >> an ERROR is getting sent but the transaction is getting committed;
> > >> yuck.  I'm not sure whether the fix is right, but this definitely
> > >> seems like a bug.
> > >
> > > In general, the only way to avoid that sort of behavior for a post-commit
> > > error would be to PANIC ... and even then, the transaction got committed,
> > > which might not be the expectation of a client that got an error message,
> > > even if it said PANIC.  So this whole area is a minefield, and the only
> > > attractive thing we can do is to try to reduce the number of errors that
> > > can get thrown post-commit.  We already, for example, do not treat
> > > post-commit file unlink failures as ERROR, though we surely would prefer
> > > to do that.
> >
> > We could treated it as a lost-communication scenario.  The appropriate
> > recovery actions from the client's point of view are identical.
> >
> > > So from this standpoint, redefining SIGINT as not throwing an error when
> > > we're in post-commit seems like a good idea.  I'm not endorsing any
> > > details of the patch here, but the 20000-foot view seems generally sound.
> >
> > Cool, that makes sense to me also.
>
> Did we ever do anything about this?

I have researched this issue originally reported in June of 2014 and
implemented a patch to ignore cancel while we are completing a commit.
I am not clear if this is the proper place for this code, though a
disable_timeout() call on the line above suggests I am close.  :-)
(The disable_timeout disables internal timeouts, but it doesn't disable
cancels coming from the client.)

The first patch is for testing and adds a sleep(5) to the end of the
TRUNCATE command, to give the tester time to press Control-C from psql,
and enables log_duration so the cancel is checked.

The second patch is the patch that disables cancel when we are in the
process of committing;  before:

    test=> CREATE TABLE test(x INT);
    CREATE TABLE
    test=> INSERT INTO test VALUES (3);
    INSERT 0 1
    test=> TRUNCATE test;
    ^CCancel request sent
-->    ERROR:  canceling statement due to user request
    test=> SELECT * FROM test;
     x
    ---
    (0 rows)

and with both patches:

    test=> CREATE TABLE test(x INT);
    CREATE TABLE
    test=> INSERT INTO test VALUES (3);
    INSERT 0 1
    test=> TRUNCATE test;
    ^CCancel request sent
-->    TRUNCATE TABLE
    test=> SELECT * FROM test;
     x
    ---
    (0 rows)

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment
On Wed, Mar 18, 2015 at 10:56 PM, Bruce Momjian <bruce@momjian.us> wrote:
> I have researched this issue originally reported in June of 2014 and
> implemented a patch to ignore cancel while we are completing a commit.
> I am not clear if this is the proper place for this code, though a
> disable_timeout() call on the line above suggests I am close.  :-)

This would also disable cancel interrupts while running AFTER
triggers, which seems almost certain to be wrong.  TBH, I'm not sure
why the existing HOLD_INTERRUPTS() in CommitTransaction() isn't
already preventing this problem.  Did you investigate that at all?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, Mar 19, 2015 at 07:54:02AM -0400, Robert Haas wrote:
> On Wed, Mar 18, 2015 at 10:56 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > I have researched this issue originally reported in June of 2014 and
> > implemented a patch to ignore cancel while we are completing a commit.
> > I am not clear if this is the proper place for this code, though a
> > disable_timeout() call on the line above suggests I am close.  :-)
>
> This would also disable cancel interrupts while running AFTER
> triggers, which seems almost certain to be wrong.  TBH, I'm not sure
> why the existing HOLD_INTERRUPTS() in CommitTransaction() isn't
> already preventing this problem.  Did you investigate that at all?

Yes, the situation is complex, and was suggested by the original poster.
The issue with CommitTransaction() is that it only _holds_ the signal
--- it doesn't clear it.  Now, since there are very few
CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
signal is normally erased.  However, if log_duration or
log_min_duration_statement are set, they call ereport, which calls
errfinish(), which has a call to CHECK_FOR_INTERRUPTS().

First attached patch is more surgical and clears a possible cancel
request before we report the query duration in the logs --- this doesn't
affect any after triggers that might include CHECK_FOR_INTERRUPTS()
calls we want to honor.

Another approach would be to have CommitTransaction() clear any pending
cancel before it calls RESUME_INTERRUPTS().  The second attached patch
takes that approach, and also works.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment
On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Mar 19, 2015 at 07:54:02AM -0400, Robert Haas wrote:
>> On Wed, Mar 18, 2015 at 10:56 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> > I have researched this issue originally reported in June of 2014 and
>> > implemented a patch to ignore cancel while we are completing a commit.
>> > I am not clear if this is the proper place for this code, though a
>> > disable_timeout() call on the line above suggests I am close.  :-)
>>
>> This would also disable cancel interrupts while running AFTER
>> triggers, which seems almost certain to be wrong.  TBH, I'm not sure
>> why the existing HOLD_INTERRUPTS() in CommitTransaction() isn't
>> already preventing this problem.  Did you investigate that at all?
>
> Yes, the situation is complex, and was suggested by the original poster.
> The issue with CommitTransaction() is that it only _holds_ the signal
> --- it doesn't clear it.  Now, since there are very few
> CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
> signal is normally erased.  However, if log_duration or
> log_min_duration_statement are set, they call ereport, which calls
> errfinish(), which has a call to CHECK_FOR_INTERRUPTS().
>
> First attached patch is more surgical and clears a possible cancel
> request before we report the query duration in the logs --- this doesn't
> affect any after triggers that might include CHECK_FOR_INTERRUPTS()
> calls we want to honor.
>
> Another approach would be to have CommitTransaction() clear any pending
> cancel before it calls RESUME_INTERRUPTS().  The second attached patch
> takes that approach, and also works.

So, either way, what happens if the query cancel shows up just an
instant after you clear the flag?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas wrote:
> On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:

> > The issue with CommitTransaction() is that it only _holds_ the signal
> > --- it doesn't clear it.  Now, since there are very few
> > CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
> > signal is normally erased.  However, if log_duration or
> > log_min_duration_statement are set, they call ereport, which calls
> > errfinish(), which has a call to CHECK_FOR_INTERRUPTS().
> >
> > First attached patch is more surgical and clears a possible cancel
> > request before we report the query duration in the logs --- this doesn't
> > affect any after triggers that might include CHECK_FOR_INTERRUPTS()
> > calls we want to honor.
> >
> > Another approach would be to have CommitTransaction() clear any pending
> > cancel before it calls RESUME_INTERRUPTS().  The second attached patch
> > takes that approach, and also works.
> 
> So, either way, what happens if the query cancel shows up just an
> instant after you clear the flag?

I don't understand why aren't interrupts held until after the commit is
done -- including across the mentioned ereports.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Thu, Mar 19, 2015 at 04:36:38PM -0400, Robert Haas wrote:
> On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > First attached patch is more surgical and clears a possible cancel
> > request before we report the query duration in the logs --- this doesn't
> > affect any after triggers that might include CHECK_FOR_INTERRUPTS()
> > calls we want to honor.
> >
> > Another approach would be to have CommitTransaction() clear any pending
> > cancel before it calls RESUME_INTERRUPTS().  The second attached patch
> > takes that approach, and also works.
>
> So, either way, what happens if the query cancel shows up just an
> instant after you clear the flag?

Oh, good point.  This version handles that case addressing only the
log_duration* block.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment
On Thu, Mar 19, 2015 at 06:59:20PM -0300, Alvaro Herrera wrote:
> Robert Haas wrote:
> > On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:
>
> > > The issue with CommitTransaction() is that it only _holds_ the signal
> > > --- it doesn't clear it.  Now, since there are very few
> > > CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
> > > signal is normally erased.  However, if log_duration or
> > > log_min_duration_statement are set, they call ereport, which calls
> > > errfinish(), which has a call to CHECK_FOR_INTERRUPTS().
> > >
> > > First attached patch is more surgical and clears a possible cancel
> > > request before we report the query duration in the logs --- this doesn't
> > > affect any after triggers that might include CHECK_FOR_INTERRUPTS()
> > > calls we want to honor.
> > >
> > > Another approach would be to have CommitTransaction() clear any pending
> > > cancel before it calls RESUME_INTERRUPTS().  The second attached patch
> > > takes that approach, and also works.
> >
> > So, either way, what happens if the query cancel shows up just an
> > instant after you clear the flag?
>
> I don't understand why aren't interrupts held until after the commit is
> done -- including across the mentioned ereports.

Uh, I think Robert was thinking of pre-commit triggers at the top of
CommitTransaction() that might take a long time and we might want to
cancel.  In fact, he is right that mid-way into CommitTransaction(),
after those pre-commit triggers, we do HOLD_INTERRUPTS(), then set our
clog bit and continue to the bottom of that function.  What is happening
is that we don't _clear_ the cancel bit and log_duration is finding the
cancel.

In an ideal world, we would clear the client cancel in
CommitTransaction() and when we do log_duration*, and the attached patch
now does that.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment
Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Mar 19, 2015 at 04:36:38PM -0400, Robert Haas wrote:
>> So, either way, what happens if the query cancel shows up just an
>> instant after you clear the flag?

> Oh, good point.  This version handles that case addressing only the
> log_duration* block.

This is just moving the failure cases around, and not by very much either.

The core issue here, I think, is that xact.c only holds off interrupts
during what it considers to be the commit critical section --- which is
okay from the standpoint of transactional consistency.  But the complaint
has to do with what the client perceives to have happened if a SIGINT
arrives somewhere between where xact.c has committed and where postgres.c
has reported the commit to the client.  If we want to address that, I
think postgres.c needs to hold off interrupts for the entire duration from
just before CommitTransactionCommand() to just after ReadyForQuery().
That's likely to be rather messy, because there are so many code paths
there, especially when you consider error cases.

A possible way to do this without incurring unacceptably high risk of
breakage (in particular, ending up with interrupts still held off when
they shouldn't be any longer) is to assume that there should never be a
case where we reach ReadCommand() with interrupts still held off.  Then
we could invent an additional interrupt primitive "RESET_INTERRUPTS()"
that forces InterruptHoldoffCount to zero (and, presumably, then does
a CHECK_FOR_INTERRUPTS()); then putting a HOLD_INTERRUPTS() before calling
CommitTransactionCommand() and a RESET_INTERRUPTS() before waiting for
client input would presumably be pretty safe.  On the other hand, that
approach could easily mask interrupt holdoff mismatch bugs elsewhere in
the code base.
        regards, tom lane



Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Mar 19, 2015 at 06:59:20PM -0300, Alvaro Herrera wrote:
>> I don't understand why aren't interrupts held until after the commit is
>> done -- including across the mentioned ereports.

> Uh, I think Robert was thinking of pre-commit triggers at the top of
> CommitTransaction() that might take a long time and we might want to
> cancel.

Yeah, that's a good point.  So really the only way to make this work as
requested is to have some cooperation between xact.c and postgres.c,
so that the hold taken midway through CommitTransaction is kept until
we reach the idle point.

The attached is only very lightly tested but shows what we probably
would need for this.  It's a bit messy in that the API for
CommitTransactionCommand leaves it unspecified whether interrupts are
held at exit; I'm not sure if it's useful or feasible to be more precise.

            regards, tom lane

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1495bb4..cedf5ee 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*************** static void CallSubXactCallbacks(SubXact
*** 274,280 ****
  static void CleanupTransaction(void);
  static void CheckTransactionChain(bool isTopLevel, bool throwError,
                        const char *stmtType);
! static void CommitTransaction(void);
  static TransactionId RecordTransactionAbort(bool isSubXact);
  static void StartTransaction(void);

--- 274,280 ----
  static void CleanupTransaction(void);
  static void CheckTransactionChain(bool isTopLevel, bool throwError,
                        const char *stmtType);
! static void CommitTransaction(bool keep_holdoff);
  static TransactionId RecordTransactionAbort(bool isSubXact);
  static void StartTransaction(void);

*************** StartTransaction(void)
*** 1762,1771 ****
  /*
   *    CommitTransaction
   *
   * NB: if you change this routine, better look at PrepareTransaction too!
   */
  static void
! CommitTransaction(void)
  {
      TransactionState s = CurrentTransactionState;
      TransactionId latestXid;
--- 1762,1775 ----
  /*
   *    CommitTransaction
   *
+  * If keep_holdoff is true, we return with interrupts still held off;
+  * this is appropriate if caller has something to do that should be
+  * considered part of post-commit cleanup.
+  *
   * NB: if you change this routine, better look at PrepareTransaction too!
   */
  static void
! CommitTransaction(bool keep_holdoff)
  {
      TransactionState s = CurrentTransactionState;
      TransactionId latestXid;
*************** CommitTransaction(void)
*** 1837,1843 ****
       */
      PreCommit_Notify();

!     /* Prevent cancel/die interrupt while cleaning up */
      HOLD_INTERRUPTS();

      /* Commit updates to the relation map --- do this as late as possible */
--- 1841,1851 ----
       */
      PreCommit_Notify();

!     /*
!      * Prevent cancel/die interrupt while cleaning up.  If caller passes
!      * keep_holdoff = true, interrupts will remain held till caller resumes
!      * them.
!      */
      HOLD_INTERRUPTS();

      /* Commit updates to the relation map --- do this as late as possible */
*************** CommitTransaction(void)
*** 1958,1974 ****
       */
      s->state = TRANS_DEFAULT;

!     RESUME_INTERRUPTS();
  }


  /*
   *    PrepareTransaction
   *
   * NB: if you change this routine, better look at CommitTransaction too!
   */
  static void
! PrepareTransaction(void)
  {
      TransactionState s = CurrentTransactionState;
      TransactionId xid = GetCurrentTransactionId();
--- 1966,1987 ----
       */
      s->state = TRANS_DEFAULT;

!     if (!keep_holdoff)
!         RESUME_INTERRUPTS();
  }


  /*
   *    PrepareTransaction
   *
+  * If keep_holdoff is true, we return with interrupts still held off;
+  * this is appropriate if caller has something to do that should be
+  * considered part of post-commit cleanup.
+  *
   * NB: if you change this routine, better look at CommitTransaction too!
   */
  static void
! PrepareTransaction(bool keep_holdoff)
  {
      TransactionState s = CurrentTransactionState;
      TransactionId xid = GetCurrentTransactionId();
*************** PrepareTransaction(void)
*** 2067,2073 ****
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
          errmsg("cannot PREPARE a transaction that has exported snapshots")));

!     /* Prevent cancel/die interrupt while cleaning up */
      HOLD_INTERRUPTS();

      /*
--- 2080,2090 ----
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
          errmsg("cannot PREPARE a transaction that has exported snapshots")));

!     /*
!      * Prevent cancel/die interrupt while cleaning up.  If caller passes
!      * keep_holdoff = true, interrupts will remain held till caller resumes
!      * them.
!      */
      HOLD_INTERRUPTS();

      /*
*************** PrepareTransaction(void)
*** 2225,2231 ****
       */
      s->state = TRANS_DEFAULT;

!     RESUME_INTERRUPTS();
  }


--- 2242,2249 ----
       */
      s->state = TRANS_DEFAULT;

!     if (!keep_holdoff)
!         RESUME_INTERRUPTS();
  }


*************** StartTransactionCommand(void)
*** 2492,2500 ****

  /*
   *    CommitTransactionCommand
   */
  void
! CommitTransactionCommand(void)
  {
      TransactionState s = CurrentTransactionState;

--- 2510,2524 ----

  /*
   *    CommitTransactionCommand
+  *
+  * If keep_holdoff is true, and we perform a transaction commit or prepare,
+  * we return with interrupts still held off; this is appropriate if caller has
+  * something to do that should be considered part of post-commit cleanup.
+  * Note that callers passing "true" must be prepared for either the case that
+  * interrupts are held or that they're not.
   */
  void
! CommitTransactionCommand(bool keep_holdoff)
  {
      TransactionState s = CurrentTransactionState;

*************** CommitTransactionCommand(void)
*** 2515,2521 ****
               * transaction commit, and return to the idle state.
               */
          case TBLOCK_STARTED:
!             CommitTransaction();
              s->blockState = TBLOCK_DEFAULT;
              break;

--- 2539,2545 ----
               * transaction commit, and return to the idle state.
               */
          case TBLOCK_STARTED:
!             CommitTransaction(keep_holdoff);
              s->blockState = TBLOCK_DEFAULT;
              break;

*************** CommitTransactionCommand(void)
*** 2544,2550 ****
               * idle state.
               */
          case TBLOCK_END:
!             CommitTransaction();
              s->blockState = TBLOCK_DEFAULT;
              break;

--- 2568,2574 ----
               * idle state.
               */
          case TBLOCK_END:
!             CommitTransaction(keep_holdoff);
              s->blockState = TBLOCK_DEFAULT;
              break;

*************** CommitTransactionCommand(void)
*** 2583,2589 ****
               * return to the idle state.
               */
          case TBLOCK_PREPARE:
!             PrepareTransaction();
              s->blockState = TBLOCK_DEFAULT;
              break;

--- 2607,2613 ----
               * return to the idle state.
               */
          case TBLOCK_PREPARE:
!             PrepareTransaction(keep_holdoff);
              s->blockState = TBLOCK_DEFAULT;
              break;

*************** CommitTransactionCommand(void)
*** 2634,2646 ****
              if (s->blockState == TBLOCK_END)
              {
                  Assert(s->parent == NULL);
!                 CommitTransaction();
                  s->blockState = TBLOCK_DEFAULT;
              }
              else if (s->blockState == TBLOCK_PREPARE)
              {
                  Assert(s->parent == NULL);
!                 PrepareTransaction();
                  s->blockState = TBLOCK_DEFAULT;
              }
              else
--- 2658,2670 ----
              if (s->blockState == TBLOCK_END)
              {
                  Assert(s->parent == NULL);
!                 CommitTransaction(keep_holdoff);
                  s->blockState = TBLOCK_DEFAULT;
              }
              else if (s->blockState == TBLOCK_PREPARE)
              {
                  Assert(s->parent == NULL);
!                 PrepareTransaction(keep_holdoff);
                  s->blockState = TBLOCK_DEFAULT;
              }
              else
*************** CommitTransactionCommand(void)
*** 2655,2661 ****
               */
          case TBLOCK_SUBABORT_END:
              CleanupSubTransaction();
!             CommitTransactionCommand();
              break;

              /*
--- 2679,2685 ----
               */
          case TBLOCK_SUBABORT_END:
              CleanupSubTransaction();
!             CommitTransactionCommand(keep_holdoff);
              break;

              /*
*************** CommitTransactionCommand(void)
*** 2664,2670 ****
          case TBLOCK_SUBABORT_PENDING:
              AbortSubTransaction();
              CleanupSubTransaction();
!             CommitTransactionCommand();
              break;

              /*
--- 2688,2694 ----
          case TBLOCK_SUBABORT_PENDING:
              AbortSubTransaction();
              CleanupSubTransaction();
!             CommitTransactionCommand(keep_holdoff);
              break;

              /*
*************** BeginInternalSubTransaction(char *name)
*** 3779,3785 ****
              break;
      }

!     CommitTransactionCommand();
      StartTransactionCommand();
  }

--- 3803,3809 ----
              break;
      }

!     CommitTransactionCommand(false);
      StartTransactionCommand();
  }

diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index 6e563b6..cb14f08 100644
*** a/src/backend/bootstrap/bootparse.y
--- b/src/backend/bootstrap/bootparse.y
*************** do_start(void)
*** 77,83 ****
  static void
  do_end(void)
  {
!     CommitTransactionCommand();
      elog(DEBUG4, "commit transaction");
      CHECK_FOR_INTERRUPTS();        /* allow SIGINT to kill bootstrap run */
      if (isatty(0))
--- 77,83 ----
  static void
  do_end(void)
  {
!     CommitTransactionCommand(false);
      elog(DEBUG4, "commit transaction");
      CHECK_FOR_INTERRUPTS();        /* allow SIGINT to kill bootstrap run */
      if (isatty(0))
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f85ed93..0bef8ad 100644
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
*************** index_drop(Oid indexId, bool concurrent)
*** 1453,1459 ****
          LockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);

          PopActiveSnapshot();
!         CommitTransactionCommand();
          StartTransactionCommand();

          /*
--- 1453,1459 ----
          LockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);

          PopActiveSnapshot();
!         CommitTransactionCommand(false);
          StartTransactionCommand();

          /*
*************** index_drop(Oid indexId, bool concurrent)
*** 1507,1513 ****
           * Again, commit the transaction to make the pg_index update visible
           * to other sessions.
           */
!         CommitTransactionCommand();
          StartTransactionCommand();

          /*
--- 1507,1513 ----
           * Again, commit the transaction to make the pg_index update visible
           * to other sessions.
           */
!         CommitTransactionCommand(false);
          StartTransactionCommand();

          /*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 1af977c..460c1fd 100644
*** a/src/backend/catalog/namespace.c
--- b/src/backend/catalog/namespace.c
*************** RemoveTempRelationsCallback(int code, Da
*** 3855,3861 ****

          RemoveTempRelations(myTempNamespace);

!         CommitTransactionCommand();
      }
  }

--- 3855,3861 ----

          RemoveTempRelations(myTempNamespace);

!         CommitTransactionCommand(false);
      }
  }

diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 2826b7e..1bee7a6 100644
*** a/src/backend/commands/async.c
--- b/src/backend/commands/async.c
*************** ProcessCompletedNotifies(void)
*** 1104,1110 ****
          asyncQueueAdvanceTail();
      }

!     CommitTransactionCommand();

      MemoryContextSwitchTo(caller_context);

--- 1104,1110 ----
          asyncQueueAdvanceTail();
      }

!     CommitTransactionCommand(false);

      MemoryContextSwitchTo(caller_context);

*************** ProcessIncomingNotify(void)
*** 1988,1994 ****

      asyncQueueReadAllNotifications();

!     CommitTransactionCommand();

      /*
       * Must flush the notify messages to ensure frontend gets them promptly.
--- 1988,1994 ----

      asyncQueueReadAllNotifications();

!     CommitTransactionCommand(false);

      /*
       * Must flush the notify messages to ensure frontend gets them promptly.
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 3febdd5..1c5d476 100644
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
*************** cluster(ClusterStmt *stmt, bool isTopLev
*** 214,220 ****

          /* Commit to get out of starting transaction */
          PopActiveSnapshot();
!         CommitTransactionCommand();

          /* Ok, now that we've got them all, cluster them one by one */
          foreach(rv, rvs)
--- 214,220 ----

          /* Commit to get out of starting transaction */
          PopActiveSnapshot();
!         CommitTransactionCommand(false);

          /* Ok, now that we've got them all, cluster them one by one */
          foreach(rv, rvs)
*************** cluster(ClusterStmt *stmt, bool isTopLev
*** 228,234 ****
              /* Do the job. */
              cluster_rel(rvtc->tableOid, rvtc->indexOid, true, stmt->verbose);
              PopActiveSnapshot();
!             CommitTransactionCommand();
          }

          /* Start a new transaction for the cleanup work. */
--- 228,234 ----
              /* Do the job. */
              cluster_rel(rvtc->tableOid, rvtc->indexOid, true, stmt->verbose);
              PopActiveSnapshot();
!             CommitTransactionCommand(false);
          }

          /* Start a new transaction for the cleanup work. */
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index a699ce3..8f7a51f 100644
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** movedb(const char *dbname, const char *t
*** 1312,1318 ****
       * really commits.
       */
      PopActiveSnapshot();
!     CommitTransactionCommand();

      /* Start new transaction for the remaining work; don't need a snapshot */
      StartTransactionCommand();
--- 1312,1318 ----
       * really commits.
       */
      PopActiveSnapshot();
!     CommitTransactionCommand(false);

      /* Start new transaction for the remaining work; don't need a snapshot */
      StartTransactionCommand();
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 1c1d0da..b76738c 100644
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
*************** DefineIndex(Oid relationId,
*** 663,669 ****
      LockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);

      PopActiveSnapshot();
!     CommitTransactionCommand();
      StartTransactionCommand();

      /*
--- 663,669 ----
      LockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);

      PopActiveSnapshot();
!     CommitTransactionCommand(false);
      StartTransactionCommand();

      /*
*************** DefineIndex(Oid relationId,
*** 737,743 ****
      /*
       * Commit this transaction to make the indisready update visible.
       */
!     CommitTransactionCommand();
      StartTransactionCommand();

      /*
--- 737,743 ----
      /*
       * Commit this transaction to make the indisready update visible.
       */
!     CommitTransactionCommand(false);
      StartTransactionCommand();

      /*
*************** ReindexMultipleTables(const char *object
*** 1914,1920 ****

      /* Now reindex each rel in a separate transaction */
      PopActiveSnapshot();
!     CommitTransactionCommand();
      foreach(l, relids)
      {
          Oid            relid = lfirst_oid(l);
--- 1914,1920 ----

      /* Now reindex each rel in a separate transaction */
      PopActiveSnapshot();
!     CommitTransactionCommand(false);
      foreach(l, relids)
      {
          Oid            relid = lfirst_oid(l);
*************** ReindexMultipleTables(const char *object
*** 1930,1936 ****
                              get_namespace_name(get_rel_namespace(relid)),
                              get_rel_name(relid))));
          PopActiveSnapshot();
!         CommitTransactionCommand();
      }
      StartTransactionCommand();

--- 1930,1936 ----
                              get_namespace_name(get_rel_namespace(relid)),
                              get_rel_name(relid))));
          PopActiveSnapshot();
!         CommitTransactionCommand(false);
      }
      StartTransactionCommand();

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bd57b68..c69610d 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** vacuum(int options, RangeVar *relation,
*** 263,269 ****
              PopActiveSnapshot();

          /* matches the StartTransaction in PostgresMain() */
!         CommitTransactionCommand();
      }

      /* Turn vacuum cost accounting on or off */
--- 263,269 ----
              PopActiveSnapshot();

          /* matches the StartTransaction in PostgresMain() */
!         CommitTransactionCommand(false);
      }

      /* Turn vacuum cost accounting on or off */
*************** vacuum(int options, RangeVar *relation,
*** 310,316 ****
                  if (use_own_xacts)
                  {
                      PopActiveSnapshot();
!                     CommitTransactionCommand();
                  }
              }
          }
--- 310,316 ----
                  if (use_own_xacts)
                  {
                      PopActiveSnapshot();
!                     CommitTransactionCommand(false);
                  }
              }
          }
*************** vacuum_rel(Oid relid, RangeVar *relation
*** 1243,1249 ****
      if (!onerel)
      {
          PopActiveSnapshot();
!         CommitTransactionCommand();
          return false;
      }

--- 1243,1249 ----
      if (!onerel)
      {
          PopActiveSnapshot();
!         CommitTransactionCommand(false);
          return false;
      }

*************** vacuum_rel(Oid relid, RangeVar *relation
*** 1274,1280 ****
                              RelationGetRelationName(onerel))));
          relation_close(onerel, lmode);
          PopActiveSnapshot();
!         CommitTransactionCommand();
          return false;
      }

--- 1274,1280 ----
                              RelationGetRelationName(onerel))));
          relation_close(onerel, lmode);
          PopActiveSnapshot();
!         CommitTransactionCommand(false);
          return false;
      }

*************** vacuum_rel(Oid relid, RangeVar *relation
*** 1292,1298 ****
                          RelationGetRelationName(onerel))));
          relation_close(onerel, lmode);
          PopActiveSnapshot();
!         CommitTransactionCommand();
          return false;
      }

--- 1292,1298 ----
                          RelationGetRelationName(onerel))));
          relation_close(onerel, lmode);
          PopActiveSnapshot();
!         CommitTransactionCommand(false);
          return false;
      }

*************** vacuum_rel(Oid relid, RangeVar *relation
*** 1307,1313 ****
      {
          relation_close(onerel, lmode);
          PopActiveSnapshot();
!         CommitTransactionCommand();
          return false;
      }

--- 1307,1313 ----
      {
          relation_close(onerel, lmode);
          PopActiveSnapshot();
!         CommitTransactionCommand(false);
          return false;
      }

*************** vacuum_rel(Oid relid, RangeVar *relation
*** 1375,1381 ****
       * Complete the transaction and free all temporary memory used.
       */
      PopActiveSnapshot();
!     CommitTransactionCommand();

      /*
       * If the relation has a secondary toast rel, vacuum that too while we
--- 1375,1381 ----
       * Complete the transaction and free all temporary memory used.
       */
      PopActiveSnapshot();
!     CommitTransactionCommand(false);

      /*
       * If the relation has a secondary toast rel, vacuum that too while we
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 5ccae24..b593437 100644
*** a/src/backend/postmaster/autovacuum.c
--- b/src/backend/postmaster/autovacuum.c
*************** get_database_list(void)
*** 1847,1853 ****
      heap_endscan(scan);
      heap_close(rel, AccessShareLock);

!     CommitTransactionCommand();

      return dblist;
  }
--- 1847,1853 ----
      heap_endscan(scan);
      heap_close(rel, AccessShareLock);

!     CommitTransactionCommand(false);

      return dblist;
  }
*************** deleted:
*** 2355,2361 ****
      vac_update_datfrozenxid();

      /* Finally close out the last transaction. */
!     CommitTransactionCommand();
  }

  /*
--- 2355,2361 ----
      vac_update_datfrozenxid();

      /* Finally close out the last transaction. */
!     CommitTransactionCommand(false);
  }

  /*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2956119..833fd69 100644
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*************** IdentifySystem(void)
*** 331,337 ****
          /* make dbname live outside TX context */
          MemoryContextSwitchTo(cur);
          dbname = get_database_name(MyDatabaseId);
!         CommitTransactionCommand();
          /* CommitTransactionCommand switches to TopMemoryContext */
          MemoryContextSwitchTo(cur);
      }
--- 331,337 ----
          /* make dbname live outside TX context */
          MemoryContextSwitchTo(cur);
          dbname = get_database_name(MyDatabaseId);
!         CommitTransactionCommand(false);
          /* CommitTransactionCommand switches to TopMemoryContext */
          MemoryContextSwitchTo(cur);
      }
diff --git a/src/backend/storage/ipc/sinval.c b/src/backend/storage/ipc/sinval.c
index 67ec515..ed3961e 100644
*** a/src/backend/storage/ipc/sinval.c
--- b/src/backend/storage/ipc/sinval.c
*************** ProcessCatchupInterrupt(void)
*** 200,206 ****
          {
              elog(DEBUG4, "ProcessCatchupEvent outside transaction");
              StartTransactionCommand();
!             CommitTransactionCommand();
          }
      }
  }
--- 200,206 ----
          {
              elog(DEBUG4, "ProcessCatchupEvent outside transaction");
              StartTransactionCommand();
!             CommitTransactionCommand(false);
          }
      }
  }
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 33720e8..ab361cd 100644
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** static int    errdetail_params(ParamListInf
*** 190,196 ****
  static int    errdetail_abort(void);
  static int    errdetail_recovery_conflict(void);
  static void start_xact_command(void);
! static void finish_xact_command(void);
  static bool IsTransactionExitStmt(Node *parsetree);
  static bool IsTransactionExitStmtList(List *parseTrees);
  static bool IsTransactionStmtList(List *parseTrees);
--- 190,196 ----
  static int    errdetail_abort(void);
  static int    errdetail_recovery_conflict(void);
  static void start_xact_command(void);
! static void finish_xact_command(bool keep_holdoff);
  static bool IsTransactionExitStmt(Node *parsetree);
  static bool IsTransactionExitStmtList(List *parseTrees);
  static bool IsTransactionStmtList(List *parseTrees);
*************** exec_simple_query(const char *query_stri
*** 1111,1125 ****

          PortalDrop(portal, false);

!         if (IsA(parsetree, TransactionStmt))
!         {
!             /*
!              * If this was a transaction control statement, commit it. We will
!              * start a new xact command for the next command (if any).
!              */
!             finish_xact_command();
!         }
!         else if (lnext(parsetree_item) == NULL)
          {
              /*
               * If this is the last parsetree of the query string, close down
--- 1111,1117 ----

          PortalDrop(portal, false);

!         if (lnext(parsetree_item) == NULL)
          {
              /*
               * If this is the last parsetree of the query string, close down
*************** exec_simple_query(const char *query_stri
*** 1131,1137 ****
               * historical Postgres behavior, we do not force a transaction
               * boundary between queries appearing in a single query string.
               */
!             finish_xact_command();
          }
          else
          {
--- 1123,1138 ----
               * historical Postgres behavior, we do not force a transaction
               * boundary between queries appearing in a single query string.
               */
!             finish_xact_command(true);
!         }
!         else if (IsA(parsetree, TransactionStmt))
!         {
!             /*
!              * If this was a transaction control statement, commit it. We will
!              * start a new xact command for the next command.  We know there
!              * is another one, so don't hold off interrupts.
!              */
!             finish_xact_command(false);
          }
          else
          {
*************** exec_simple_query(const char *query_stri
*** 1154,1160 ****
      /*
       * Close down transaction statement, if one is open.
       */
!     finish_xact_command();

      /*
       * If there were no parsetrees, return EmptyQueryResponse message.
--- 1155,1161 ----
      /*
       * Close down transaction statement, if one is open.
       */
!     finish_xact_command(true);

      /*
       * If there were no parsetrees, return EmptyQueryResponse message.
*************** exec_execute_message(const char *portal_
*** 2001,2007 ****
               * If this was a transaction control statement, commit it.  We
               * will start a new xact command for the next command (if any).
               */
!             finish_xact_command();
          }
          else
          {
--- 2002,2008 ----
               * If this was a transaction control statement, commit it.  We
               * will start a new xact command for the next command (if any).
               */
!             finish_xact_command(true);
          }
          else
          {
*************** start_xact_command(void)
*** 2453,2459 ****
  }

  static void
! finish_xact_command(void)
  {
      if (xact_started)
      {
--- 2454,2460 ----
  }

  static void
! finish_xact_command(bool keep_holdoff)
  {
      if (xact_started)
      {
*************** finish_xact_command(void)
*** 2464,2470 ****
          ereport(DEBUG3,
                  (errmsg_internal("CommitTransactionCommand")));

!         CommitTransactionCommand();

  #ifdef MEMORY_CONTEXT_CHECKING
          /* Check all memory contexts that weren't freed during commit */
--- 2465,2471 ----
          ereport(DEBUG3,
                  (errmsg_internal("CommitTransactionCommand")));

!         CommitTransactionCommand(keep_holdoff);

  #ifdef MEMORY_CONTEXT_CHECKING
          /* Check all memory contexts that weren't freed during commit */
*************** PostgresMain(int argc, char *argv[],
*** 3964,3969 ****
--- 3965,3976 ----
          }

          /*
+          * Resume interrupts, in case they were held off thanks to a COMMIT.
+          */
+         InterruptHoldoffCount = 0;
+         CHECK_FOR_INTERRUPTS();
+
+         /*
           * (2) Allow asynchronous signals to be executed immediately if they
           * come in while we are waiting for client input. (This must be
           * conditional since we don't want, say, reads on behalf of COPY FROM
*************** PostgresMain(int argc, char *argv[],
*** 4127,4133 ****
                  }

                  /* commit the function-invocation transaction */
!                 finish_xact_command();

                  send_ready_for_query = true;
                  break;
--- 4134,4140 ----
                  }

                  /* commit the function-invocation transaction */
!                 finish_xact_command(true);

                  send_ready_for_query = true;
                  break;
*************** PostgresMain(int argc, char *argv[],
*** 4216,4222 ****

              case 'S':            /* sync */
                  pq_getmsgend(&input_message);
!                 finish_xact_command();
                  send_ready_for_query = true;
                  break;

--- 4223,4229 ----

              case 'S':            /* sync */
                  pq_getmsgend(&input_message);
!                 finish_xact_command(true);
                  send_ready_for_query = true;
                  break;

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 1e646a1..59eb697 100644
*** a/src/backend/utils/init/postinit.c
--- b/src/backend/utils/init/postinit.c
*************** InitPostgres(const char *in_dbname, Oid
*** 794,800 ****
          pgstat_bestart();

          /* close the transaction we started above */
!         CommitTransactionCommand();

          return;
      }
--- 794,800 ----
          pgstat_bestart();

          /* close the transaction we started above */
!         CommitTransactionCommand(false);

          return;
      }
*************** InitPostgres(const char *in_dbname, Oid
*** 973,979 ****

      /* close the transaction we started above */
      if (!bootstrap)
!         CommitTransactionCommand();
  }

  /*
--- 973,979 ----

      /* close the transaction we started above */
      if (!bootstrap)
!         CommitTransactionCommand(false);
  }

  /*
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index fdf3ea3..59e5df0 100644
*** a/src/include/access/xact.h
--- b/src/include/access/xact.h
*************** extern bool TransactionIdIsCurrentTransa
*** 308,314 ****
  extern void CommandCounterIncrement(void);
  extern void ForceSyncCommit(void);
  extern void StartTransactionCommand(void);
! extern void CommitTransactionCommand(void);
  extern void AbortCurrentTransaction(void);
  extern void BeginTransactionBlock(void);
  extern bool EndTransactionBlock(void);
--- 308,314 ----
  extern void CommandCounterIncrement(void);
  extern void ForceSyncCommit(void);
  extern void StartTransactionCommand(void);
! extern void CommitTransactionCommand(bool keep_holdoff);
  extern void AbortCurrentTransaction(void);
  extern void BeginTransactionBlock(void);
  extern bool EndTransactionBlock(void);
diff --git a/src/test/modules/worker_spi/worker_spi.c b/src/test/modules/worker_spi/worker_spi.c
index 4149c94..c8882ff 100644
*** a/src/test/modules/worker_spi/worker_spi.c
--- b/src/test/modules/worker_spi/worker_spi.c
*************** initialize_worker_spi(worktable *table)
*** 154,160 ****

      SPI_finish();
      PopActiveSnapshot();
!     CommitTransactionCommand();
      pgstat_report_activity(STATE_IDLE, NULL);
  }

--- 154,160 ----

      SPI_finish();
      PopActiveSnapshot();
!     CommitTransactionCommand(false);
      pgstat_report_activity(STATE_IDLE, NULL);
  }

*************** worker_spi_main(Datum main_arg)
*** 291,297 ****
           */
          SPI_finish();
          PopActiveSnapshot();
!         CommitTransactionCommand();
          pgstat_report_activity(STATE_IDLE, NULL);
      }

--- 291,297 ----
           */
          SPI_finish();
          PopActiveSnapshot();
!         CommitTransactionCommand(false);
          pgstat_report_activity(STATE_IDLE, NULL);
      }


On Thu, Mar 19, 2015 at 07:19:02PM -0400, Tom Lane wrote:
> The core issue here, I think, is that xact.c only holds off interrupts
> during what it considers to be the commit critical section --- which is
> okay from the standpoint of transactional consistency.  But the complaint
> has to do with what the client perceives to have happened if a SIGINT
> arrives somewhere between where xact.c has committed and where postgres.c
> has reported the commit to the client.  If we want to address that, I
> think postgres.c needs to hold off interrupts for the entire duration from
> just before CommitTransactionCommand() to just after ReadyForQuery().
> That's likely to be rather messy, because there are so many code paths
> there, especially when you consider error cases.
> 
> A possible way to do this without incurring unacceptably high risk of
> breakage (in particular, ending up with interrupts still held off when
> they shouldn't be any longer) is to assume that there should never be a
> case where we reach ReadCommand() with interrupts still held off.  Then
> we could invent an additional interrupt primitive "RESET_INTERRUPTS()"
> that forces InterruptHoldoffCount to zero (and, presumably, then does
> a CHECK_FOR_INTERRUPTS()); then putting a HOLD_INTERRUPTS() before calling
> CommitTransactionCommand() and a RESET_INTERRUPTS() before waiting for
> client input would presumably be pretty safe.  On the other hand, that
> approach could easily mask interrupt holdoff mismatch bugs elsewhere in
> the code base.

Well, right now we allow interrupts for as long as possible, i.e. to the
middle of CommitTransaction().  Your approach would block interrupts for
a larger span, which might be worse than the bug we are fixing.  It also
feels like it would be unmodular as functions would change the blocking
state when they are called.

Tom is right that my cancel5.diff version has an area between the first
cancel erase and the second cancel erase where a cancel might arrive.  I
assumed there were no checks in that area, but I might be wrong, and
there could be checks there someday.

The fundamental problem is that the place we need to block cancels
starts several levels down in a function, and the place we need to clear
it is higher.  Maybe the entire HOLD/RESUME block API we have for this
is wrong and we just need a global variable to ignore client cancels to
be read inside the signal handler, and we just set and clear it.  I can
work on a patch if people like that idea.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



On Thu, Mar 19, 2015 at 08:43:52PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, Mar 19, 2015 at 06:59:20PM -0300, Alvaro Herrera wrote:
> >> I don't understand why aren't interrupts held until after the commit is
> >> done -- including across the mentioned ereports.
> 
> > Uh, I think Robert was thinking of pre-commit triggers at the top of
> > CommitTransaction() that might take a long time and we might want to
> > cancel.
> 
> Yeah, that's a good point.  So really the only way to make this work as
> requested is to have some cooperation between xact.c and postgres.c,
> so that the hold taken midway through CommitTransaction is kept until
> we reach the idle point.
> 
> The attached is only very lightly tested but shows what we probably
> would need for this.  It's a bit messy in that the API for
> CommitTransactionCommand leaves it unspecified whether interrupts are
> held at exit; I'm not sure if it's useful or feasible to be more precise.

Oh, I see what you are saying, and why a global variable will not work. 
I thought all paths reset the cancel state when the returned from a
commit, but it seems there are many places that don't do reset, so you
have to pass a flag down into CommitTransaction() to control that ---
makes sense.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +