Re: Unexpected "canceling statement due to user request" error - Mailing list pgsql-general

From Tom Lane
Subject Re: Unexpected "canceling statement due to user request" error
Date
Msg-id 11561.1567381579@sss.pgh.pa.us
Whole thread Raw
In response to Re: Unexpected "canceling statement due to user request" error  (Will Storey <will@summercat.com>)
Responses Re: Unexpected "canceling statement due to user request" error
List pgsql-general
Will Storey <will@summercat.com> writes:
> Thanks for the pointer!

> It looks like we'd have to be entering that section and finding the
> statement timeout not set, otherwise we'd not fall through to the "user
> request" case.

> But as far as I can tell, the only reason my query would be cancelled is
> because of the statement timeout. I wonder if there is some way the
> statement timeout flag is being lost/cleared. I see there's situations
> where we clear timeout flags.

I poked at this for awhile and concluded that what you must be seeing is
that the statement timeout interrupt triggers, but no CHECK_FOR_INTERRUPTS
call happens thereafter, until we get to the disable_statement_timeout()
call in finish_xact_command().  That throws away the timeout-happened
indicator, but it does nothing to reset InterruptPending or
QueryCancelPending.  Those flags will also be thrown away eventually,
after we've read a new client command:

        /*
         * (4) disable async signal conditions again.
         *
         * Query cancel is supposed to be a no-op when there is no query in
         * progress, so if a query cancel arrived while we were idle, just
         * reset QueryCancelPending. ProcessInterrupts() has that effect when
         * it's called when DoingCommandRead is set, so check for interrupts
         * before resetting DoingCommandRead.
         */
        CHECK_FOR_INTERRUPTS();
        DoingCommandRead = false;

Ignoring the timeout if it happened too late is what we're supposed
to be doing, so that's fine so far.

But: what happens if some other CHECK_FOR_INTERRUPTS occurs before we
reach the main input loop?  Of course, exactly the symptom you saw:
a bogus "canceling statement due to user request" error report.
What's worse, that happens after we've completed the transaction, leaving
the client with a very misleading idea of whether the command committed
or not.

So how could that happen?  After experimenting, I found a pretty
obvious candidate: statement duration logging happens after we
do finish_xact_command, and there's a CHECK_FOR_INTERRUPTS in
errfinish().  So I hypothesize that you have log_duration or
log_min_duration_statement enabled.

I was able to replicate the problem by lobotomizing a couple of
CHECK_FOR_INTERRUPTS that would normally happen in a simple
"select pg_sleep()" query (see attached for-demonstration-
purposes-only patch).  Then I get

regression=# set statement_timeout TO '1s';
SET
regression=# select pg_sleep(2.0);  -- completes normally after 2 sec
 pg_sleep 
----------
 
(1 row)

regression=# set log_min_duration_statement TO '100ms';
SET
regression=# select pg_sleep(2.0);  -- sits for 2 sec, then:
ERROR:  canceling statement due to user request

So that's not very good and we ought to do something about it.
We can't have disable_statement_timeout() clear QueryCancelPending,
because that might lose state about some other cause for query
cancel.  I'm inclined to think that we need to split QueryCancelPending
into more than one flag.

A separate question is how come the particular query you're complaining
about has (seemingly) a fairly wide window where it never does any
CHECK_FOR_INTERRUPTS call before terminating.  Perhaps there's someplace
we need to sprinkle a few more of those.

            regards, tom lane

diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 80ed9cc..d09ac95 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -72,7 +72,7 @@ ExecResult(PlanState *pstate)
     PlanState  *outerPlan;
     ExprContext *econtext;

-    CHECK_FOR_INTERRUPTS();
+//    CHECK_FOR_INTERRUPTS();

     econtext = node->ps.ps_ExprContext;

diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index d330a88..9f515db 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -369,7 +369,7 @@ pg_sleep(PG_FUNCTION_ARGS)
         float8        delay;
         long        delay_ms;

-        CHECK_FOR_INTERRUPTS();
+//        CHECK_FOR_INTERRUPTS();

         delay = endtime - GetNowFloat();
         if (delay >= 600.0)

pgsql-general by date:

Previous
From: Thiemo Kellner
Date:
Subject: Re: pgmodeler ?
Next
From: Will Storey
Date:
Subject: Re: Unexpected "canceling statement due to user request" error