Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically - Mailing list pgsql-bugs

From Magnus Hagander
Subject Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically
Date
Msg-id CABUevEzPW_OrC2nry0puBWhhz_e4+kx-Tv6=nsJGdJcGjzWUxQ@mail.gmail.com
In response to Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-bugs
On Fri, Nov 9, 2018 at 2:25 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Thu, Nov 8, 2018 at 11:31 PM Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Nov 8, 2018 at 5:31 AM PG Bug reporting form <noreply@postgresql.org> wrote:
>>
>> The following bug has been logged on the website:
>>
>> Bug reference:      15492
>> Logged by:          Alexander Lakhin
>> Email address:      exclusion@gmail.com
>> PostgreSQL version: 11.0
>> Operating system:   Windows 2012 R2
>> Description:
>>
>> When performing `make standbycheck` I get sporadic failure:
>>
>> ============== running regression test queries        ==============
>> test hs_standby_check         ... ok
>> test hs_standby_allowed       ... ok
>> test hs_standby_disallowed    ... ok
>> test hs_standby_functions     ... FAILED
>>
>> ======================
>>  1 of 4 tests failed.
>> ======================
>>
>> ***
>> C:/tmp/postgrespro-standard-10.6.1/src/test/regress/expected/hs_standby_functions.out   Wed
>> Nov  7 01:14:03 2018
>> ---
>> C:/tmp/postgrespro-standard-10.6.1/src/test/regress/results/hs_standby_functions.out    Wed
>> Nov  7 06:36:47 2018
>> ***************
>> *** 37,40 ****
>>
>>   -- suicide is painless
>>   select pg_cancel_backend(pg_backend_pid());
>> ! ERROR:  canceling statement due to user request
>> --- 37,44 ----
>>
>>   -- suicide is painless
>>   select pg_cancel_backend(pg_backend_pid());
>> !  pg_cancel_backend
>> ! -------------------
>> !  t
>> ! (1 row)
>> !
>>
>> ======================================================================
>>
>> In fact, I see the same when I just do in psql (using EnterpriseDB's
>> PostgreSQL 11 for Windows):
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>>  pg_cancel_backend
>> -------------------
>>  t
>> (1 row)
>>
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>>  pg_cancel_backend
>> -------------------
>>  t
>> (1 row)
>>
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=#
>>
>> I couldn't reproduce it on Linux, though.
>> So if it's an expected behaviour, shouldn't the hs_standby_functions check
>> be fixed?
>> (I don't understand what is the point of this pg_cancel_backend call.)
>
>
> This is clearly a timing thing.
>
> The most common case is that the signal is sent and delivered while the pg_cancel_backend() command is still executing. This is probably "always" happening on Unix due to how signals work.
>
> On Windows, what happens in the case where it returns is that the signal is delivered to the "signal thread" (the separate thread handling our signal emulation), but that thread is not scheduled to run until the pg_cancel_backend() function has actually returned. Thus it returns the value and is then canceled.
>
> That said, I agree with the question -- what is the point of this? pg_cancel_backend(pg_backend_pid()) can surely only ever cancel the pg_cancel_backend call itself, so it seems pointless.
>
> The *comment* talks about suicide, which indicates that maybe the original intention was to use pg_terminate_backend()?  But it has also been in there since 2009, so why is this problem only showing up now?

We saw a variant of this problem on appveyor (a Windows build-bot)
when testing Daniel's patch to add an optional message (search for
"timing"), and it was fixed as part of that patch, for the new code in
that patch:

https://www.postgresql.org/message-id/flat/C2C7C3EC-CC5F-44B6-9C78-637C88BD7D14@yesql.se

Perhaps other pre-existing tests need similar treatment?

Ah yes, that seems to be the same thing, and yes, that seems like a reasonable solution. So something like:
+select case
+       when pg_cancel_backend(pg_backend_pid())
+       then pg_sleep(60)
+end;
 
Alexander, can you check to see if making that change solves the issue on your machine?
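
To illustrate why the sleep should help, here is a rough Python model of the race. It is only a sketch and not PostgreSQL code: `run_query`, `cancel_backend`, `interruptible_sleep`, `check_for_interrupts` and the `SIGNAL_DELAY` value are all made up for illustration, and a timer thread stands in for the Windows signal thread being scheduled late.

```python
import threading

SIGNAL_DELAY = 0.2  # made-up lag (seconds) before the "signal thread" runs


class QueryCanceled(Exception):
    """Stands in for 'ERROR:  canceling statement due to user request'."""


def run_query(sleep_after_cancel: bool) -> str:
    """Model one SELECT that cancels its own backend."""
    cancel_pending = threading.Event()

    def check_for_interrupts() -> None:
        # The backend only notices a cancel once it has been delivered.
        if cancel_pending.is_set():
            raise QueryCanceled("canceling statement due to user request")

    def cancel_backend() -> bool:
        # The cancel is sent now but delivered later by another thread.
        timer = threading.Timer(SIGNAL_DELAY, cancel_pending.set)
        timer.daemon = True
        timer.start()
        return True  # "signal sent", regardless of when it is delivered

    def interruptible_sleep(seconds: float) -> None:
        # Wait until the cancel is delivered or the sleep runs out.
        cancel_pending.wait(timeout=seconds)
        check_for_interrupts()

    try:
        sent = cancel_backend()
        if sleep_after_cancel and sent:
            interruptible_sleep(5.0)
        check_for_interrupts()
        return "t"  # the statement finished before any cancel was seen
    except QueryCanceled as exc:
        return f"ERROR:  {exc}"
```

In this model `run_query(False)` finishes before the delayed cancel is delivered and reports `t`, while `run_query(True)` is still waiting when the cancel arrives and reports the error, which is the behaviour the test expects.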

--
