Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically
Date
Msg-id CAEepm=16ZLqpPk6LjPVORFrN0ix_zQ1VUM7Hs=iuP5ExGEE3sA@mail.gmail.com
In response to Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically  (Magnus Hagander <magnus@hagander.net>)
Responses Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically  (Magnus Hagander <magnus@hagander.net>)
List pgsql-bugs
On Thu, Nov 8, 2018 at 11:31 PM Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Nov 8, 2018 at 5:31 AM PG Bug reporting form <noreply@postgresql.org> wrote:
>>
>> The following bug has been logged on the website:
>>
>> Bug reference:      15492
>> Logged by:          Alexander Lakhin
>> Email address:      exclusion@gmail.com
>> PostgreSQL version: 11.0
>> Operating system:   Windows 2012 R2
>> Description:
>>
>> When performing `make standbycheck` I get sporadic failure:
>>
>> ============== running regression test queries        ==============
>> test hs_standby_check         ... ok
>> test hs_standby_allowed       ... ok
>> test hs_standby_disallowed    ... ok
>> test hs_standby_functions     ... FAILED
>>
>> ======================
>>  1 of 4 tests failed.
>> ======================
>>
>> ***
>> C:/tmp/postgrespro-standard-10.6.1/src/test/regress/expected/hs_standby_functions.out   Wed
>> Nov  7 01:14:03 2018
>> ---
>> C:/tmp/postgrespro-standard-10.6.1/src/test/regress/results/hs_standby_functions.out    Wed
>> Nov  7 06:36:47 2018
>> ***************
>> *** 37,40 ****
>>
>>   -- suicide is painless
>>   select pg_cancel_backend(pg_backend_pid());
>> ! ERROR:  canceling statement due to user request
>> --- 37,44 ----
>>
>>   -- suicide is painless
>>   select pg_cancel_backend(pg_backend_pid());
>> !  pg_cancel_backend
>> ! -------------------
>> !  t
>> ! (1 row)
>> !
>>
>> ======================================================================
>>
>> In fact, I see the same when I just do in psql (using EnterpriseDB's
>> PostgreSQL 11 for Windows):
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>>  pg_cancel_backend
>> -------------------
>>  t
>> (1 row)
>>
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>>  pg_cancel_backend
>> -------------------
>>  t
>> (1 row)
>>
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR:  canceling statement due to user request
>> postgres=#
>>
>> I couldn't reproduce it on Linux, though.
>> So if it's an expected behaviour, shouldn't the hs_standby_functions check
>> be fixed?
>> (I don't understand what is the point of this pg_cancel_backend call.)
>
>
> This is clearly a timing thing.
>
> The most common case is that the signal is sent and delivered while the pg_cancel_backend() command is still
> executed. This is probably "always" happening on Unix due to how signals work.
>
> On Windows, what happens in the case where it returns is that the signal is delivered to the "signal thread" (the
> separate thread handling our signal emulation), but that thread is not scheduled to run until the pg_cancel_backend()
> function has actually returned. Thus it returns the value and is then canceled.
>
> That said, I agree with the question -- what is the point of this? pg_cancel_backend(pg_backend_pid()) can surely
> only ever cancel the pg_cancel_backend call itself, so it seems pointless.
>
> The *comment* talks about suicide, which indicates that maybe the original intention was to use
> pg_terminate_backend()? But it has also been in there since 2009, so why is this problem only showing up now?
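
The race described in the quoted text can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not PostgreSQL code:
the names run_self_cancel, QueryCanceled, cancel_pending and may_run are
hypothetical, and a threading.Event stands in for the OS deciding when the
Windows "signal thread" gets to run. If that thread runs before the
statement reaches its interrupt check, the statement fails with the cancel
error; if it runs only afterwards, the statement has already produced its
row.

```python
import threading


class QueryCanceled(Exception):
    """Stands in for 'ERROR:  canceling statement due to user request'."""


def run_self_cancel(signal_thread_runs_early: bool) -> str:
    """Toy model of select pg_cancel_backend(pg_backend_pid()).

    All names here are hypothetical; this is not how the server is written.
    """
    cancel_pending = threading.Event()  # flag a signal handler would set
    may_run = threading.Event()         # models the scheduler releasing the signal thread

    def signal_thread() -> None:
        may_run.wait()        # not scheduled until the model says so
        cancel_pending.set()  # deliver the emulated cancel signal

    worker = threading.Thread(target=signal_thread)
    worker.start()

    try:
        # The function queues the signal and succeeds from its own point of view.
        result = True
        if signal_thread_runs_early:
            # Case 1: the signal is delivered while the statement is still running.
            may_run.set()
            worker.join()
        # Interrupt check before the statement finishes.
        if cancel_pending.is_set():
            raise QueryCanceled("canceling statement due to user request")
        outcome = "row: t" if result else "row: f"
    except QueryCanceled as exc:
        outcome = f"ERROR:  {exc}"
    finally:
        # Case 2: the signal thread only gets to run after the result exists.
        may_run.set()
        worker.join()
    return outcome


if __name__ == "__main__":
    for early in (True, False):
        print(early, "->", run_self_cancel(early))
```

In the second case the model simply drops the late cancel; in the real
server, per the explanation above, the value is returned and the cancel is
acted on afterwards.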

We saw a variant of this problem on appveyor (a Windows build-bot)
when testing Daniel's patch to add an optional message (search for
"timing"), and it was fixed as part of that patch, for the new code in
that patch:

https://www.postgresql.org/message-id/flat/C2C7C3EC-CC5F-44B6-9C78-637C88BD7D14@yesql.se

Perhaps other pre-existing tests need similar treatment?

--
Thomas Munro
http://www.enterprisedb.com

