Thread: A new function to wait for the backend exit after termination

A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

21 October 2020, 13:02:15

Hi,

Currently pg_terminate_backend(), sends SIGTERM to the backend process but doesn't ensure it's exit. There are chances that backends still are running(even after pg_terminate_backend() is called) until the interrupts are processed(using ProcessInterrupts()). This could cause problems especially in testing, for instance in a sql file right after pg_terminate_backend(), if any test case depends on the backend's non-existence[1], but the backend is not terminated. As discussed in [1], we have wait_pid()(see regress.c and sql/dblink.sql), but it's not usable across the system. In [1], we thought it would be better to have functions ensuring the backend's exit on the similar lines of pg_terminate_backend().

I propose to have two functions:

1. pg_terminate_backend_and_wait() -- which sends SIGTERM to the backend and wait's until it's exit.

2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully after pg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop.

Attaching a WIP patch herewith.

Thoughts?

[1] https://www.postgresql.org/message-id/flat/f31cc4da-a7ea-677f-cf64-a2f9db854bf5%40oss.nttdata.com

With Regards,

Bharath Rupireddy.

EnterpriseDB: http://www.enterprisedb.com

Attachment

v1-0001-pg_terminate_backend_and_wait.patch

Re: A new function to wait for the backend exit after termination

From

Magnus Hagander

Date:

21 October 2020, 13:13:36

On Wed, Oct 21, 2020 at 3:02 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

Currently pg_terminate_backend(), sends SIGTERM to the backend process but doesn't ensure it's exit. There are chances that backends still are running(even after pg_terminate_backend() is called) until the interrupts are processed(using ProcessInterrupts()). This could cause problems especially in testing, for instance in a sql file right after pg_terminate_backend(), if any test case depends on the backend's non-existence[1], but the backend is not terminated. As discussed in [1], we have wait_pid()(see regress.c and sql/dblink.sql), but it's not usable across the system. In [1], we thought it would be better to have functions ensuring the backend's exit on the similar lines of pg_terminate_backend().

I propose to have two functions:

1. pg_terminate_backend_and_wait() -- which sends SIGTERM to the backend and wait's until it's exit.

I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter which defaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout parameter?

2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully after pg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop.

It seems this one also very much would need a timeout value.

And surely we should show some sort of wait event when it's waiting.

Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Re: A new function to wait for the backend exit after termination

From

"David G. Johnston"

Date:

21 October 2020, 14:30:45

On Wed, Oct 21, 2020 at 6:13 AM Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Oct 21, 2020 at 3:02 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,

Currently pg_terminate_backend(), sends SIGTERM to the backend process but doesn't ensure it's exit. There are chances that backends still are running(even after pg_terminate_backend() is called) until the interrupts are processed(using ProcessInterrupts()). This could cause problems especially in testing, for instance in a sql file right after pg_terminate_backend(), if any test case depends on the backend's non-existence[1], but the backend is not terminated. As discussed in [1], we have wait_pid()(see regress.c and sql/dblink.sql), but it's not usable across the system. In [1], we thought it would be better to have functions ensuring the backend's exit on the similar lines of pg_terminate_backend().

I propose to have two functions:

1. pg_terminate_backend_and_wait() -- which sends SIGTERM to the backend and wait's until it's exit.

I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter which defaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout parameter?

Agreed on the overload, and the timeouts make sense too - with the caller deciding whether a timeout results in a failure or a false return value.

2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully after pg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop.

It seems this one also very much would need a timeout value.

Is there a requirement for waiting to be superuser only? You are not affecting any session but your own during the waiting period.

I could imagine, in theory at least, wanting to wait for a backend to go idle as well as for it disappearing. Scope creep in terms of this patch's goal but worth at least considering now.

David J.

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

22 October 2020, 02:20:01

Thanks for the feedback.

On Wed, Oct 21, 2020 at 6:43 PM Magnus Hagander <magnus@hagander.net> wrote:
>
>> Currently pg_terminate_backend(), sends SIGTERM to the backend process but doesn't ensure it's exit. There are
chancesthat backends still are running(even after pg_terminate_backend() is called) until the interrupts are
processed(usingProcessInterrupts()). This could cause problems especially in testing, for instance in a sql file right
afterpg_terminate_backend(), if any test case depends on the backend's non-existence[1], but the backend is not
terminated.As discussed in [1], we have wait_pid()(see regress.c and sql/dblink.sql), but it's not usable across the
system.In [1], we thought it would be better to have functions ensuring the backend's exit on the similar lines of
pg_terminate_backend().
>>
>> I propose to have two functions:
>>
>> 1. pg_terminate_backend_and_wait() -- which sends SIGTERM to the backend and wait's until it's exit.
>
> I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter
whichdefaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout parameter? 
>

+1 to have pg_terminate_backend(pid, wait=false, timeout), timeout in
milliseconds only valid if wait = true.

>
>> 2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully
afterpg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop. 
>
> It seems this one also very much would need a timeout value.
>
> And surely we should show some sort of wait event when it's waiting.
>

Yes for this function too we can have a timeout value.
pg_wait_backend(pid, timeout), timeout in milliseconds.

I think we can use WaitLatch with the given timeout and with a new
wait event type WAIT_EVENT_BACKEND_SHUTDOWN  instead of pg_usleep for
achieving the given timeout mechanism. With WaitLatch we would also
get the waiting event in stats. Thoughts?

            rc = WaitLatch(MyLatch,
                       WL_LATCH_SET | WL_POSTMASTER_DEATH, timeout,
                       WAIT_EVENT_BACKEND_SHUTDOWN);

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: A new function to wait for the backend exit after termination

From

Andres Freund

Date:

22 October 2020, 02:35:25

On 2020-10-21 15:13:36 +0200, Magnus Hagander wrote:
> It seems this one also very much would need a timeout value.

I'm not really against that, but I wonder if we just end up
reimplementing statement timeout...

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

22 October 2020, 02:46:18

Thanks for the feedback.

On Wed, Oct 21, 2020 at 8:01 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Wed, Oct 21, 2020 at 6:13 AM Magnus Hagander <magnus@hagander.net> wrote:
>>
>> I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter
whichdefaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout
parameter?
>
> Agreed on the overload, and the timeouts make sense too - with the caller deciding whether a timeout results in a
failureor a false return value.

>

If the backend is terminated within the user specified timeout then
the function returns true, otherwise false.

>
>>> 2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully
afterpg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop.

>>
>> It seems this one also very much would need a timeout value.
>
> Is there a requirement for waiting to be superuser only?  You are not affecting any session but your own during the
waitingperiod.

>

IIUC, in the same patch instead of returning an error in case of
non-superusers, do we need to wait for user provided timeout
milliseconds until the current user becomes superuser and then throw
error if still non-superuser, and proceed further if superuser?

Do we need to have a new function that waits until a current
non-superuser in a session becomes superuser?

Something else?

>
> I could imagine, in theory at least, wanting to wait for a backend to go idle as well as for it disappearing.  Scope
creepin terms of this patch's goal but worth at least considering now.

>

IIUC, do we need a new option, something like pg_wait_backend(pid,
timeout, waituntil) where "waituntil" if specified "idle" waits until
the given backend goes to idle mode, or "termination" waits until
termination?

If my understanding is wrong, could you please explain more?

Re: A new function to wait for the backend exit after termination

From

"David G. Johnston"

Date:

22 October 2020, 03:09:35

On Wednesday, October 21, 2020, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks for the feedback.

On Wed, Oct 21, 2020 at 8:01 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Wed, Oct 21, 2020 at 6:13 AM Magnus Hagander <magnus@hagander.net> wrote:
>>
>> I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter which defaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout parameter?
>
> Agreed on the overload, and the timeouts make sense too - with the caller deciding whether a timeout results in a failure or a false return value.
>

If the backend is terminated within the user specified timeout then
the function returns true, otherwise false.

I’m suggesting an option for the second case to fail instead of returning false.

>
>>> 2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully after pg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop.
>>
>> It seems this one also very much would need a timeout value.
>
> Is there a requirement for waiting to be superuser only? You are not affecting any session but your own during the waiting period.
>

IIUC, in the same patch instead of returning an error in case of
non-superusers, do we need to wait for user provided timeout
milliseconds until the current user becomes superuser and then throw
error if still non-superuser, and proceed further if superuser?

Do we need to have a new function that waits until a current
non-superuser in a session becomes superuser?

Something else?

Not sure how that would even be possible mid-statement. I was suggesting removing the superuser check altogether and letting any user execute “wait”.

>
> I could imagine, in theory at least, wanting to wait for a backend to go idle as well as for it disappearing. Scope creep in terms of this patch's goal but worth at least considering now.
>

IIUC, do we need a new option, something like pg_wait_backend(pid,
timeout, waituntil) where "waituntil" if specified "idle" waits until
the given backend goes to idle mode, or "termination" waits until
termination?

If my understanding is wrong, could you please explain more?

Yes, this describes what i was thinking.

David J.

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

22 October 2020, 04:12:33

On Thu, Oct 22, 2020 at 8:39 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
>> If the backend is terminated within the user specified timeout then
>> the function returns true, otherwise false.
>
> I’m suggesting an option for the second case to fail instead of returning false.
>

That seems fine.

>
>> >
>> > I could imagine, in theory at least, wanting to wait for a backend to go idle as well as for it disappearing.
Scopecreep in terms of this patch's goal but worth at least considering now. 
>> >
>>
>> IIUC, do we need a new option, something like pg_wait_backend(pid,
>> timeout, waituntil) where "waituntil" if specified "idle" waits until
>> the given backend goes to idle mode, or "termination" waits until
>> termination?
>>
>> If my understanding is wrong, could you please explain more?
>
>
> Yes, this describes what i was thinking.
>

+1.

I will implement these functionality and post a new patch soon.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

28 October 2020, 11:50:46

On Wed, Oct 21, 2020 at 6:43 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter
whichdefaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout parameter? 
>

Done.

>
>> 2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully
afterpg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop. 
>
> It seems this one also very much would need a timeout value.
>

Done.

>
> And surely we should show some sort of wait event when it's waiting.
>

Added two wait events.

>
>> If the backend is terminated within the user specified timeout then
>> the function returns true, otherwise false.
>
> I’m suggesting an option for the second case to fail instead of returning false.
>

Done.

>
> > I could imagine, in theory at least, wanting to wait for a backend to go idle as well as for it disappearing.
Scopecreep in terms of this patch's goal but worth at least considering now. 
>
> IIUC, do we need a new option, something like pg_wait_backend(pid,
> timeout, waituntil) where "waituntil" if specified "idle" waits until
> the given backend goes to idle mode, or "termination" waits until
> termination?
>

Done.

Attaching a v2 patch herewith.

Thoughts and feedback are welcome.

Below things are still pending, which I plan to work on soon:

1. More testing and addition of test cases into the regression test suite.
2. Addition of the new function information into the docs.


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v2-0001-pg_wait_backend-and-pg_terminate_backend-with-wai.patch

Re: A new function to wait for the backend exit after termination

From

Fujii Masao

Date:

28 October 2020, 13:11:42

On 2020/10/28 20:50, Bharath Rupireddy wrote:
> On Wed, Oct 21, 2020 at 6:43 PM Magnus Hagander <magnus@hagander.net> wrote:
>>
>> I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so a function with a second parameter
whichdefaults to the current behaviour of not waiting. And it might be a good idea to also give it a timeout
parameter?
>>
> 
> Done.
> 
>>
>>> 2. pg_wait_backend() -- which waits for a given backend process. Note that this function has to be used carefully
afterpg_terminate_backend(), if used on a backend that's not ternmited it simply keeps waiting in a loop.

>>
>> It seems this one also very much would need a timeout value.
>>
> 
> Done.
> 
>>
>> And surely we should show some sort of wait event when it's waiting.
>>
> 
> Added two wait events.
> 
>>
>>> If the backend is terminated within the user specified timeout then
>>> the function returns true, otherwise false.
>>
>> I’m suggesting an option for the second case to fail instead of returning false.
>>
> 
> Done.

I prefer that false is returned when the timeout happens,
like pg_promote() does.

> 
>>
>>> I could imagine, in theory at least, wanting to wait for a backend to go idle as well as for it disappearing.
Scopecreep in terms of this patch's goal but worth at least considering now.

>>
>> IIUC, do we need a new option, something like pg_wait_backend(pid,
>> timeout, waituntil) where "waituntil" if specified "idle" waits until
>> the given backend goes to idle mode, or "termination" waits until
>> termination?

Isn't this wait-for-idle mode fragile? Because there is no guarantee
that the backend is still in idle state when pg_wait_backend(idle) returns.

>>
> 
> Done.
> 
> Attaching a v2 patch herewith.
> 
> Thoughts and feedback are welcome.

Thanks for the patch!

When the specified timeout is negative, the following error is thrown *after*
SIGTERM is signaled to the target backend. This seems strange to me.
The timeout value should be verified at the beginning of the function, instead.

     ERROR:  timeout cannot be negative

pg_terminate_backend(xxx, false) failed with the following error. I think
it's more helpful if the function can work even without the timeout value.
That is, what about redefining the function in src/backend/catalog/system_views.sql
and specifying the DEFAULT values for the arguments "wait" and "timeout"?
The similar function "pg_promote" would be good reference to you.

     ERROR:  function pg_terminate_backend(integer, boolean) does not exist at character 8

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

28 October 2020, 13:49:55

Thanks for the comments.

On Wed, Oct 28, 2020 at 6:41 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> I prefer that false is returned when the timeout happens,
> like pg_promote() does.
>

Earlier it was suggested to error out on timeout. Since users can not
guess on time it takes to terminate or become idle, throwing error
seems to be odd on timeout. And also in case if the given pid is not a
backend pid, we are throwing a warning and returning false but not
error. Similarly we can return false on timeout, if required a
warning. Thoughts?

>
> >> IIUC, do we need a new option, something like pg_wait_backend(pid,
> >> timeout, waituntil) where "waituntil" if specified "idle" waits until
> >> the given backend goes to idle mode, or "termination" waits until
> >> termination?
>
> Isn't this wait-for-idle mode fragile? Because there is no guarantee
> that the backend is still in idle state when pg_wait_backend(idle) returns.
>

Yeah this can happen. By the time pg_wait_backend returns we could
have the idle state of the backend changed. Looks like this is also a
problem with the existing pgstat_get_backend_current_activity()
function. There we have a comment saying below and the function
returns a pointer to the current activity string. Maybe we could have
similar comments about the usage in the document?

 *    It is the caller's responsibility to invoke this only for backends whose
 *    state is expected to remain stable while the result is in use.

Does this problem exist even if we use  pg_stat_activity()?

>
> When the specified timeout is negative, the following error is thrown *after*
> SIGTERM is signaled to the target backend. This seems strange to me.
> The timeout value should be verified at the beginning of the function, instead.
>
>      ERROR:  timeout cannot be negative
>

Okay. I will change that.

>
> pg_terminate_backend(xxx, false) failed with the following error. I think
> it's more helpful if the function can work even without the timeout value.
> That is, what about redefining the function in src/backend/catalog/system_views.sql
> and specifying the DEFAULT values for the arguments "wait" and "timeout"?
> The similar function "pg_promote" would be good reference to you.
>
>      ERROR:  function pg_terminate_backend(integer, boolean) does not exist at character 8
>

Yeah. This seems good. I will have false as default value for the wait
parameter. I have defined the timeout to be in milliseconds, then how
about having a default value of 100 milliseconds?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: A new function to wait for the backend exit after termination

From

"David G. Johnston"

Date:

28 October 2020, 14:21:09

On Wed, Oct 28, 2020 at 6:50 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks for the comments.

On Wed, Oct 28, 2020 at 6:41 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> I prefer that false is returned when the timeout happens,
> like pg_promote() does.
>

Earlier it was suggested to error out on timeout.

For consideration. I'll give a point for being consistent with other existing functions, and it wouldn't be hard to extend should we want to add the option later, so while the more flexible API seems better on its face limiting ourselves to boolean false isn't a big deal to me; especially as I've yet to write code that would make use of this feature.

Since users can not
guess on time it takes to terminate or become idle, throwing error
seems to be odd on timeout.

I don't see how the one follows from the other.

And also in case if the given pid is not a
backend pid, we are throwing a warning and returning false but not
error.

Similarly we can return false on timeout, if required a
warning. Thoughts?

IMO, if there are multiple ways to return false then all of them should emit a notice or warning describing which of the false conditions was hit.

>
> >> IIUC, do we need a new option, something like pg_wait_backend(pid,
> >> timeout, waituntil) where "waituntil" if specified "idle" waits until
> >> the given backend goes to idle mode, or "termination" waits until
> >> termination?
>
> Isn't this wait-for-idle mode fragile? Because there is no guarantee
> that the backend is still in idle state when pg_wait_backend(idle) returns.
>

I was thinking this would be useful for orchestration. However, as you say, its a pretty fragile method. I withdraw the suggestion. What I would replace it with is a pg_wait_for_notify(payload_test) function that allows an SQL user to plug itself into the listen/notify feature and pause the session until a notification arrives. The session it is coordinating with would simply notify just before ending its script/transaction.

David J.

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

29 October 2020, 05:13:57

On Wed, Oct 28, 2020 at 7:51 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
>> And also in case if the given pid is not a
>> backend pid, we are throwing a warning and returning false but not
>> error.
>>
>> Similarly we can return false on timeout, if required a
>> warning. Thoughts?
>
> IMO, if there are multiple ways to return false then all of them should emit a notice or warning describing which of
thefalse conditions was hit. 
>

Currently there are two possibilities in pg_teriminate_backend where a
warning is thrown and false is returned. 1. when the process with a
given pid is not a backend 2. when we can not send the SIGTERM to the
given backend.

I will add another case to throw the warning and return false when
timeout occurs.

>>
>> > >> IIUC, do we need a new option, something like pg_wait_backend(pid,
>> > >> timeout, waituntil) where "waituntil" if specified "idle" waits until
>> > >> the given backend goes to idle mode, or "termination" waits until
>> > >> termination?
>> >
>> > Isn't this wait-for-idle mode fragile? Because there is no guarantee
>> > that the backend is still in idle state when pg_wait_backend(idle) returns.
>>
> I was thinking this would be useful for orchestration.  However, as you say, its a pretty fragile method.  I withdraw
thesuggestion. 
>

So, pg_wait_backend(pid, timeout) waits until the backend with a given
pid is terminated?

>
>What I would replace it with is a pg_wait_for_notify(payload_test) function that allows an SQL user to plug itself
intothe listen/notify feature and pause the session until a notification arrives.  The session it is coordinating with
would>simply notify just before ending its script/transaction. 
>

Why does one session need to listen and wait until another session
notifies? If my understanding is wrong, could you please elaborate on
the above point, the usage and the use case?

>
>For consideration.  I'll give a point for being consistent with other existing functions, and it wouldn't be hard to
extendshould we want to add the option later, so while the more flexible API seems better on its face limiting
ourselvesto >boolean false isn't a big deal to me; especially as I've yet to write code that would make use of this
feature.
>

I see that this pg_wait_backend(pid, timeout) functionality can be
right away used in two places, one in dblink.sql where wait_pid is
being used, second in postgres_fdw.sql where
terminate_backend_and_wait() is being used. However we can make these
changes as part of another patch set after the proposed two new
functions are finalized and reviewed.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: A new function to wait for the backend exit after termination

From

"David G. Johnston"

Date:

29 October 2020, 05:21:00

On Wed, Oct 28, 2020 at 10:14 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Oct 28, 2020 at 7:51 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

> I was thinking this would be useful for orchestration. However, as you say, its a pretty fragile method. I withdraw the suggestion.
>

So, pg_wait_backend(pid, timeout) waits until the backend with a given
pid is terminated?

Yes. The original proposal.

>
>What I would replace it with is a pg_wait_for_notify(payload_test) function that allows an SQL user to plug itself into the listen/notify feature and pause the session until a notification arrives. The session it is coordinating with would >simply notify just before ending its script/transaction.
>

Why does one session need to listen and wait until another session
notifies? If my understanding is wrong, could you please elaborate on
the above point, the usage and the use case?

Theory, but I imagine writing an isolation test like test script where the two sessions wait for notifications instead of sleep for random amounts of time.

More generally, psql is very powerful but doesn't allow scripting to plug into pub/sub. I don't have a concrete use case for why it should but the capability doesn't seem far-fetched.

I'm not saying this is something that is needed, rather it would seem more useful than wait_for_idle.

David J.

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

31 October 2020, 10:58:18

On Wed, Oct 28, 2020 at 6:41 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> I prefer that false is returned when the timeout happens,
> like pg_promote() does.
>

Done.

>
> When the specified timeout is negative, the following error is thrown *after*
> SIGTERM is signaled to the target backend. This seems strange to me.
> The timeout value should be verified at the beginning of the function, instead.
>
>      ERROR:  timeout cannot be negative
>

I'm not throwing error for this case, instead a warning and returning
false. This is to keep it consistent with other cases such as the
given pid is not a backend pid.

Attaching the v3 patch. I tried to address the review comments
received so far and added documentation. I tested the patch locally
here. I saw that we don't have any test cases for existing
pg_terminate_backend(), do we need to add test cases into regression
suites for these two new functions?

Please review the v3 patch and let me know comments.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v3-0001-pg_terminate_backend-with-wait-timeout-and-pg_wai.patch

Re: A new function to wait for the backend exit after termination

From

Muhammad Usama

Date:

30 November 2020, 14:39:31

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

I have tested the patch against current master branch (commit:6742e14959a3033d946ab3d67f5ce4c99367d332)
Both functions work without a problem and as expected.
Just a tiny comment/suggestion.
specifying a -ve timeout in pg_terminate_backed rightly throws an error, 
I am not sure if it would be right or a wrong approach but I guess we can ignore -ve
timeout in pg_terminate_backend function when wait (second argument) is false.

e.g.  pg_terminate_backend(12320, false,-1); -- ignore -1 timout since wait is false

The new status of this patch is: Ready for Committer

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

02 December 2020, 08:30:42

On Mon, Nov 30, 2020 at 8:10 PM Muhammad Usama <m.usama@gmail.com> wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:       tested, passed
> Spec compliant:           tested, passed
> Documentation:            not tested
>
> I have tested the patch against current master branch (commit:6742e14959a3033d946ab3d67f5ce4c99367d332)
> Both functions work without a problem and as expected.
>

Thanks!

>
> Just a tiny comment/suggestion.
> specifying a -ve timeout in pg_terminate_backed rightly throws an error,
> I am not sure if it would be right or a wrong approach but I guess we can ignore -ve
> timeout in pg_terminate_backend function when wait (second argument) is false.
>
> e.g.  pg_terminate_backend(12320, false,-1); -- ignore -1 timout since wait is false
>

IMO, that's not a good idea. I see it this way, for any function first
the input args have to be validated. If okay, then follows the use of
those args and the main functionality. I can also see pg_promote(),
which first does the input timeout validation throwing error if it is
<= 0.

We can retain the existing behaviour.

>
> The new status of this patch is: Ready for Committer
>

Thanks!

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: A new function to wait for the backend exit after termination

From

Muhammad Usama

Date:

02 December 2020, 13:46:43

On Wed, Dec 2, 2020 at 1:30 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Nov 30, 2020 at 8:10 PM Muhammad Usama <m.usama@gmail.com> wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world: tested, passed
> Implements feature: tested, passed
> Spec compliant: tested, passed
> Documentation: not tested
>
> I have tested the patch against current master branch (commit:6742e14959a3033d946ab3d67f5ce4c99367d332)
> Both functions work without a problem and as expected.
>

Thanks!

>
> Just a tiny comment/suggestion.
> specifying a -ve timeout in pg_terminate_backed rightly throws an error,
> I am not sure if it would be right or a wrong approach but I guess we can ignore -ve
> timeout in pg_terminate_backend function when wait (second argument) is false.
>
> e.g. pg_terminate_backend(12320, false,-1); -- ignore -1 timout since wait is false
>

IMO, that's not a good idea. I see it this way, for any function first
the input args have to be validated. If okay, then follows the use of
those args and the main functionality. I can also see pg_promote(),
which first does the input timeout validation throwing error if it is
<= 0.

We can retain the existing behaviour.

Agreed!

Thanks

Best regards

Muhammad Usama

>
> The new status of this patch is: Ready for Committer
>

Thanks!

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

RE: A new function to wait for the backend exit after termination

From

"Hou, Zhijie"

Date:

03 December 2020, 01:54:33

Hi

I take a look into the patch, and here some comments.

1.
+
+    ereport(WARNING,
+            (errmsg("could not wait for the termination of the backend with PID %d within %ld milliseconds",
+                    pid, timeout)));
+

The code use %ld to print int64 type.
How about use INT64_FORMAT, which looks more appropriate. 

2.
+    if (timeout <= 0)
+    {
+        ereport(WARNING,
+                (errmsg("timeout cannot be negative or zero: %ld", timeout)));
+        PG_RETURN_BOOL(r);
+    }

The same as 1.

3.
+pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
+{
+    int     pid = PG_GETARG_DATUM(0);

+pg_wait_backend(PG_FUNCTION_ARGS)
+{
+    int        pid = PG_GETARG_INT32(0);

The code use different macro to get pid,
How about use PG_GETARG_INT32(0) for each one.


I changed the status to 'wait on anthor'.
The others of the patch LGTM, 
I think it can be changed to Ready for Committer again, when this comment is confirmed.


Best regards,
houzj

Re: A new function to wait for the backend exit after termination

From

Tom Lane

Date:

03 December 2020, 02:01:34

"Hou, Zhijie" <houzj.fnst@cn.fujitsu.com> writes:
> +    ereport(WARNING,
> +            (errmsg("could not wait for the termination of the backend with PID %d within %ld milliseconds",
> +                    pid, timeout)));

> The code use %ld to print int64 type.
> How about use INT64_FORMAT, which looks more appropriate.

This is a translatable message, so INT64_FORMAT is no good -- we need
something that is the same across platforms.  The current project standard
for this problem is to use "%lld" and explicitly cast the argument to long
long int to match that.

            regards, tom lane

RE: A new function to wait for the backend exit after termination

From

"Hou, Zhijie"

Date:

03 December 2020, 02:21:51

> > +    ereport(WARNING,
> > +            (errmsg("could not wait for the termination of the
> backend with PID %d within %ld milliseconds",
> > +                    pid, timeout)));
>
> > The code use %ld to print int64 type.
> > How about use INT64_FORMAT, which looks more appropriate.
>
> This is a translatable message, so INT64_FORMAT is no good -- we need
> something that is the same across platforms.  The current project standard
> for this problem is to use "%lld" and explicitly cast the argument to long
> long int to match that.

Thank you for pointing out that,
And sorry for did not think of it.

Yes, we can use %lld, (long long int) timeout.

Best regards,
houzj

RE: A new function to wait for the backend exit after termination

From

"Hou, Zhijie"

Date:

03 December 2020, 03:32:46

> 
> I changed the status to 'wait on anthor'.
> The others of the patch LGTM,
> I think it can be changed to Ready for Committer again, when this comment
> is confirmed.
> 

I am Sorry I forgot a possible typo comment.

+{ oid => '16386', descr => 'terminate a backend process and wait for it\'s exit or until timeout occurs'

Does the following change looks better?

Wait for it\'s exit => Wait for its exit

Best regards,
houzj

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

03 December 2020, 03:56:08

Thanks for the review.

On Thu, Dec 3, 2020 at 7:24 AM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> 1.
> +
> +       ereport(WARNING,
> +                       (errmsg("could not wait for the termination of the backend with PID %d within %ld
milliseconds",
> +                                       pid, timeout)));
> +
>
> The code use %ld to print int64 type.
> How about use INT64_FORMAT, which looks more appropriate.
>

Changed it to use %lld and typecasting timeout to (long long int) as
suggested by Tom.

>
> 2.
> +       if (timeout <= 0)
> +       {
> +               ereport(WARNING,
> +                               (errmsg("timeout cannot be negative or zero: %ld", timeout)));
> +               PG_RETURN_BOOL(r);
> +       }
>
> The same as 1.
>

Changed.

>
> 3.
> +pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
> +{
> +       int     pid = PG_GETARG_DATUM(0);
>
> +pg_wait_backend(PG_FUNCTION_ARGS)
> +{
> +       int             pid = PG_GETARG_INT32(0);
>
> The code use different macro to get pid,
> How about use PG_GETARG_INT32(0) for each one.
>

Changed.

> I am Sorry I forgot a possible typo comment.
>
> +{ oid => '16386', descr => 'terminate a backend process and wait for it\'s exit or until timeout occurs'
>
> Does the following change looks better?
>
> Wait for it\'s exit => Wait for its exit
>

Changed.

>
> I changed the status to 'wait on anthor'.
> The others of the patch LGTM,
> I think it can be changed to Ready for Committer again, when this comment is confirmed.
>

Attaching v4 patch. Please have a look.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v4-0001-pg_terminate_backend-with-wait-timeout-and-pg_wai.patch

RE: A new function to wait for the backend exit after termination

From

"Hou, Zhijie"

Date:

04 December 2020, 03:13:42

Hi,

-        however only superusers can terminate superuser backends.
+        however only superusers can terminate superuser backends. When no
+        <parameter>wait</parameter> and <parameter>timeout</parameter> are
+        provided, only SIGTERM is sent to the backend with the given process
+        ID and <literal>false</literal> is returned immediately. But the

I test the case when no wait and timeout are provided.
True is returned as the following which seems different from the doc.

postgres=# select pg_terminate_backend(pid);
 pg_terminate_backend 
----------------------
 t
(1 row)

Best regards,
houzj

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

04 December 2020, 06:29:51

On Fri, Dec 4, 2020 at 8:44 AM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> - however only superusers can terminate superuser backends.
> + however only superusers can terminate superuser backends. When no
> + <parameter>wait</parameter> and <parameter>timeout</parameter> are
> + provided, only SIGTERM is sent to the backend with the given process
> + ID and <literal>false</literal> is returned immediately. But the
>
> I test the case when no wait and timeout are provided.
> True is returned as the following which seems different from the doc.
>
> postgres=# select pg_terminate_backend(pid);
> pg_terminate_backend
> ----------------------
> t
> (1 row)
>

Thanks for pointing that out. I reworded that statement. Attaching v5 patch. Please have a look.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v5-0001-pg_terminate_backend-with-wait-timeout-and-pg_wai.patch

RE: A new function to wait for the backend exit after termination

From

"Hou, Zhijie"

Date:

04 December 2020, 08:31:40

Hi,

When test pg_terminate_backend_and_wait with parallel query.
I noticed that the function is not defined as parallel safe.

I am not very familiar with the standard about whether a function should be parallel safe.
But I found the following function are all defined as parallel safe:

pg_promote
pg_terminate_backend(integer)
pg_sleep*

Is there a reason why pg_terminate_backend_and_wait are not parallel safe ?
(I'm sorry if I miss something in previous mails.)

Best regards,
houzj

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

04 December 2020, 09:13:28

On Fri, Dec 4, 2020 at 2:02 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> Hi,
>
> When test pg_terminate_backend_and_wait with parallel query.
> I noticed that the function is not defined as parallel safe.
>
> I am not very familiar with the standard about whether a function should be parallel safe.
> But I found the following function are all defined as parallel safe:
>
> pg_promote
> pg_terminate_backend(integer)
> pg_sleep*
>
> Is there a reason why pg_terminate_backend_and_wait are not parallel safe ?
> (I'm sorry if I miss something in previous mails.)
>

I'm not quite sure of a use case where existing pg_terminate_backend()
or for that matter the new pg_terminate_backend_and_wait() and
pg_wait_backend() will ever get used from parallel workers. Having
said that, I marked the new functions as parallel safe to keep it the
way it is with existing pg_terminate_backend().

postgres=# select proparallel, proname, prosrc from pg_proc where
proname IN ('pg_wait_backend', 'pg_terminate_backend');
 proparallel |       proname        |            prosrc
-------------+----------------------+-------------------------------
 s           | pg_terminate_backend | pg_terminate_backend
 s           | pg_wait_backend      | pg_wait_backend
 s           | pg_terminate_backend | pg_terminate_backend_and_wait
(3 rows)

Attaching v6 patch. Please have a look.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v6-0001-pg_terminate_backend-with-wait-timeout-and-pg_wai.patch

Re: A new function to wait for the backend exit after termination

From

hou zhijie

Date:

04 December 2020, 09:29:01

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

Thanks for the new patch, the patch LGTM and works as expected

The new status of this patch is: Ready for Committer

Re: A new function to wait for the backend exit after termination

From

Fujii Masao

Date:

18 March 2021, 07:16:54


On 2021/03/17 11:58, Kyotaro Horiguchi wrote:
> The first suggested signature for pg_terminate_backend() with timeout
> was pg_terminate_backend(pid, timeout).  The current signature (pid,
> wait?, timeout) looks redundant.  Maybe the reason for rejecting 0
> astimeout is pg_terminate_backend(pid, true, 0) looks odd but it we
> can wait forever in that case (as other features does).

I'm afraid that "waiting forever" can cause something like deadlock situation,
as follows. We have no mechanism to detect this for now.

1. backend 1 took the lock on the relation A.
2. backend 2 took the lock on the relation B.
3. backend 1 tries to take the lock on the relation B and is waiting for
    the lock to be released.
4. backend 2 accidentally executes pg_wait_for_backend_termination() with
    the pid of backend 1, and then is waiting for backend 1 to be terminated.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

18 March 2021, 07:41:37

On Thu, Mar 18, 2021 at 12:46 PM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
> On 2021/03/17 11:58, Kyotaro Horiguchi wrote:
> > The first suggested signature for pg_terminate_backend() with timeout
> > was pg_terminate_backend(pid, timeout).  The current signature (pid,
> > wait?, timeout) looks redundant.  Maybe the reason for rejecting 0
> > astimeout is pg_terminate_backend(pid, true, 0) looks odd but it we
> > can wait forever in that case (as other features does).
>
> I'm afraid that "waiting forever" can cause something like deadlock situation,
> as follows. We have no mechanism to detect this for now.
>
> 1. backend 1 took the lock on the relation A.
> 2. backend 2 took the lock on the relation B.
> 3. backend 1 tries to take the lock on the relation B and is waiting for
>     the lock to be released.
> 4. backend 2 accidentally executes pg_wait_for_backend_termination() with
>     the pid of backend 1, and then is waiting for backend 1 to be terminated.

Yeah this can happen.

So, as stated upthread, how about a timeout 0 (which is default)
telling "don't wait", erroring out on negative value and when
specified a positive milliseconds value, then wait for that amount of
time. With this semantics, we can remove the wait flag for
pg_terminate_backend(pid, 0). Thoughts?

And for pg_wait_for_backend_termination timeout 0 or negative, we
error out. Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

19 March 2021, 06:07:26

On Thu, Mar 18, 2021 at 1:11 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Mar 18, 2021 at 12:46 PM Fujii Masao
> <masao.fujii@oss.nttdata.com> wrote:
> > On 2021/03/17 11:58, Kyotaro Horiguchi wrote:
> > > The first suggested signature for pg_terminate_backend() with timeout
> > > was pg_terminate_backend(pid, timeout).  The current signature (pid,
> > > wait?, timeout) looks redundant.  Maybe the reason for rejecting 0
> > > astimeout is pg_terminate_backend(pid, true, 0) looks odd but it we
> > > can wait forever in that case (as other features does).
> >
> > I'm afraid that "waiting forever" can cause something like deadlock situation,
> > as follows. We have no mechanism to detect this for now.
> >
> > 1. backend 1 took the lock on the relation A.
> > 2. backend 2 took the lock on the relation B.
> > 3. backend 1 tries to take the lock on the relation B and is waiting for
> >     the lock to be released.
> > 4. backend 2 accidentally executes pg_wait_for_backend_termination() with
> >     the pid of backend 1, and then is waiting for backend 1 to be terminated.
>
> Yeah this can happen.
>
> So, as stated upthread, how about a timeout 0 (which is default)
> telling "don't wait", erroring out on negative value and when
> specified a positive milliseconds value, then wait for that amount of
> time. With this semantics, we can remove the wait flag for
> pg_terminate_backend(pid, 0). Thoughts?
>
> And for pg_wait_for_backend_termination timeout 0 or negative, we
> error out. Thoughts?

Attaching v11 patch that removed the wait boolean flag in the
pg_terminate_backend and timeout 0 indicates "no wait", negative value
"errors out", positive value "waits for those many milliseconds". Also
addressed other review comments that I received upthread. Please
review v11 further.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v11-0001-pg_terminate_backend-with-wait-and-timeout.patch

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

05 April 2021, 03:21:03

On Fri, Mar 19, 2021 at 11:37 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> Attaching v11 patch that removed the wait boolean flag in the
> pg_terminate_backend and timeout 0 indicates "no wait", negative value
> "errors out", positive value "waits for those many milliseconds". Also
> addressed other review comments that I received upthread. Please
> review v11 further.

Attaching v12 patch after rebasing onto the latest master.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

v12-0001-pg_terminate_backend-with-wait-and-timeout.patch

Re: A new function to wait for the backend exit after termination

From

Magnus Hagander

Date:

08 April 2021, 09:41:17

On Mon, Apr 5, 2021 at 5:21 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Mar 19, 2021 at 11:37 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > Attaching v11 patch that removed the wait boolean flag in the
> > pg_terminate_backend and timeout 0 indicates "no wait", negative value
> > "errors out", positive value "waits for those many milliseconds". Also
> > addressed other review comments that I received upthread. Please
> > review v11 further.
>
> Attaching v12 patch after rebasing onto the latest master.

I've applied this patch with some minor changes.

I rewrote some parts of the documentation to make it more focused on
the end user rather than the implementation. I also made a small
simplification in pg_terminate_backend() which removes the "wait"
variable (seems like a bit of a leftover since the time when it was a
separate argument). And picked a correct oid for the function (oids
8000-9999 should be used for new patches, 16386 is in the user area of
oids)

Thanks!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: A new function to wait for the backend exit after termination

From

Noah Misch

Date:

01 June 2021, 03:48:58

On Thu, Apr 08, 2021 at 11:41:17AM +0200, Magnus Hagander wrote:
> I've applied this patch with some minor changes.

I wondered if the new pg_wait_for_backend_termination() could replace
regress.c:wait_pid().  I think it can't, because the new function requires the
backend to still be present in the procarray:

    proc = BackendPidGetProc(pid);

    if (proc == NULL)
    {
        ereport(WARNING,
                (errmsg("PID %d is not a PostgreSQL server process", pid)));

        PG_RETURN_BOOL(false);
    }

    PG_RETURN_BOOL(pg_wait_until_termination(pid, timeout));

If a backend has left the procarray but not yet left the kernel process table,
this function will issue the warning and not wait for the final exit.  Given
that limitation, is pg_wait_for_backend_termination() useful enough?  If
waiting for procarray departure is enough, should pg_wait_until_termination()
check BackendPidGetProc(pid) instead of kill(0, pid), so it can return
earlier?  I can see the value of adding the pg_terminate_backend() timeout
argument, in any case.

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

01 June 2021, 07:55:24

On Tue, Jun 1, 2021 at 9:19 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Thu, Apr 08, 2021 at 11:41:17AM +0200, Magnus Hagander wrote:
> > I've applied this patch with some minor changes.

Thanks for taking a look at this function.

> I wondered if the new pg_wait_for_backend_termination() could replace
> regress.c:wait_pid().

I was earlier thinking of replacing the wait_pid() with the new
function but arrived at a similar conclusion as yours.

> I think it can't, because the new function requires the
> backend to still be present in the procarray:
>
>         proc = BackendPidGetProc(pid);
>
>         if (proc == NULL)
>         {
>                 ereport(WARNING,
>                                 (errmsg("PID %d is not a PostgreSQL server process", pid)));
>
>                 PG_RETURN_BOOL(false);
>         }
>
>         PG_RETURN_BOOL(pg_wait_until_termination(pid, timeout));
>
> If a backend has left the procarray but not yet left the kernel process table,
> this function will issue the warning and not wait for the final exit.

Yes, if the backend is not in procarray but still in the kernel
process table, it emits a warning "PID %d is not a PostgreSQL server
process" and returns false.

> Given that limitation, is pg_wait_for_backend_termination() useful enough?  If
> waiting for procarray departure is enough, should pg_wait_until_termination()
> check BackendPidGetProc(pid) instead of kill(0, pid), so it can return
> earlier?

We can just remove BackendPidGetProc(pid) in
pg_wait_for_backend_termination. With this change, we can get rid of
the wait_pid() from regress.c. But, my concern is that the
pg_wait_for_backend_termination() can also check non-postgres server
process pid. Is this okay? In that case, this function becomes a
generic(OS level function) rather than a postgres server specific
function. I'm not sure if all agree to that. Thoughts?

> I can see the value of adding the pg_terminate_backend() timeout
> argument, in any case.

True. We can leave pg_terminate_backend() as is.

With Regards,
Bharath Rupireddy.

Re: A new function to wait for the backend exit after termination

From

Noah Misch

Date:

05 June 2021, 01:32:36

On Tue, Jun 01, 2021 at 01:25:24PM +0530, Bharath Rupireddy wrote:
> On Tue, Jun 1, 2021 at 9:19 AM Noah Misch <noah@leadboat.com> wrote:
> > Given that limitation, is pg_wait_for_backend_termination() useful enough?  If
> > waiting for procarray departure is enough, should pg_wait_until_termination()
> > check BackendPidGetProc(pid) instead of kill(0, pid), so it can return
> > earlier?
> 
> We can just remove BackendPidGetProc(pid) in
> pg_wait_for_backend_termination. With this change, we can get rid of
> the wait_pid() from regress.c. But, my concern is that the
> pg_wait_for_backend_termination() can also check non-postgres server
> process pid. Is this okay?

It may or may not be okay.  I would not feel good about it.

> In that case, this function becomes a
> generic(OS level function) rather than a postgres server specific
> function. I'm not sure if all agree to that. Thoughts?

My preference is to remove pg_wait_for_backend_termination().  The use case
that prompted this thread used pg_terminate_backend(pid, 180000); it doesn't
need pg_wait_for_backend_termination().

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

05 June 2021, 06:36:46

On Sat, Jun 5, 2021 at 7:02 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Jun 01, 2021 at 01:25:24PM +0530, Bharath Rupireddy wrote:
> > On Tue, Jun 1, 2021 at 9:19 AM Noah Misch <noah@leadboat.com> wrote:
> > > Given that limitation, is pg_wait_for_backend_termination() useful enough?  If
> > > waiting for procarray departure is enough, should pg_wait_until_termination()
> > > check BackendPidGetProc(pid) instead of kill(0, pid), so it can return
> > > earlier?
> >
> > We can just remove BackendPidGetProc(pid) in
> > pg_wait_for_backend_termination. With this change, we can get rid of
> > the wait_pid() from regress.c. But, my concern is that the
> > pg_wait_for_backend_termination() can also check non-postgres server
> > process pid. Is this okay?
>
> It may or may not be okay.  I would not feel good about it.
>
> > In that case, this function becomes a
> > generic(OS level function) rather than a postgres server specific
> > function. I'm not sure if all agree to that. Thoughts?
>
> My preference is to remove pg_wait_for_backend_termination().  The use case
> that prompted this thread used pg_terminate_backend(pid, 180000); it doesn't
> need pg_wait_for_backend_termination().

I was earlier thinking that the function
pg_wait_for_backend_termination() will be useful:
1) If the user wants to pg_terminate_backend(<<pid>>); and
pg_wait_for_backend_termination(<<pid>>, <<timeout>>); separately. It
seems like the proc array entry will be removed as part of SITERM
processing (see [1]) and the BackendPidGetProc will return NULL. So,
it's not useful here.
2) If the user wants to pg_wait_for_backend_termination(<<pid>>,
<<timeout>>);, thinking that some event might cause the backend to be
terminated within the <<timeout>>. So, it's still useful here.

[1]
(gdb) bt
#0  ProcArrayRemove (proc=0x55b27f26356c
<CleanupInvalidationState+278>, latestXid=32764)
    at procarray.c:526
#1  0x000055b27f281c9d in RemoveProcFromArray (code=1, arg=0) at proc.c:812
#2  0x000055b27f2542ce in shmem_exit (code=1) at ipc.c:272
#3  0x000055b27f2540d5 in proc_exit_prepare (code=1) at ipc.c:194
#4  0x000055b27f254022 in proc_exit (code=1) at ipc.c:107
#5  0x000055b27f449479 in errfinish (filename=0x55b27f61cd65
"postgres.c", lineno=3191,
    funcname=0x55b27f61e770 <__func__.40727> "ProcessInterrupts") at elog.c:666
#6  0x000055b27f29097e in ProcessInterrupts () at postgres.c:3191
#7  0x000055b27f28cbf0 in ProcessClientReadInterrupt (blocked=true) at
postgres.c:499

With Regards,
Bharath Rupireddy.

Re: A new function to wait for the backend exit after termination

From

Noah Misch

Date:

05 June 2021, 19:08:01

On Sat, Jun 05, 2021 at 12:06:46PM +0530, Bharath Rupireddy wrote:
> On Sat, Jun 5, 2021 at 7:02 AM Noah Misch <noah@leadboat.com> wrote:
> > On Tue, Jun 01, 2021 at 01:25:24PM +0530, Bharath Rupireddy wrote:
> > > On Tue, Jun 1, 2021 at 9:19 AM Noah Misch <noah@leadboat.com> wrote:
> > > > Given that limitation, is pg_wait_for_backend_termination() useful enough?  If
> > > > waiting for procarray departure is enough, should pg_wait_until_termination()
> > > > check BackendPidGetProc(pid) instead of kill(0, pid), so it can return
> > > > earlier?
> > >
> > > We can just remove BackendPidGetProc(pid) in
> > > pg_wait_for_backend_termination. With this change, we can get rid of
> > > the wait_pid() from regress.c. But, my concern is that the
> > > pg_wait_for_backend_termination() can also check non-postgres server
> > > process pid. Is this okay?
> >
> > It may or may not be okay.  I would not feel good about it.
> >
> > > In that case, this function becomes a
> > > generic(OS level function) rather than a postgres server specific
> > > function. I'm not sure if all agree to that. Thoughts?
> >
> > My preference is to remove pg_wait_for_backend_termination().  The use case
> > that prompted this thread used pg_terminate_backend(pid, 180000); it doesn't
> > need pg_wait_for_backend_termination().
> 
> I was earlier thinking that the function
> pg_wait_for_backend_termination() will be useful:
> 1) If the user wants to pg_terminate_backend(<<pid>>); and
> pg_wait_for_backend_termination(<<pid>>, <<timeout>>); separately. It
> seems like the proc array entry will be removed as part of SITERM
> processing (see [1]) and the BackendPidGetProc will return NULL. So,
> it's not useful here.
> 2) If the user wants to pg_wait_for_backend_termination(<<pid>>,
> <<timeout>>);, thinking that some event might cause the backend to be
> terminated within the <<timeout>>. So, it's still useful here.

That is factual.  That pg_wait_for_backend_termination() appears to be useful
for (1) but isn't useful for (1) reduces its value.  I think it reduces the
value slightly below zero.  Relevant to that, if a user doesn't care about the
distinction between "backend has left the procarray" and "backend's PID has
left the kernel process table", that user can poll pg_stat_activity to achieve
the same level of certainty that pg_wait_for_backend_termination() offers.

Re: A new function to wait for the backend exit after termination

From

Bharath Rupireddy

Date:

14 June 2021, 14:04:59

On Sat, Jun 12, 2021 at 10:07 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Fri, Jun 11, 2021 at 08:54:08PM -0500, Justin Pryzby wrote:
> > On Sat, Jun 05, 2021 at 12:08:01PM -0700, Noah Misch wrote:
> > > > > My preference is to remove pg_wait_for_backend_termination().  The use case
> > > > > that prompted this thread used pg_terminate_backend(pid, 180000); it doesn't
> > > > > need pg_wait_for_backend_termination().
> >
> > Is this an Opened Issue ?
>
> An Open Item?  Not really, since there's no objective defect.  Nonetheless,
> the attached is what I'd like to use.

Thanks. +1 to remove the pg_wait_for_backend_termination function. The
patch basically looks good to me. I'm attaching an updated patch. I
corrected a minor typo in the commit message, took docs and code
comment changes suggested by Justin Pryzby.

Please note that release notes still have the headline "Add functions
to wait for backend termination" of the original commit that added the
pg_wait_for_backend_termination. With the removal of it, I'm not quite
sure if we retain the the commit message or tweak it to something like
"Add optional timeout parameter to pg_terminate_backend".
<!--
Author: Magnus Hagander <magnus@hagander.net>
2021-04-08 [aaf043257] Add functions to wait for backend termination
-->

With Regards,
Bharath Rupireddy.

Attachment

v2-0001-Remove-pg_wait_for_backend_termination-function.patch

Re: A new function to wait for the backend exit after termination

From

Noah Misch

Date:

15 June 2021, 00:40:40

On Mon, Jun 14, 2021 at 07:34:59PM +0530, Bharath Rupireddy wrote:
> Thanks. +1 to remove the pg_wait_for_backend_termination function. The
> patch basically looks good to me. I'm attaching an updated patch. I
> corrected a minor typo in the commit message, took docs and code
> comment changes suggested by Justin Pryzby.

Pushed as two commits.  I used your log message typo fix.  Here were the diffs
in your v2 and not in an earlier patch:

> -+       Add an optional wait parameter to <link
> ++       Add an optional timeout parameter to <link

I used this.

> -+    int            timeout; /* milliseconds */
> ++    int            timeout;  /* milliseconds */

pgindent chooses a third option, so I ran pgindent instead of using this.

> Please note that release notes still have the headline "Add functions
> to wait for backend termination" of the original commit that added the
> pg_wait_for_backend_termination. With the removal of it, I'm not quite
> sure if we retain the the commit message or tweak it to something like
> "Add optional timeout parameter to pg_terminate_backend".
> <!--
> Author: Magnus Hagander <magnus@hagander.net>
> 2021-04-08 [aaf043257] Add functions to wait for backend termination
> -->

That part is supposed to mirror "git log --pretty=format:%s", no matter what
happens later.  The next set of release note updates might add my latest
commit (5f1df62) to this SGML comment, on another line.