Thread: Restart pg_usleep when interrupted

Restart pg_usleep when interrupted

From
"Imseih (AWS), Sami"
Date:

Hi,

 

In the proposal by Bertrand [1] to implement vacuum cost delay tracking

in pg_stat_progress_vacuum, it was discovered that the vacuum cost delay

ends early on the leader process of a parallel vacuum due to parallel workers

reporting progress on index vacuuming, which was introduced in 17

with commit [2]. With this patch, everytime a parallel worker

completes a vacuum index, it will send a completion message to the leader.

 

The facility that allows a parallel worker to report progress to the leader was

introduced in commit [3].

 

In the case of the patch being proposed by Bertrand, the number of interrupts

will be much more frequent as parallel workers would send a message to the leader

to update the vacuum delay counters every vacuum_delay_point call.

 

Looking into this, the ideal solution appears to provide the ability for a pg_usleep

call to restart after being interrupted. Thanks to the usage of nanosleep as

of commit[4], this should be trivial to do as nanosleep

provides a remaining time, which can be used to restart the sleep. This

idea was also mentioned in [5].

 

I am attaching a POC patch that does the $SUBJECT. Rather than changing the

existing pg_usleep, the patch introduces a function, pg_usleep_handle_interrupt,

that takes in the sleep duration and a boolean flag to force handling of the

interrupt, or not.

 

This function can replace pg_usleep inside vacuum_delay_point, and could be

useful in other cases in which it’s important to handle interrupts.

 

For Windows, the “force” = “true” flag of the new function uses SleepEx which

from what I can tell based on the code comments is a non-interruptible sleep.

 

Thoughts?

 

[1] https://www.postgresql.org/message-id/ZnbIIUQpFJKAJx2W%40ip-10-97-1-34.eu-west-3.compute.internal

[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=46ebdfe164c61fbac961d1eb7f40e9a684289ae6

[3] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f1889729dd3ab0352dc0ccc2ffcc1b1901f8e39f

[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a948e49e2ef11815be0b211723bfc5b53b7f75a8

[5] https://www.postgresql.org/message-id/24848.1473607575%40sss.pgh.pa.us

 

 

Regards,

 

Sami Imseih

Amazon Web Services (AWS)

 

 

 

 

 

Re: Restart pg_usleep when interrupted

From
Robert Haas
Date:
Hi,

I think you need to find a way to disable this "Attachment protected
by Amazon" thing:

http://postgr.es/m/01000190606e3d2a-116ead16-84d2-4449-8d18-5053da66b1f4-000000@email.amazonses.com

We want patches to be in the pgsql-hackers archives, not temporarily
accessible via some link.

...Robert



Re: Restart pg_usleep when interrupted

From
"Imseih (AWS), Sami"
Date:
> I think you need to find a way to disable this "Attachment protected
> by Amazon" thing:

Yes, I am looking into that. I only noticed after sending the email.

Sorry about that.

Sami 






Re: Restart pg_usleep when interrupted

From
Sami Imseih
Date:
> We want patches to be in the pgsql-hackers archives, not temporarily
> accessible via some link.
>
> …Robert
>

Moving to another email going forward.

Reattaching the patch.




Regards,

Sami Imseih
Amazon Web Services
Attachment

Re: Restart pg_usleep when interrupted

From
Tom Lane
Date:
Sami Imseih <samimseih@gmail.com> writes:
> Reattaching the patch.

I feel like this is fundamentally a wrong solution, for the reasons
cited in the comment for pg_usleep: long sleeps are a bad idea
because of the resulting uncertainty about whether we'll respond to
interrupts and such promptly.  An example here is that if we get
a query cancel interrupt, we should probably not insist on finishing
out the current sleep before responding.

Therefore, rather than "improving" pg_usleep (and uglifying its API),
the right answer is to fix parallel vacuum leaders to not depend on
pg_usleep in the first place.  A better idea might be to use
pg_sleep() or equivalent code.

            regards, tom lane



Re: Restart pg_usleep when interrupted

From
Thomas Munro
Date:
On Sat, Jun 29, 2024 at 9:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Therefore, rather than "improving" pg_usleep (and uglifying its API),
> the right answer is to fix parallel vacuum leaders to not depend on
> pg_usleep in the first place.  A better idea might be to use
> pg_sleep() or equivalent code.

In case it's useful for someone looking into that, in earlier
discussions we figured out that it is possible to have high resolution
timeouts AND support latch multiplexing on Linux, FreeBSD, macOS:

https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BhC9mFx8tEcBsyo7-cAfWgtbRy1eDizeFuff2K7T%3D4bA%40mail.gmail.com



Re: Restart pg_usleep when interrupted

From
Sami Imseih
Date:

Thanks for the feedback!

On Jun 28, 2024, at 4:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Sami Imseih <samimseih@gmail.com> writes:
Reattaching the patch.

I feel like this is fundamentally a wrong solution, for the reasons
cited in the comment for pg_usleep: long sleeps are a bad idea
because of the resulting uncertainty about whether we'll respond to
interrupts and such promptly.  An example here is that if we get
a query cancel interrupt, we should probably not insist on finishing
out the current sleep before responding.

The case which brought up this discussion is the pg_usleep that 
is called within the vacuum_delay_point being interrupted. 

When I read the same code comment you cited, it sounded to me 
that “long sleeps” are those that are in seconds or minutes. The longest 
vacuum delay allowed is 100ms.


Therefore, rather than "improving" pg_usleep (and uglifying its API),
the right answer is to fix parallel vacuum leaders to not depend on
pg_usleep in the first place.  A better idea might be to use
pg_sleep() or equivalent code.

Yes, that is a good idea to explore and it will not require introducing
an awkward new API. I will look into using something similar to
pg_sleep.

Regards,

Sami