On Jun 28, 2024, at 4:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Sami Imseih <samimseih@gmail.com> writes:
Reattaching the patch.
I feel like this is fundamentally a wrong solution, for the reasons cited in the comment for pg_usleep: long sleeps are a bad idea because of the resulting uncertainty about whether we'll respond to interrupts and such promptly. An example here is that if we get a query cancel interrupt, we should probably not insist on finishing out the current sleep before responding.
The case that prompted this discussion is the pg_usleep call
inside vacuum_delay_point being interrupted.
When I read the same code comment you cited, I took “long sleeps”
to mean sleeps that last seconds or minutes. The longest
vacuum delay allowed is 100ms.
Therefore, rather than "improving" pg_usleep (and uglifying its API), the right answer is to fix parallel vacuum leaders to not depend on pg_usleep in the first place. A better idea might be to use pg_sleep() or equivalent code.
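
To make the trade-off concrete, here is a minimal standalone sketch of the kind of loop being suggested: wait against a fixed deadline, recompute the remaining time after every wakeup, and stop early if a cancel is pending. It uses plain POSIX calls rather than PostgreSQL internals. The names cancel_requested, now_us, sleep_with_resume and the 10ms slice are hypothetical, chosen only for illustration. As far as I understand it, pg_sleep() waits on the process latch and checks for interrupts instead of polling a flag, so treat this as an analogy and not as the backend's code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a backend's "cancel pending" flag. */
static volatile sig_atomic_t cancel_requested = 0;

/* Longest single wait, so a pending cancel is noticed within ~10ms. */
#define MAX_SLICE_US 10000LL

/* Current monotonic time in microseconds. */
static long long
now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long) ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/*
 * Sleep until total_us microseconds have elapsed, resuming after any
 * early wakeup.  Returns true if the full delay elapsed, false if a
 * cancel was requested first.
 */
static bool
sleep_with_resume(long long total_us)
{
    long long   end;

    assert(total_us >= 0);
    end = now_us() + total_us;

    for (;;)
    {
        long long       remaining;
        struct timespec req;

        /* Respond to a cancel instead of finishing out the sleep. */
        if (cancel_requested)
            return false;

        /* Recompute from the deadline, so wakeups never shorten the delay. */
        remaining = end - now_us();
        if (remaining <= 0)
            return true;
        if (remaining > MAX_SLICE_US)
            remaining = MAX_SLICE_US;

        req.tv_sec = (time_t) (remaining / 1000000LL);
        req.tv_nsec = (long) ((remaining % 1000000LL) * 1000LL);

        /* May return early (EINTR) on a signal; the loop then re-evaluates. */
        (void) nanosleep(&req, NULL);
    }
}
```

With this shape, a signal that wakes the sleeper early costs nothing: the loop goes back to sleep for whatever time is left. A cancel request is honored within one slice instead of after the whole delay.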
Yes, that is a good idea to explore and it will not require introducing
an awkward new API. I will look into using something similar to