Home > mailing lists

Re: Restart pg_usleep when interrupted - Mailing list pgsql-hackers

From	Bertrand Drouvot
Subject	Re: Restart pg_usleep when interrupted
Date	July 2, 2024 09:05:40
Msg-id	ZoPC5IeP4k7sZpki@ip-10-97-1-34.eu-west-3.compute.internal Whole thread
In response to	Re: Restart pg_usleep when interrupted (Sami Imseih <samimseih@gmail.com>)
Responses	Re: Restart pg_usleep when interrupted
List	pgsql-hackers

Tree view

Hi,

On Fri, Jun 28, 2024 at 05:50:20PM -0500, Sami Imseih wrote:
> 
> Thanks for the feedback!
> 
> > On Jun 28, 2024, at 4:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > 
> > Sami Imseih <samimseih@gmail.com> writes:
> >> Reattaching the patch.
> > 
> > I feel like this is fundamentally a wrong solution, for the reasons
> > cited in the comment for pg_usleep: long sleeps are a bad idea
> > because of the resulting uncertainty about whether we'll respond to
> > interrupts and such promptly.  An example here is that if we get
> > a query cancel interrupt, we should probably not insist on finishing
> > out the current sleep before responding.
> 
> The case which brought up this discussion is the pg_usleep that 
> is called within the vacuum_delay_point being interrupted. 
> 
> When I read the same code comment you cited, it sounded to me 
> that “long sleeps” are those that are in seconds or minutes. The longest 
> vacuum delay allowed is 100ms.

I think that with the proposed patch the actual wait time can be "long".

Indeed, the time between the interruptions and restarts of the nanosleep() call
will lead to drift (as mentioned in the nanosleep() man page). So, with a large
number of interruptions, the actual wait time could be way longer than the
expected wait time.

To put numbers, I did a few tests with the patch (and with v2 shared in [1]):

cost delay is 1ms and cost limit is 10.

With 50 indexes and 10 parallel workers I can see things like:

2024-07-02 08:22:23.789 UTC [2189616] LOG:  expected 1.000000, actual 239.378368
2024-07-02 08:22:24.575 UTC [2189616] LOG:  expected 0.100000, actual 224.331737
2024-07-02 08:22:25.363 UTC [2189616] LOG:  expected 1.300000, actual 230.462793
2024-07-02 08:22:26.154 UTC [2189616] LOG:  expected 1.000000, actual 225.980803

Means we waited more than the max allowed cost delay (100ms).

With 49 parallel workers, it's worst as I can see things like:

2024-07-02 08:26:36.069 UTC [2189807] LOG:  expected 1.000000, actual 1106.790136
2024-07-02 08:26:36.298 UTC [2189807] LOG:  expected 1.000000, actual 218.148985

The first actual wait time is about 1 second (it has been interrupted about
16300 times during this second).

To avoid this drift, the nanosleep() man page suggests to use clock_nanosleep()
with an absolute time value, that might be another idea to explore.

[1]: https://www.postgresql.org/message-id/flat/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

pgsql-hackers by date:

From: "Joel Jacobson"
Date: 02 July 2024, 09:05:03
Subject: Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

From: shveta malik
Date: 02 July 2024, 09:09:49
Subject: Re: Conflict Detection and Resolution

Re: Restart pg_usleep when interrupted - Mailing list pgsql-hackers

Previous

Next