On Fri, Mar 10, 2023 at 11:37 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Thu, Mar 09, 2023 at 05:27:08PM -0500, Tom Lane wrote:
> > Is it reasonable to assume that all modern platforms can time
> > millisecond delays accurately? Ten years ago I'd have suggested
> > truncating the delay to a multiple of 10msec and using this logic
> > to track the remainder, but maybe now that's unnecessary.
>
> If so, it might also be worth updating or removing this comment in
> pgsleep.c:
>
> * NOTE: although the delay is specified in microseconds, the effective
> * resolution is only 1/HZ, or 10 milliseconds, on most Unixen. Expect
> * the requested delay to be rounded up to the next resolution boundary.
>
> I've had doubts for some time about whether this is still accurate...

What I see with the old select(), or a more modern clock_nanosleep()
call, is that Linux, FreeBSD and macOS are happy sleeping for 0.1ms,
0.5ms, 1ms, 2ms and 3ms, and through inaccuracies, scheduling overheads
etc. it works out to about 5-25% extra sleep time (I expect that can be
affected by choice of time source/available hardware, and perhaps
various system calls use different tricks). I definitely recall the
behaviour described, back in the old days when more stuff was
scheduler-tick based. I have no clue for Windows; quick googling
tells me that it might still be pretty chunky, unless you do certain
other stuff that I didn't follow up; we could probably get more
accurate sleep times by rummaging through ntdll.dll. It would be good to
find out how well WaitEventSet does on Windows; perhaps we should have
a little timing accuracy test in the tree to collect build farm data?
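
To make the timing test idea a bit more concrete, here is a rough,
POSIX-only sketch of what such a test might measure. It is not code
from the tree: the names now_ns(), measure_sleep_us() and
average_overshoot_percent() are made up for illustration, it uses plain
nanosleep() rather than WaitEventSet, and a Windows version would need
different calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Read the monotonic clock as nanoseconds. */
static int64_t
now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/*
 * Sleep for request_us microseconds with nanosleep(), restarting after
 * signal interruptions, and return how long the sleep really took, in
 * microseconds.  (Hypothetical helper, not an existing function.)
 */
static double
measure_sleep_us(long request_us)
{
    struct timespec req;
    int64_t     start;
    int64_t     end;

    assert(request_us > 0);

    req.tv_sec = request_us / 1000000;
    req.tv_nsec = (request_us % 1000000) * 1000;

    start = now_ns();
    while (nanosleep(&req, &req) == -1 && errno == EINTR)
        ;                       /* req now holds the time left; retry */
    end = now_ns();

    return (end - start) / 1000.0;
}

/* Average overshoot, in percent, over "loops" sleeps of request_us each. */
static double
average_overshoot_percent(long request_us, int loops)
{
    double      total_us = 0.0;

    assert(loops > 0);

    for (int i = 0; i < loops; i++)
        total_us += measure_sleep_us(request_us);

    return (total_us / loops - request_us) * 100.0 / request_us;
}
```

Calling average_overshoot_percent() for 100, 500, 1000, 2000 and 3000
microseconds and printing the results would give the kind of numbers
quoted above; what a given build farm animal reports will depend on its
kernel and time source.
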
FWIW epoll has a newer epoll_pwait2() call that takes a higher
resolution timeout argument, and Windows WaitEventSet could also do
high-res timers if you add timer events rather than using the timeout
argument, and I guess conceptually even the old poll() thing could do
the equivalent with a signal alarm timer. But very short sleeps sound
a lot like a bad idea to me, burning so much CPU on scheduling. I kinda
wonder if the 10ms + residual thing might even turn out to be a better
idea... but I dunno.
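
For comparison, the "sleep in coarse chunks and carry the remainder"
idea can be sketched as below. This is only an illustration under my
own assumptions, not the proposed patch: DelayAccumulator and
delay_to_sleep_us() are hypothetical names, and the granularity is a
parameter so the same logic covers both the 1ms and the 10ms variant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: delay requested so far but not yet slept. */
typedef struct DelayAccumulator
{
    int64_t     granularity_us; /* e.g. 1000 for 1ms, 10000 for 10ms */
    int64_t     residual_us;    /* always less than granularity_us */
} DelayAccumulator;

/*
 * Add a requested delay and return how long to sleep right now: the
 * largest multiple of the granularity accumulated so far.  Whatever is
 * left over is carried into the next call.
 */
static int64_t
delay_to_sleep_us(DelayAccumulator *acc, int64_t request_us)
{
    int64_t     total_us;
    int64_t     sleep_us;

    assert(acc->granularity_us > 0 && request_us >= 0);

    total_us = acc->residual_us + request_us;
    sleep_us = total_us - total_us % acc->granularity_us;
    acc->residual_us = total_us - sleep_us;

    return sleep_us;
}
```

With a 10ms granularity, four requests of 2.5ms each produce no sleep
for the first three calls and one 10ms sleep on the fourth, so the
total delay is preserved while the number of sleeps drops.
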
The 1ms residual thing looks pretty good to me as a fix to the
immediate problem report, but we might also want to adjust the wording
in config.sgml?