Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken
Msg-id CA+hUKGL=OkAsHBS_TH3v3SRCi3AZd9r2+8PpJ4DR=P9xvnhF5Q@mail.gmail.com
In response to Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Tue, Mar 14, 2023 at 12:10 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> >   * NOTE: although the delay is specified in microseconds, the effective
> > - * resolution is only 1/HZ, or 10 milliseconds, on most Unixen.  Expect
> > - * the requested delay to be rounded up to the next resolution boundary.
> > + * resolution is only 1/HZ on systems that use periodic kernel ticks to wake
> > + * up.  This may cause sleeps to be rounded up by 1-20 milliseconds on older
> > + * Unixen and Windows.
>
> nitpick: Could the 1/HZ versus 20 milliseconds discrepancy cause confusion?
> Otherwise, I think this is the right idea.

Better words welcome.  1-20ms summarises the range I actually measured,
and if reports are correct about Windows' HZ=64 (1/HZ = 15.625ms) then
it neatly covers that too, so I don't feel too bad about not chasing
down the reason for the 10ms/20ms discrepancy.  Maybe I looked at the
wrong HZ number (which you can change anyway); I'm not too used to
NetBSD.  BTW they have a project plan to go tickless, which would fix
that: https://wiki.netbsd.org/projects/project/tickless/
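
To make the 1/HZ arithmetic concrete, here is a rough sketch of the
rounding model the comment describes.  It is a simplified model only:
the function names are made up for illustration, and a real kernel may
add a further tick or behave differently, so treat the numbers as
indicative rather than as measurements.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Length of one periodic kernel tick in microseconds, rounded up.
 * "hz" is the tick rate (hypothetical parameter for this sketch).
 */
int64_t
tick_period_usec(int64_t hz)
{
    assert(hz > 0);
    return (1000000 + hz - 1) / hz;
}

/*
 * Simplified model: a requested sleep is rounded up to the next
 * multiple of the tick period.  Real schedulers may differ.
 */
int64_t
round_up_to_tick(int64_t requested_usec, int64_t hz)
{
    int64_t period = tick_period_usec(hz);

    return ((requested_usec + period - 1) / period) * period;
}
```

Under this model, tick_period_usec(64) gives 15625 (the 15.625ms
reported for Windows), and a 2ms request at HZ=100 comes back as
round_up_to_tick(2000, 100), i.e. 10000 microseconds.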

> > + * CAUTION: if interrupted by a signal, this function will return, but its
> > + * interface doesn't report that.  It's not a good idea to use this
> > + * for long sleeps in the backend, because backends are expected to respond to
> > + * interrupts promptly.  Better practice for long sleeps is to use WaitLatch()
> > + * with a timeout.
>
> I'm not sure this argument follows.  If pg_usleep() returns if interrupted,
> then why are we concerned about delayed responses to interrupts?

Because you can't rely on it:

1.  Maybe the signal is delivered just before pg_usleep() begins, and
a handler sets some flag we would like to react to.  Now pg_usleep()
will not be interrupted.  That problem is solved by using latches
instead.
2.  Maybe the signal is one that is no longer handled by a handler at
all; these days, latches use SIGURG, which pops out when you read a
signalfd or kqueue, so pg_usleep() will not wake up.  That problem is
solved by using latches instead.

(The word "interrupt" is a bit overloaded, which doesn't help with
this discussion.)
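
A standalone sketch of point 1, for anyone who wants to see the race
outside the backend.  This is not PostgreSQL code: a self-pipe plus
poll() stands in for a latch, SIGUSR1 is picked arbitrarily, and all
the names (demo_setup, plain_sleep_was_interrupted, pipe_wait_was_woken,
demo_flag_is_set) are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

static int wake_pipe[2] = {-1, -1};
static volatile sig_atomic_t got_signal = 0;

/* Handler: set a flag and leave a wakeup byte in the pipe. */
static void
handle_sigusr1(int signo)
{
    int save_errno = errno;

    (void) signo;
    got_signal = 1;
    if (write(wake_pipe[1], "x", 1) < 0)
    {
        /* pipe full: a wakeup is already pending, nothing to do */
    }
    errno = save_errno;
}

/* Create the nonblocking pipe and install the handler; 0 on success. */
int
demo_setup(void)
{
    struct sigaction sa;

    if (pipe(wake_pipe) != 0)
        return -1;
    if (fcntl(wake_pipe[0], F_SETFL, O_NONBLOCK) != 0 ||
        fcntl(wake_pipe[1], F_SETFL, O_NONBLOCK) != 0)
        return -1;
    sa.sa_handler = handle_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 if the handler has run since startup. */
int
demo_flag_is_set(void)
{
    return got_signal != 0;
}

/* Plain sleep: 1 if cut short by a signal (EINTR), 0 if it ran in full. */
int
plain_sleep_was_interrupted(long millis)
{
    struct timespec delay;

    delay.tv_sec = millis / 1000;
    delay.tv_nsec = (millis % 1000) * 1000000L;
    return (nanosleep(&delay, NULL) == -1 && errno == EINTR) ? 1 : 0;
}

/* Latch-like wait: 1 if a wakeup byte was pending or arrived, 0 on timeout. */
int
pipe_wait_was_woken(int timeout_ms)
{
    struct pollfd pfd;
    char        buf[16];
    int         rc;

    assert(wake_pipe[0] >= 0);  /* demo_setup() must have been called */
    pfd.fd = wake_pipe[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);  /* retry; the byte is in the pipe */
    if (rc <= 0)
        return 0;
    while (read(wake_pipe[0], buf, sizeof(buf)) > 0)
        ;                       /* drain pending wakeup bytes */
    return 1;
}
```

If you call demo_setup(), then raise(SIGUSR1), and only then
plain_sleep_was_interrupted(50), the sleep runs its full 50ms even
though the flag is already set, because the signal arrived before the
sleep began.  pipe_wait_was_woken(50) in the same situation returns
straight away, since the wakeup is remembered in the pipe.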

> > -             delay.tv_usec = microsec % 1000000L;
> > -             (void) select(0, NULL, NULL, NULL, &delay);
> > +             delay.tv_nsec = (microsec % 1000000L) * 1000;
> > +             (void) nanosleep(&delay, NULL);
>
> Using nanosleep() seems reasonable to me.

Thanks for looking!
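
For reference, a minimal standalone sketch of the microsecond to
timespec conversion in the hunk above.  The tv_nsec line mirrors the
patch; the tv_sec line and the function names (usec_to_timespec,
sleep_usec) are my assumptions for the example, not text from the
quoted diff.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Split a microsecond count into whole seconds plus nanoseconds. */
struct timespec
usec_to_timespec(long microsec)
{
    struct timespec delay;

    assert(microsec >= 0);
    delay.tv_sec = microsec / 1000000L;             /* assumed, not in the quoted hunk */
    delay.tv_nsec = (microsec % 1000000L) * 1000;   /* as in the patch */
    return delay;
}

/*
 * Sleep for the given number of microseconds.  Returns 0 after a full
 * sleep, or -1 with errno set to EINTR if a signal handler interrupted it.
 */
int
sleep_usec(long microsec)
{
    struct timespec delay = usec_to_timespec(microsec);

    return nanosleep(&delay, NULL);
}
```

The remainder is always below 1000000, so tv_nsec stays under the
999999999 limit that nanosleep() requires.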


