Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken
Date
Msg-id 705469.1678409199@sss.pgh.pa.us
In response to Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@gmail.com> writes:
> Erm, but maybe I'm just looking at this too myopically.  Is there
> really any point in letting people set it to 0.5, if it behaves as if
> you'd set it to 1 and doubled the cost limit?  Isn't it just more
> confusing?  I haven't read the discussion from when fractional delays
> came in, where I imagine that must have come up...

At [1] I argued

>> The reason is this: what we want to do is throttle VACUUM's I/O demand,
>> and by "throttle" I mean "gradually reduce".  There is nothing gradual
>> about issuing a few million I/Os and then sleeping for many milliseconds;
>> that'll just produce spikes and valleys in the I/O demand.  Ideally,
>> what we'd have it do is sleep for a very short interval after each I/O.
>> But that's not too practical, both for code-structure reasons and because
>> most platforms don't give us a way to so finely control the length of a
>> sleep.  Hence the design of sleeping for a while after every so many I/Os.
>> 
>> However, the current settings are predicated on the assumption that
>> you can't get the kernel to give you a sleep of less than circa 10ms.
>> That assumption is way outdated, I believe; poking around on systems
>> I have here, the minimum delay time using pg_usleep(1) seems to be
>> generally less than 100us, and frequently less than 10us, on anything
>> released in the last decade.
>> 
>> I propose therefore that instead of increasing vacuum_cost_limit,
>> what we ought to be doing is reducing vacuum_cost_delay by a similar
>> factor.  And, to provide some daylight for people to reduce it even
>> more, we ought to arrange for it to be specifiable in microseconds
>> not milliseconds.  There's no GUC_UNIT_US right now, but it's time.

That last point was later overruled in favor of keeping it measured in
msec to avoid breaking existing configuration files.  Nonetheless,
vacuum_cost_delay *is* an actual time to wait (conceptually at least),
not just part of a unitless ratio; and there seem to be good arguments
in favor of letting people make it small.
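
As an illustration of why the delay is more than half of a ratio, here is a minimal sketch of cost-based throttling. It is a simplified model and not PostgreSQL's actual code: the function name `simulate`, the fixed `cost_per_io` of 10, and the count of 2000 I/Os are all hypothetical, and the model ignores that the real cost per page varies. It compares a 0.5 ms delay at a given cost limit with a 1 ms delay at double that limit.

```python
def simulate(cost_limit, cost_delay_ms, cost_per_io=10, total_ios=2000):
    """Toy model: charge a fixed cost per I/O and sleep once the
    accumulated balance reaches cost_limit.

    Returns (total_sleep_ms, ios_per_burst, number_of_sleeps).
    All parameter values are illustrative assumptions.
    """
    balance = 0          # cost accumulated since the last sleep
    slept_ms = 0.0       # total (simulated) time spent sleeping
    sleeps = 0           # how many times we slept
    burst = 0            # I/Os issued since the last sleep
    longest_burst = 0    # largest run of back-to-back I/Os

    for _ in range(total_ios):
        balance += cost_per_io
        burst += 1
        if balance >= cost_limit:
            # Simulated sleep: we only add up the time, nothing blocks.
            slept_ms += cost_delay_ms
            sleeps += 1
            longest_burst = max(longest_burst, burst)
            balance = 0
            burst = 0

    return slept_ms, longest_burst, sleeps


if __name__ == "__main__":
    for limit, delay in [(200, 0.5), (400, 1.0)]:
        total, burst, sleeps = simulate(limit, delay)
        print(f"limit={limit} delay={delay}ms: "
              f"slept {total}ms in {sleeps} sleeps, {burst} I/Os per burst")
```

Under these assumptions both settings sleep for the same total time, so the average I/O rate is the same, but the 1 ms setting issues twice as many I/Os back to back before each sleep. That burst size is what the shorter delay smooths out.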

I take your point that really short sleeps are inefficient so far as the
scheduling overhead goes.  But on modern machines you probably have to get
down to a not-very-large number of microseconds before that's a big deal.
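
One rough way to see where that overhead starts to matter on a given machine is to time very short sleeps, in the spirit of the pg_usleep(1) measurement quoted above. The sketch below is only an approximation: it uses Python's `time.sleep` rather than pg_usleep, so the numbers include interpreter overhead, and the helper name `min_sleep_overhead_us` and the trial count are hypothetical.

```python
import time


def min_sleep_overhead_us(requested_s=1e-6, trials=200):
    """Return the shortest observed wall-clock duration, in microseconds,
    of a sleep that requests `requested_s` seconds.

    The minimum over many trials approximates the floor imposed by the
    kernel timer and scheduler (plus Python's own call overhead).
    """
    best_ns = float("inf")
    for _ in range(trials):
        start = time.perf_counter_ns()
        time.sleep(requested_s)
        elapsed = time.perf_counter_ns() - start
        best_ns = min(best_ns, elapsed)
    return best_ns / 1000.0


if __name__ == "__main__":
    print(f"shortest 1us sleep took about {min_sleep_overhead_us():.1f} us")
```

If the shortest achievable sleep is already a sizable fraction of the configured delay, shrinking the delay further buys little and mostly adds scheduling work.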

            regards, tom lane

[1] https://www.postgresql.org/message-id/28720.1552101086%40sss.pgh.pa.us


