Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Should vacuum process config file reload more often
Msg-id CAAKRu_aX+MLn32kPnAQ30WChbiLTkKKmY9e8+B1ZY1BokpQZYA@mail.gmail.com
In response to Re: Should vacuum process config file reload more often  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Should vacuum process config file reload more often
List pgsql-hackers
On Wed, Apr 5, 2023 at 3:43 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > + /*
> > + * Balance and update limit values for autovacuum workers. We must
> > + * always do this in case the autovacuum launcher or another
> > + * autovacuum worker has recalculated the number of workers across
> > + * which we must balance the limit. This is done by the launcher when
> > + * launching a new worker and by workers before vacuuming each table.
> > + */
> >
> > I don't quite understand what's going on here. A big reason that I'm
> > worried about this whole issue in the first place is that sometimes
> > there's a vacuum going on a giant table and you can't get it to go
> > fast. You want it to absorb new settings, and to do so quickly. I
> > realize that this is about the number of workers, not the actual cost
> > limit, so that makes what I'm about to say less important. But ... is
> > this often enough? Like, the time before we move onto the next table
> > could be super long. The time before a new worker is launched should
> > be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> > settings, so that's not horrible, but I'm kind of struggling to
> > understand the rationale for this particular choice. Maybe it's fine.
>
> VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
> happen if a config reload is pending the next time vacuum_delay_point()
> is called (which is pretty often -- roughly once per block vacuumed but
> definitely more than once per table).
>
> Relevant code is at the top of vacuum_delay_point():
>
>     if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
>     {
>         ConfigReloadPending = false;
>         ProcessConfigFile(PGC_SIGHUP);
>         VacuumUpdateCosts();
>     }
>

Gah, I think I misunderstood you. You are saying that calling
AutoVacuumUpdateCostLimit() only after napping while vacuuming a table may
not be often enough, since the number of workers will likely change at a
different frequency. That is a good point. It is kind of weird to call
AutoVacuumUpdateCostLimit() only after napping...

Hmm. Well, I don't think we want to call AutoVacuumUpdateCostLimit() on
every call to vacuum_delay_point(), though, do we? It includes two
atomic operations. Maybe that pales in comparison to what we are doing
on each page we are vacuuming. I haven't properly thought about it.
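
To make the cost concern concrete, here is a minimal standalone sketch of
what a balanced-limit computation with two atomic reads could look like. It
uses C11 <stdatomic.h> rather than PostgreSQL's own atomics, and every name
in it (balance_enabled, nworkers_for_balance, balanced_cost_limit) is a
hypothetical stand-in. Which two atomic operations the patch actually
performs is an assumption here: one flag check and one read of a shared
worker count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for shared state. These are NOT PostgreSQL's
 * real structures; they only model "a flag plus a shared counter".
 */
static atomic_bool balance_enabled = true;    /* does this worker balance? */
static atomic_uint nworkers_for_balance = 3;  /* workers sharing the limit */

/*
 * Divide base_limit evenly among the workers taking part in balancing.
 * Costs two atomic loads per call when balancing is enabled.
 */
static int
balanced_cost_limit(int base_limit)
{
    unsigned int nworkers;
    int          limit;

    assert(base_limit > 0);

    /* Atomic operation 1: skip balancing if it is disabled for us. */
    if (!atomic_load(&balance_enabled))
        return base_limit;

    /* Atomic operation 2: read how many workers share the limit. */
    nworkers = atomic_load(&nworkers_for_balance);
    if (nworkers == 0)
        return base_limit;

    /* Never let the per-worker limit drop below 1. */
    limit = base_limit / (int) nworkers;
    return (limit > 0) ? limit : 1;
}
```

With a base limit of 200 and three workers, balanced_cost_limit(200) gives
each worker 66. The per-call cost is just the two loads, which is the
overhead being weighed against the per-page work above.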

Is there some other relevant condition we can use to determine whether
or not to call AutoVacuumUpdateCostLimit() on a given invocation of
vacuum_delay_point()? Maybe something with naptime/max workers?
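
One way to picture a "naptime/max workers" condition is a simple time-based
throttle: only rebalance when at least autovacuum_naptime divided by
autovacuum_max_workers has elapsed since the last rebalance. The sketch
below is purely illustrative and not something from the patch. It assumes
the default settings (autovacuum_naptime = 60s, autovacuum_max_workers = 3,
hence the ~20s figure quoted above), and rebalance_due() and
last_rebalance_ms are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed defaults: autovacuum_naptime = 60s, autovacuum_max_workers = 3. */
#define NAPTIME_MS   60000
#define MAX_WORKERS  3

/* Hypothetical per-worker state: time (ms) of the last rebalance. */
static int64_t last_rebalance_ms = 0;

/*
 * Return true, and record now_ms, if at least naptime/max_workers
 * milliseconds have passed since the last rebalance. Otherwise return
 * false and leave the recorded time untouched.
 */
static bool
rebalance_due(int64_t now_ms)
{
    const int64_t interval_ms = NAPTIME_MS / MAX_WORKERS;  /* 20000 ms */

    assert(now_ms >= last_rebalance_ms);

    if (now_ms - last_rebalance_ms < interval_ms)
        return false;

    last_rebalance_ms = now_ms;
    return true;
}
```

A caller would check rebalance_due(now) and only then do the limit update.
Whether reading a clock on each call is actually cheaper than the two atomic
operations is not something this sketch establishes.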

I'm not sure if there is a more reliable place than vacuum_delay_point()
for us to do this. I poked around heap_vacuum_rel(), but I think we
would want this cost limit update to happen table AM-agnostically.

Thank you for bringing this up!

- Melanie


