Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Should vacuum process config file reload more often
Date
Msg-id CAAKRu_a39kpGO_SxWEGAHMEHFAOACk3wwQJhRBaN+7Ke9ua_qg@mail.gmail.com
Whole thread Raw
In response to Re: Should vacuum process config file reload more often  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Should vacuum process config file reload more often  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On Fri, Mar 24, 2023 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Mar 24, 2023 at 9:27 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> > > Do we need to calculate the number of workers running with
> > > nworkers_for_balance by iterating over the running worker list? I
> > > guess autovacuum workers can increment/decrement it at the beginning
> > > and end of vacuum.
> >
> > I don't think we can do that because if a worker crashes, we have no way
> > of knowing if it had incremented or decremented the number, so we can't
> > adjust for it.
>
> What kind of crash are you concerned about? If a worker raises an
> ERROR, we can catch it in PG_CATCH() block. If it's a FATAL, we can do
> that in FreeWorkerInfo(). A PANIC error ends up crashing the entire
> server.

Yes, but what about a worker that segfaults? Since table AMs can define
relation_vacuum(), this seems like a real possibility.

I'll address your other code feedback in the next version.

I realized nworkers_for_balance should be initialized to 0 and not 1 --
1 is misleading since there are often 0 autovac workers. We just never
want to use nworkers_for_balance when it is 0. But, workers put a floor
of 1 on the number when they divide limit/nworkers_for_balance (since
they know there must be at least one worker right now since they are a
worker). I thought about whether or not they should call
autovac_balance_cost() if they find that nworkers_for_balance is 0 when
updating their own limit, but I'm not sure.

> > > I need another few readthroughs to figure out of VacuumFailsafeActive does what
> > > I think it does, and should be doing, but in general I think this is a good
> > > idea and a patch in good condition close to being committable.
>
> Another approach would be to make VacuumCostActive a ternary value:
> on, off, and never. When we trigger the failsafe mode we switch it to
> never, meaning that it never becomes active even after reloading the
> config file. A good point is that we don't need to add a new global
> variable, but I'm not sure it's better than the current approach.

Hmm, this is interesting. I don't love the word "never" since it kind of
implies a duration longer than the current table being vacuumed. But we
could find a different word or just document it well. For clarity, we
might want to call it failsafe_mode or something.

I wonder if the primary drawback to converting
LVRelState->failsafe_active to a global VacuumFailsafeActive is just the
general rule of limiting scope to the minimum needed.

- Melanie



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Make EXPLAIN generate a generic plan for a parameterized query
Next
From: Andres Freund
Date:
Subject: Re: Transparent column encryption