Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Should vacuum process config file reload more often
Date
Msg-id CAD21AoB=EC6wzjjqDXKmNEhu_BhGg0g_Gc50yh_zvXqiDC_OSg@mail.gmail.com
Whole thread Raw
In response to Re: Should vacuum process config file reload more often  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Should vacuum process config file reload more often
Re: Should vacuum process config file reload more often
List pgsql-hackers
On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > Users may wish to speed up long-running vacuum of a large table by
> > > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > > config file is only reloaded between tables (for autovacuum) or after
> > > the statement (for explicit vacuum). This has been brought up for
> > > autovacuum in [1].
> > >
> > > Andres suggested that it might be possible to check ConfigReloadPending
> > > in vacuum_delay_point(), so I thought I would draft a rough patch and
> > > start a discussion.
> >
> > In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
>
> Yes, good point. Thank you!
>
> On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I don't think I can set and leave vac_in_outer_xact the way I am doing
> > it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
> > which I believe is reachable from codepaths that would not have called
> > vacuum(). It seems that if a backend sets it, the outer transaction
> > commits, and then the backend ends up calling vacuum_delay_point() in a
> > different way later, it wouldn't be quite right.
>
> Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
> section in vacuum() to avoid this problem.
>
> On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:
> > On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> > > On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
> > > > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > > > Also, I'm concerned that allowing to change any GUC parameters during
> > > > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > > > parameters could be changed during vacuum/analyze.
> > > >
> > > > What kind of scenario are you thinking of?
> > >
> > > For example, I guess we will need to take care of changes of
> > > maintenance_work_mem. Currently we initialize the dead tuple space at
> > > the beginning of lazy vacuum, but perhaps we would need to
> > > enlarge/shrink it based on the new value?
> >
> > I don't think we need to do anything about that initially, just because the
> > config can be changed in a more granular way, doesn't mean we have to react to
> > every change for the current operation.
>
> Perhaps we can mention in the docs that a change to maintenance_work_mem
> will not take effect in the middle of vacuuming a table. But, Ithink it probably
> isn't needed.

Agreed.

>
> On another topic, I've just realized that when autovacuuming we only
> update tab->at_vacuum_cost_delay/limit from
> autovacuum_vacuum_cost_delay/limit for each table (in
> table_recheck_autovac()) and then use that to update
> MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> So, even if we reload the config file in vacuum_delay_point(), if we
> don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost() that updates that value of active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

> I started writing a little helper that could be used to update these
> workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

>  but I notice
> when they are first set, we consider the autovacuum table options. So,
> I suppose I would need to consider these when updating
> wi_cost_delay/limit later as well? (during vacuum_delay_point() or
> in AutoVacuumUpdateDelay())
>
> I wasn't quite sure because I found these chained ternaries rather
> difficult to interpret, but I think table_recheck_autovac() is saying
> that the autovacuum table options override all other values for
> vac_cost_delay?
>
>         vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
>             ? avopts->vacuum_cost_delay
>             : (autovacuum_vac_cost_delay >= 0)
>             ? autovacuum_vac_cost_delay
>             : VacuumCostDelay;
>
> i.e. this?
>
>   if (avopts && avopts->vacuum_cost_delay >= 0)
>     vac_cost_delay = avopts->vacuum_cost_delay;
>   else if (autovacuum_vac_cost_delay >= 0)
>     vac_cost_delay = autovacuum_vacuum_cost_delay;
>   else
>     vac_cost_delay = VacuumCostDelay

Yes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),

       /*
        * If any of the cost delay parameters has been set individually for
        * this table, disable the balancing algorithm.
        */
       tab->at_dobalance =
           !(avopts && (avopts->vacuum_cost_limit > 0 ||
                        avopts->vacuum_cost_delay > 0));

So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [BUG] pg_stat_statements and extended query protocol
Next
From: "Drouvot, Bertrand"
Date:
Subject: Re: Generate pg_stat_get_xact*() functions with Macros