Thread: Should vacuum process config file reload more often

Should vacuum process config file reload more often

From
Melanie Plageman
Date:
Hi,

Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].

Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.

Since vacuum_delay_point() is also called by analyze and we do not want
to reload the configuration file if we are in a user transaction, I
widened the scope of the in_outer_xact variable in vacuum() and allowed
analyze in a user transaction to default to the current configuration
file reload cadence in PostgresMain().
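
A minimal sketch of the idea, assuming we simply skip the reload while
inside a user transaction (the actual patch differs in the details):

    /* hypothetical placement, at the top of vacuum_delay_point() */
    if (ConfigReloadPending && !vac_in_outer_xact)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }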

I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.

Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.

- Melanie

[1]
https://www.postgresql.org/message-id/flat/22CA91B4-D341-4075-BD3C-4BAB52AF1E80%40amazon.com#37f05e33d2ce43680f96332fa1c0f3d4

Attachment

Re: Should vacuum process config file reload more often

From
Pavel Borisov
Date:
Hi, Melanie!

On Fri, 24 Feb 2023 at 02:08, Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Hi,
>
> Users may wish to speed up long-running vacuum of a large table by
> decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> config file is only reloaded between tables (for autovacuum) or after
> the statement (for explicit vacuum). This has been brought up for
> autovacuum in [1].
>
> Andres suggested that it might be possible to check ConfigReloadPending
> in vacuum_delay_point(), so I thought I would draft a rough patch and
> start a discussion.
>
> Since vacuum_delay_point() is also called by analyze and we do not want
> to reload the configuration file if we are in a user transaction, I
> widened the scope of the in_outer_xact variable in vacuum() and allowed
> analyze in a user transaction to default to the current configuration
> file reload cadence in PostgresMain().
>
> I don't think I can set and leave vac_in_outer_xact the way I am doing
> it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
> which I believe is reachable from codepaths that would not have called
> vacuum(). It seems that if a backend sets it, the outer transaction
> commits, and then the backend ends up calling vacuum_delay_point() in a
> different way later, it wouldn't be quite right.
>
> Apart from this, one higher level question I have is if there are other
> gucs whose modification would make reloading the configuration file
> during vacuum/analyze unsafe.

I have a couple of small questions:
Can this patch also read the current GUC value if it's modified by the
SET command, without editing config file?
What happens if we modify the config file with mistakes? (When we try to
start the cluster with an erroneous config file it will fail to start,
but I'm not sure about re-reading the config.)

Overall the proposal seems legit and useful.

Kind regards,
Pavel Borisov



Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
Hi,

On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Hi,
>
> Users may wish to speed up long-running vacuum of a large table by
> decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> config file is only reloaded between tables (for autovacuum) or after
> the statement (for explicit vacuum). This has been brought up for
> autovacuum in [1].
>
> Andres suggested that it might be possible to check ConfigReloadPending
> in vacuum_delay_point(), so I thought I would draft a rough patch and
> start a discussion.

In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.

> Apart from this, one higher level question I have is if there are other
> gucs whose modification would make reloading the configuration file
> during vacuum/analyze unsafe.

As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions. Also,
I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any
GUC parameters could be changed during vacuum/analyze. I guess it
would be better to apply the parameter changes for only vacuum delay
related parameters. For example, autovacuum launcher advertises the
values of the vacuum delay parameters on the shared memory not only
for autovacuum processes but also for manual vacuum/analyze processes.
Both processes can update them accordingly in vacuum_delay_point().
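
A rough sketch of the sort of thing I mean (names are hypothetical, not
existing code):

    /* advertised in shared memory by the autovacuum launcher */
    typedef struct VacuumDelayShared
    {
        int         cost_limit;    /* from autovacuum_vacuum_cost_limit/vacuum_cost_limit */
        double      cost_delay;    /* from autovacuum_vacuum_cost_delay/vacuum_cost_delay */
    } VacuumDelayShared;

Both autovacuum workers and manual vacuum/analyze backends could then copy
these into VacuumCostLimit/VacuumCostDelay in vacuum_delay_point().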

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Andres Freund
Date:
Hi,

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> As far as I know there are not such GUC parameters in the core but
> there might be in third-party table AM and index AM extensions.

We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.


> Also, I'm concerned that allowing to change any GUC parameters during
> vacuum/analyze could be a foot-gun in the future. When modifying
> vacuum/analyze-related codes, we have to consider the case where any GUC
> parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?


> I guess it would be better to apply the parameter changes for only vacuum
> delay related parameters. For example, autovacuum launcher advertises the
> values of the vacuum delay parameters on the shared memory not only for
> autovacuum processes but also for manual vacuum/analyze processes.  Both
> processes can update them accordingly in vacuum_delay_point().

I don't think this is a good idea. It'd introduce a fair amount of complexity
without, as far as I can tell, a benefit.

Greetings,

Andres Freund



Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > As far as I know there are not such GUC parameters in the core but
> > there might be in third-party table AM and index AM extensions.
>
> We already reload in a pretty broad range of situations, so I'm not sure
> there's a lot that could be unsafe that isn't already.
>
>
> > Also, I'm concerned that allowing to change any GUC parameters during
> > vacuum/analyze could be a foot-gun in the future. When modifying
> > vacuum/analyze-related codes, we have to consider the case where any GUC
> > parameters could be changed during vacuum/analyze.
>
> What kind of scenario are you thinking of?

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
Thanks for the feedback and questions, Pavel!

On Fri, Feb 24, 2023 at 3:43 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> I have a couple of small questions:
> Can this patch also read the current GUC value if it's modified by the
> SET command, without editing config file?

If a user sets a guc like vacuum_cost_limit with SET, this only modifies
the value for that session. That wouldn't affect the in-progress vacuum
you initiated from that session because you would have to wait for the
vacuum to complete before issuing the SET command.

> What happens if we modify the config file with mistakes? (When we try to
> start the cluster with an erroneous config file it will fail to start,
> but I'm not sure about re-reading the config.)

If you manually add an invalid value to your postgresql.conf, when it is
reloaded, the existing value will remain unchanged and an error will be
logged. If you attempt to change the guc value to an invalid value with
ALTER SYSTEM, the ALTER SYSTEM command will fail and the existing value
will remain unchanged.

- Melanie



Re: Should vacuum process config file reload more often

From
Andres Freund
Date:
Hi,

On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > As far as I know there are not such GUC parameters in the core but
> > > there might be in third-party table AM and index AM extensions.
> >
> > We already reload in a pretty broad range of situations, so I'm not sure
> > there's a lot that could be unsafe that isn't already.
> >
> >
> > > Also, I'm concerned that allowing to change any GUC parameters during
> > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > parameters could be changed during vacuum/analyze.
> >
> > What kind of scenario are you thinking of?
> 
> For example, I guess we will need to take care of changes of
> maintenance_work_mem. Currently we initialize the dead tuple space at
> the beginning of lazy vacuum, but perhaps we would need to
> enlarge/shrink it based on the new value?

I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.

Greetings,

Andres Freund



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > Users may wish to speed up long-running vacuum of a large table by
> > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > config file is only reloaded between tables (for autovacuum) or after
> > the statement (for explicit vacuum). This has been brought up for
> > autovacuum in [1].
> >
> > Andres suggested that it might be possible to check ConfigReloadPending
> > in vacuum_delay_point(), so I thought I would draft a rough patch and
> > start a discussion.
>
> In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.

Yes, good point. Thank you!

On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I don't think I can set and leave vac_in_outer_xact the way I am doing
> it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
> which I believe is reachable from codepaths that would not have called
> vacuum(). It seems that if a backend sets it, the outer transaction
> commits, and then the backend ends up calling vacuum_delay_point() in a
> different way later, it wouldn't be quite right.

Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
section in vacuum() to avoid this problem.
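
Roughly (eliding the rest of the existing cleanup in that block):

    PG_FINALLY();
    {
        in_vacuum = false;
        /* hypothetical: clear it so unrelated later calls to
         * vacuum_delay_point() don't see a stale value */
        vac_in_outer_xact = false;
        /* ... existing cost/delay cleanup ... */
    }
    PG_END_TRY();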

On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> > On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > > Also, I'm concerned that allowing to change any GUC parameters during
> > > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > > parameters could be changed during vacuum/analyze.
> > >
> > > What kind of scenario are you thinking of?
> >
> > For example, I guess we will need to take care of changes of
> > maintenance_work_mem. Currently we initialize the dead tuple space at
> > the beginning of lazy vacuum, but perhaps we would need to
> > enlarge/shrink it based on the new value?
>
> I don't think we need to do anything about that initially, just because the
> config can be changed in a more granular way, doesn't mean we have to react to
> every change for the current operation.

Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But, I think it probably
isn't needed.

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.
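
For reference, the worker side of that chain in master is roughly:

    void
    AutoVacuumUpdateDelay(void)
    {
        if (MyWorkerInfo)
        {
            VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
            VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
        }
    }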

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(), but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())

I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?

        vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
            ? avopts->vacuum_cost_delay
            : (autovacuum_vac_cost_delay >= 0)
            ? autovacuum_vac_cost_delay
            : VacuumCostDelay;

i.e. this?

  if (avopts && avopts->vacuum_cost_delay >= 0)
    vac_cost_delay = avopts->vacuum_cost_delay;
  else if (autovacuum_vac_cost_delay >= 0)
    vac_cost_delay = autovacuum_vac_cost_delay;
  else
    vac_cost_delay = VacuumCostDelay;

- Melanie



Re: Should vacuum process config file reload more often

From
Amit Kapila
Date:
On Thu, Mar 2, 2023 at 5:45 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> > On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > > As far as I know there are not such GUC parameters in the core but
> > > > there might be in third-party table AM and index AM extensions.
> > >
> > > We already reload in a pretty broad range of situations, so I'm not sure
> > > there's a lot that could be unsafe that isn't already.
> > >
> > >
> > > > Also, I'm concerned that allowing to change any GUC parameters during
> > > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > > parameters could be changed during vacuum/analyze.
> > >
> > > What kind of scenario are you thinking of?
> >
> > For example, I guess we will need to take care of changes of
> > maintenance_work_mem. Currently we initialize the dead tuple space at
> > the beginning of lazy vacuum, but perhaps we would need to
> > enlarge/shrink it based on the new value?
>
> I don't think we need to do anything about that initially, just because the
> config can be changed in a more granular way, doesn't mean we have to react to
> every change for the current operation.
>

+1. I also don't see the need to do anything for this case.

--
With Regards,
Amit Kapila.



Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > Users may wish to speed up long-running vacuum of a large table by
> > > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > > config file is only reloaded between tables (for autovacuum) or after
> > > the statement (for explicit vacuum). This has been brought up for
> > > autovacuum in [1].
> > >
> > > Andres suggested that it might be possible to check ConfigReloadPending
> > > in vacuum_delay_point(), so I thought I would draft a rough patch and
> > > start a discussion.
> >
> > In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
>
> Yes, good point. Thank you!
>
> On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I don't think I can set and leave vac_in_outer_xact the way I am doing
> > it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
> > which I believe is reachable from codepaths that would not have called
> > vacuum(). It seems that if a backend sets it, the outer transaction
> > commits, and then the backend ends up calling vacuum_delay_point() in a
> > different way later, it wouldn't be quite right.
>
> Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
> section in vacuum() to avoid this problem.
>
> On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:
> > On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> > > On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
> > > > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > > > Also, I'm concerned that allowing to change any GUC parameters during
> > > > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > > > parameters could be changed during vacuum/analyze.
> > > >
> > > > What kind of scenario are you thinking of?
> > >
> > > For example, I guess we will need to take care of changes of
> > > maintenance_work_mem. Currently we initialize the dead tuple space at
> > > the beginning of lazy vacuum, but perhaps we would need to
> > > enlarge/shrink it based on the new value?
> >
> > I don't think we need to do anything about that initially, just because the
> > config can be changed in a more granular way, doesn't mean we have to react to
> > every change for the current operation.
>
> Perhaps we can mention in the docs that a change to maintenance_work_mem
> will not take effect in the middle of vacuuming a table. But, I think it probably
> isn't needed.

Agreed.

>
> On another topic, I've just realized that when autovacuuming we only
> update tab->at_vacuum_cost_delay/limit from
> autovacuum_vacuum_cost_delay/limit for each table (in
> table_recheck_autovac()) and then use that to update
> MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> So, even if we reload the config file in vacuum_delay_point(), if we
> don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost(), which updates that value for the active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

> I started writing a little helper that could be used to update these
> workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

>  but I notice
> when they are first set, we consider the autovacuum table options. So,
> I suppose I would need to consider these when updating
> wi_cost_delay/limit later as well? (during vacuum_delay_point() or
> in AutoVacuumUpdateDelay())
>
> I wasn't quite sure because I found these chained ternaries rather
> difficult to interpret, but I think table_recheck_autovac() is saying
> that the autovacuum table options override all other values for
> vac_cost_delay?
>
>         vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
>             ? avopts->vacuum_cost_delay
>             : (autovacuum_vac_cost_delay >= 0)
>             ? autovacuum_vac_cost_delay
>             : VacuumCostDelay;
>
> i.e. this?
>
>   if (avopts && avopts->vacuum_cost_delay >= 0)
>     vac_cost_delay = avopts->vacuum_cost_delay;
>   else if (autovacuum_vac_cost_delay >= 0)
>     vac_cost_delay = autovacuum_vac_cost_delay;
>   else
>     vac_cost_delay = VacuumCostDelay;

Yes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),

       /*
        * If any of the cost delay parameters has been set individually for
        * this table, disable the balancing algorithm.
        */
       tab->at_dobalance =
           !(avopts && (avopts->vacuum_cost_limit > 0 ||
                        avopts->vacuum_cost_delay > 0));

So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > On another topic, I've just realized that when autovacuuming we only
> > update tab->at_vacuum_cost_delay/limit from
> > autovacuum_vacuum_cost_delay/limit for each table (in
> > table_recheck_autovac()) and then use that to update
> > MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> > what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> > So, even if we reload the config file in vacuum_delay_point(), if we
> > don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> > have no effect for autovacuum.
>
> Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
> updated. After the autovacuum launcher reloads the config file, it
> calls autovac_balance_cost() that updates that value of active
> workers. I'm not sure why we don't update workers' wi_cost_delay,
> though.

Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.
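
In other words, a sketch of the addition to the per-worker loop in
autovac_balance_cost() would be a single line (using the vac_cost_delay
value it already computes from the GUCs):

    /* hypothetical: keep each worker's delay in sync with the GUCs */
    worker->wi_cost_delay = vac_cost_delay;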

> > I started writing a little helper that could be used to update these
> > workerinfo->wi_cost_delay/limit in vacuum_delay_point(),
>
> Since we set vacuum delay parameters for autovacuum workers so that we
> ration out I/O equally, I think we should keep the current mechanism
> that the autovacuum launcher sets workers' delay parameters and they
> update accordingly.

Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
point to its only purpose being to balance the total of the workers'
wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.

Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid the
overhead of reloading the config file)?
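
That is, something like this condition in vacuum_delay_point()
(hypothetical; IsAutoVacuumWorkerProcess() is just to illustrate
excluding autovacuum workers):

    if (ConfigReloadPending && !IsAutoVacuumWorkerProcess() &&
        !vac_in_outer_xact)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }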

The lifecycle of these different vacuum delay-related gucs and how it
differs between autovacuum workers and explicit vacuum is quite tangled
already, though.

> >  but I notice
> > when they are first set, we consider the autovacuum table options. So,
> > I suppose I would need to consider these when updating
> > wi_cost_delay/limit later as well? (during vacuum_delay_point() or
> > in AutoVacuumUpdateDelay())
> >
> > I wasn't quite sure because I found these chained ternaries rather
> > difficult to interpret, but I think table_recheck_autovac() is saying
> > that the autovacuum table options override all other values for
> > vac_cost_delay?
> >
> >         vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
> >             ? avopts->vacuum_cost_delay
> >             : (autovacuum_vac_cost_delay >= 0)
> >             ? autovacuum_vac_cost_delay
> >             : VacuumCostDelay;
> >
> > i.e. this?
> >
> >   if (avopts && avopts->vacuum_cost_delay >= 0)
> >     vac_cost_delay = avopts->vacuum_cost_delay;
> >   else if (autovacuum_vac_cost_delay >= 0)
> >     vac_cost_delay = autovacuum_vac_cost_delay;
> >   else
> >     vac_cost_delay = VacuumCostDelay;
>
> Yes, if the table has autovacuum table options, we use these values
> and the table is excluded from the balancing algorithm I mentioned
> above. See the code from table_recheck_autovac(),
>
>        /*
>         * If any of the cost delay parameters has been set individually for
>         * this table, disable the balancing algorithm.
>         */
>        tab->at_dobalance =
>            !(avopts && (avopts->vacuum_cost_limit > 0 ||
>                         avopts->vacuum_cost_delay > 0));
>
> So if the table has autovacuum table options, the vacuum delay
> parameters probably should be updated by ALTER TABLE, not by reloading
> the config file.

Yes, if the table has autovacuum table options, I think the user is
out-of-luck until the relation is done being vacuumed because the ALTER
TABLE will need to get a lock.

- Melanie



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > On another topic, I've just realized that when autovacuuming we only
> > > update tab->at_vacuum_cost_delay/limit from
> > > autovacuum_vacuum_cost_delay/limit for each table (in
> > > table_recheck_autovac()) and then use that to update
> > > MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> > > what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> > > So, even if we reload the config file in vacuum_delay_point(), if we
> > > don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> > > have no effect for autovacuum.
> >
> > Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
> > updated. After the autovacuum launcher reloads the config file, it
> > calls autovac_balance_cost() that updates that value of active
> > workers. I'm not sure why we don't update workers' wi_cost_delay,
> > though.
>
> Ah yes, I didn't realize this. Thanks. I went back and did more code
> reading/analysis, and I see no reason why we shouldn't update
> worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
> autovac_balance_cost(). Then, as you said, the autovac launcher will
> call autovac_balance_cost() when it reloads the configuration file.
> Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
> will update VacuumCostDelay.
>
> > > I started writing a little helper that could be used to update these
> > > workerinfo->wi_cost_delay/limit in vacuum_delay_point(),
> >
> > Since we set vacuum delay parameters for autovacuum workers so that we
> > ration out I/O equally, I think we should keep the current mechanism
> > that the autovacuum launcher sets workers' delay parameters and they
> > update accordingly.
>
> Yes, agreed, it should go in the same place as where we update
> wi_cost_limit (autovac_balance_cost()). I think we should potentially
> rename autovac_balance_cost() because its name and all its comments
> point to its only purpose being to balance the total of the workers
> wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
> autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.
>
> Though, since this change on its own would make autovacuum pick up new
> values of autovacuum_vacuum_cost_limit (without having the worker reload
> the config file), I wonder if it makes sense to try and have
> vacuum_delay_point() only reload the config file if it is an explicit
> vacuum or an analyze not being run in an outer transaction (to avoid
> overhead of reloading config file)?
>
> The lifecycle of this different vacuum delay-related gucs and how it
> differs between autovacuum workers and explicit vacuum is quite tangled
> already, though.

So, I've attached a new version of the patch which is quite different
from the previous versions.

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.
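
Concretely, the two globals are along these lines (a sketch, not the
exact patch code):

    /* true when the table being vacuumed sets a cost delay in its reloptions */
    static bool     av_use_table_option_cost_delay = false;
    /* the reloption value; only meaningful when the flag above is true */
    static double   av_table_option_cost_delay = 0;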

It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
> > > <melanieplageman@gmail.com> wrote:
> > > > On another topic, I've just realized that when autovacuuming we only
> > > > update tab->at_vacuum_cost_delay/limit from
> > > > autovacuum_vacuum_cost_delay/limit for each table (in
> > > > table_recheck_autovac()) and then use that to update
> > > > MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> > > > what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> > > > So, even if we reload the config file in vacuum_delay_point(), if we
> > > > don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> > > > have no effect for autovacuum.
> > >
> > > Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
> > > updated. After the autovacuum launcher reloads the config file, it
> > > calls autovac_balance_cost() that updates that value of active
> > > workers. I'm not sure why we don't update workers' wi_cost_delay,
> > > though.
> >
> > Ah yes, I didn't realize this. Thanks. I went back and did more code
> > reading/analysis, and I see no reason why we shouldn't update
> > worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
> > autovac_balance_cost(). Then, as you said, the autovac launcher will
> > call autovac_balance_cost() when it reloads the configuration file.
> > Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
> > will update VacuumCostDelay.
> >
> > > > I started writing a little helper that could be used to update these
> > > > workerinfo->wi_cost_delay/limit in vacuum_delay_point(),
> > >
> > > Since we set vacuum delay parameters for autovacuum workers so that we
> > > ration out I/O equally, I think we should keep the current mechanism
> > > that the autovacuum launcher sets workers' delay parameters and they
> > > update accordingly.
> >
> > Yes, agreed, it should go in the same place as where we update
> > wi_cost_limit (autovac_balance_cost()). I think we should potentially
> > rename autovac_balance_cost() because its name and all its comments
> > point to its only purpose being to balance the total of the workers
> > wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
> > autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.
> >
> > Though, since this change on its own would make autovacuum pick up new
> > values of autovacuum_vacuum_cost_limit (without having the worker reload
> > the config file), I wonder if it makes sense to try and have
> > vacuum_delay_point() only reload the config file if it is an explicit
> > vacuum or an analyze not being run in an outer transaction (to avoid
> > overhead of reloading config file)?
> >
> > The lifecycle of this different vacuum delay-related gucs and how it
> > differs between autovacuum workers and explicit vacuum is quite tangled
> > already, though.
>
> So, I've attached a new version of the patch which is quite different
> from the previous versions.

Thank you for updating the patch!

>
> In this version I've removed wi_cost_delay from WorkerInfoData. There is
> no synchronization of cost_delay amongst workers, so there is no reason
> to keep it in shared memory.
>
> One consequence of not updating VacuumCostDelay from wi_cost_delay is
> that we have to have a way to keep track of whether or not autovacuum
> table options are in use.
>
> This patch does this in a cringeworthy way. I added two global
> variables, one to track whether or not cost delay table options are in
> use and the other to store the value of the table option cost delay. I
> didn't want to use a single variable with a special value to indicate
> that table option cost delay is in use because
> autovacuum_vacuum_cost_delay already has special values that mean
> certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes of shared memory per
autovacuum worker, but it might be worth considering keeping
wi_cost_delay for simplicity.

---
 void
 AutoVacuumUpdateDelay(void)
 {
-        if (MyWorkerInfo)
+        /*
+         * We are using autovacuum-related GUCs to update VacuumCostDelay, so
+         * we only want autovacuum workers and autovacuum launcher to do this.
+         */
+        if (!(am_autovacuum_worker || am_autovacuum_launcher))
+                return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

---
In at autovac_balance_cost(), we have,

    int         vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
                                  autovacuum_vac_cost_limit : VacuumCostLimit);
    double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
                                  autovacuum_vac_cost_delay : VacuumCostDelay);
    :
    /* not set? nothing to do */
    if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
        return;

IIUC if autovacuum_vac_cost_delay is changed to 0 while autovacuums are
running, their vacuum delay parameters are not changed. It's not a bug
in the patch but I think we can fix it in this patch.

>
> It is worth mentioning that I think that in master,
> AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
> wi_cost_delay from shared memory without holding a lock.

Indeed.

> I've added in a shared lock for reading from wi_cost_limit in this
> patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> vacuum_delay_point(), which is called quite often (per block-ish), so I
> was trying to think if there is a way we could avoid having to check
> this shared memory variable on every call to vacuum_delay_point().
> Rebalances shouldn't happen very often (done by the launcher when a new
> worker is launched and by workers between vacuuming tables). Maybe we
> can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.
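
A sketch of that idea (field name hypothetical):

    /* in WorkerInfoData */
    sig_atomic_t    wi_cost_changed;    /* set by autovac_balance_cost() */

    /* in the worker, e.g. from vacuum_delay_point() */
    if (MyWorkerInfo && MyWorkerInfo->wi_cost_changed)
    {
        MyWorkerInfo->wi_cost_changed = false;
        LWLockAcquire(AutovacuumLock, LW_SHARED);
        VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
        LWLockRelease(AutovacuumLock);
    }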

>
> Also not sure how the patch interacts with failsafe autovac and parallel
> vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.
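
For example (flag name hypothetical):

    bool    VacuumFailsafeActive = false;   /* set when the failsafe triggers */

    /* in vacuum_delay_point(), after processing a config file reload */
    if (!VacuumFailsafeActive)
        VacuumCostActive = (VacuumCostDelay > 0);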

As far as I can see we don't need special treatment for parallel
vacuum since it works only in manual vacuum. It calculates the
sleep time based on the shared cost balance and how much I/O the worker
did, but the basic mechanism is the same as in the non-parallel case.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Jim Nasby
Date:

On 3/2/23 1:36 AM, Masahiko Sawada wrote:

> > > > For example, I guess we will need to take care of changes of
> > > > maintenance_work_mem. Currently we initialize the dead tuple space at
> > > > the beginning of lazy vacuum, but perhaps we would need to
> > > > enlarge/shrink it based on the new value?

Doesn't the dead tuple space grow as needed? Last I looked we don't allocate up to 1GB right off the bat.

> > > I don't think we need to do anything about that initially, just because the
> > > config can be changed in a more granular way, doesn't mean we have to react to
> > > every change for the current operation.

> > Perhaps we can mention in the docs that a change to maintenance_work_mem
> > will not take effect in the middle of vacuuming a table. But, I think it probably
> > isn't needed.

> Agreed.

I disagree that there's no need for this. Sure, if maintenance_work_mem is 10MB then it's no big deal to just abandon your current vacuum and start a new one, but the index vacuuming phase with maintenance_work_mem set to say 500MB can take quite a while. Forcing a user to either suck it up or throw everything in the phase away isn't terribly good.

Of course, if the patch that eliminates the 1GB vacuum limit gets committed the situation will be even worse.

While it'd be nice to also honor maintenance_work_mem getting set lower, I don't see any need to go through heroics to accomplish that. Simply recording the change and honoring it for future attempts to grow the memory and on future passes through the heap would be plenty.

All that said, don't let these suggestions get in the way of committing this. Just having the ability to tweak cost parameters would be a win.

Re: Should vacuum process config file reload more often

From
Andres Freund
Date:
Hi,

On 2023-03-08 11:42:31 -0600, Jim Nasby wrote:
> On 3/2/23 1:36 AM, Masahiko Sawada wrote:
> 
> > > > > For example, I guess we will need to take care of changes of
> > > > > maintenance_work_mem. Currently we initialize the dead tuple space at
> > > > > the beginning of lazy vacuum, but perhaps we would need to
> > > > > enlarge/shrink it based on the new value?
> Doesn't the dead tuple space grow as needed? Last I looked we don't allocate
> up to 1GB right off the bat.
> > > > I don't think we need to do anything about that initially, just because the
> > > > config can be changed in a more granular way, doesn't mean we have to react to
> > > > every change for the current operation.
> > > Perhaps we can mention in the docs that a change to maintenance_work_mem
> > > will not take effect in the middle of vacuuming a table. But, I think it probably
> > > isn't needed.
> > Agreed.
> 
> I disagree that there's no need for this. Sure, if maintenance_work_mem
> is 10MB then it's no big deal to just abandon your current vacuum and start
> a new one, but the index vacuuming phase with maintenance_work_mem set to
> say 500MB can take quite a while. Forcing a user to either suck it up or
> throw everything in the phase away isn't terribly good.
> 
> Of course, if the patch that eliminates the 1GB vacuum limit gets committed
> the situation will be even worse.
> 
> While it'd be nice to also honor maintenance_work_mem getting set lower, I
> don't see any need to go through heroics to accomplish that. Simply
> recording the change and honoring it for future attempts to grow the memory
> and on future passes through the heap would be plenty.
> 
> All that said, don't let these suggestions get in the way of committing
> this. Just having the ability to tweak cost parameters would be a win.

Nobody said anything about it not being useful to react to m_w_m changes, just
that it's not required to make some progress. So I really don't understand
what the point of your comment is.

Greetings,

Andres Freund



Re: Should vacuum process config file reload more often

From
John Naylor
Date:

On Thu, Mar 9, 2023 at 12:42 AM Jim Nasby <nasbyj@amazon.com> wrote:
>
> Doesn't the dead tuple space grow as needed? Last I looked we don't allocate up to 1GB right off the bat.

Incorrect.

> Of course, if the patch that eliminates the 1GB vacuum limit gets committed the situation will be even worse.

If you're referring to the proposed tid store, I'd be interested in seeing a reproducible test case with a m_w_m over 1GB where it makes things worse than the current state of affairs.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Thu, Mar 9, 2023 at 4:47 PM John Naylor <john.naylor@enterprisedb.com> wrote:
>
>
> On Thu, Mar 9, 2023 at 12:42 AM Jim Nasby <nasbyj@amazon.com> wrote:
> >
> > Doesn't the dead tuple space grow as needed? Last I looked we don't allocate up to 1GB right off the bat.
>
> Incorrect.
>
> > Of course, if the patch that eliminates the 1GB vacuum limit gets committed the situation will be even worse.
>
> If you're referring to the proposed tid store, I'd be interested in seeing a reproducible test case with a m_w_m over
> 1GB where it makes things worse than the current state of affairs.

And I think that the tidstore makes it easy to react to
maintenance_work_mem changes. We don't need to enlarge it; we can just
update its memory limit at an appropriate time.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> > In this version I've removed wi_cost_delay from WorkerInfoData. There is
> > no synchronization of cost_delay amongst workers, so there is no reason
> > to keep it in shared memory.
> >
> > One consequence of not updating VacuumCostDelay from wi_cost_delay is
> > that we have to have a way to keep track of whether or not autovacuum
> > table options are in use.
> >
> > This patch does this in a cringeworthy way. I added two global
> > variables, one to track whether or not cost delay table options are in
> > use and the other to store the value of the table option cost delay. I
> > didn't want to use a single variable with a special value to indicate
> > that table option cost delay is in use because
> > autovacuum_vacuum_cost_delay already has special values that mean
> > certain things. My code needs a better solution.
>
> While it's true that wi_cost_delay doesn't need to be shared, it seems
> to make the logic somewhat complex. We need to handle cost_delay in a
> different way from other vacuum-related parameters and we need to make
> sure av[_use]_table_option_cost_delay are set properly. Removing
> wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
> autovacuum worker but it might be worth considering to keep
> wi_cost_delay for simplicity.

Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().

However, wi_cost_delay is a double, so if we start updating it on config
reload in vacuum_delay_point(), we definitely need some protection
against torn reads.

The table options can only change when workers start vacuuming a new
table, so maybe there is some way to use this to solve this problem?

> > It is worth mentioning that I think that in master,
> > AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
> > wi_cost_delay from shared memory without holding a lock.
>
> Indeed.
>
> > I've added in a shared lock for reading from wi_cost_limit in this
> > patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> > vacuum_delay_point(), which is called quite often (per block-ish), so I
> > was trying to think if there is a way we could avoid having to check
> > this shared memory variable on every call to vacuum_delay_point().
> > Rebalances shouldn't happen very often (done by the launcher when a new
> > worker is launched and by workers between vacuuming tables). Maybe we
> > can read from it less frequently?
>
> Yeah, acquiring the lwlock for every call to vacuum_delay_point()
> seems to be harmful. One idea would be to have one sig_atomic_t
> variable in WorkerInfoData and autovac_balance_cost() set it to true
> after rebalancing the worker's cost-limit. The worker can check it
> without locking and update its delay parameters if the flag is true.

Maybe we can do something like this with the table options values?

- Melanie



Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> > > In this version I've removed wi_cost_delay from WorkerInfoData. There is
> > > no synchronization of cost_delay amongst workers, so there is no reason
> > > to keep it in shared memory.
> > >
> > > One consequence of not updating VacuumCostDelay from wi_cost_delay is
> > > that we have to have a way to keep track of whether or not autovacuum
> > > table options are in use.
> > >
> > > This patch does this in a cringeworthy way. I added two global
> > > variables, one to track whether or not cost delay table options are in
> > > use and the other to store the value of the table option cost delay. I
> > > didn't want to use a single variable with a special value to indicate
> > > that table option cost delay is in use because
> > > autovacuum_vacuum_cost_delay already has special values that mean
> > > certain things. My code needs a better solution.
> >
> > While it's true that wi_cost_delay doesn't need to be shared, it seems
> > to make the logic somewhat complex. We need to handle cost_delay in a
> > different way from other vacuum-related parameters and we need to make
> > sure av[_use]_table_option_cost_delay are set properly. Removing
> > wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
> > autovacuum worker but it might be worth considering to keep
> > wi_cost_delay for simplicity.
>
> Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
> anyway because the launcher doesn't know anything about table options
> and so the workers have to keep an updated wi_cost_delay that the
> launcher or other autovac workers who are not vacuuming that table can
> read from when calculating the new limit in autovac_balance_cost().

IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.

>
> However, wi_cost_delay is a double, so if we start updating it on config
> reload in vacuum_delay_point(), we definitely need some protection
> against torn reads.
>
> The table options can only change when workers start vacuuming a new
> table, so maybe there is some way to use this to solve this problem?
>
> > > It is worth mentioning that I think that in master,
> > > AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
> > > wi_cost_delay from shared memory without holding a lock.
> >
> > Indeed.
> >
> > > I've added in a shared lock for reading from wi_cost_limit in this
> > > patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> > > vacuum_delay_point(), which is called quite often (per block-ish), so I
> > > was trying to think if there is a way we could avoid having to check
> > > this shared memory variable on every call to vacuum_delay_point().
> > > Rebalances shouldn't happen very often (done by the launcher when a new
> > > worker is launched and by workers between vacuuming tables). Maybe we
> > > can read from it less frequently?
> >
> > Yeah, acquiring the lwlock for every call to vacuum_delay_point()
> > seems to be harmful. One idea would be to have one sig_atomic_t
> > variable in WorkerInfoData and autovac_balance_cost() set it to true
> > after rebalancing the worker's cost-limit. The worker can check it
> > without locking and update its delay parameters if the flag is true.
>
> Maybe we can do something like this with the table options values?

Since an autovacuum that uses any of table option cost delay
parameters is excluded from the balancing algorithm, the launcher
doesn't need to notify such workers of changes of the cost-limit, no?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
Quotes below are combined from two of Sawada-san's emails.

I've also attached a patch with my suggested current version.

On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
> > > <melanieplageman@gmail.com> wrote:
> > > >
> > > > On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> > > > In this version I've removed wi_cost_delay from WorkerInfoData. There is
> > > > no synchronization of cost_delay amongst workers, so there is no reason
> > > > to keep it in shared memory.
> > > >
> > > > One consequence of not updating VacuumCostDelay from wi_cost_delay is
> > > > that we have to have a way to keep track of whether or not autovacuum
> > > > table options are in use.
> > > >
> > > > This patch does this in a cringeworthy way. I added two global
> > > > variables, one to track whether or not cost delay table options are in
> > > > use and the other to store the value of the table option cost delay. I
> > > > didn't want to use a single variable with a special value to indicate
> > > > that table option cost delay is in use because
> > > > autovacuum_vacuum_cost_delay already has special values that mean
> > > > certain things. My code needs a better solution.
> > >
> > > While it's true that wi_cost_delay doesn't need to be shared, it seems
> > > to make the logic somewhat complex. We need to handle cost_delay in a
> > > different way from other vacuum-related parameters and we need to make
> > > sure av[_use]_table_option_cost_delay are set properly. Removing
> > > wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
> > > autovacuum worker but it might be worth considering to keep
> > > wi_cost_delay for simplicity.
> >
> > Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
> > anyway because the launcher doesn't know anything about table options
> > and so the workers have to keep an updated wi_cost_delay that the
> > launcher or other autovac workers who are not vacuuming that table can
> > read from when calculating the new limit in autovac_balance_cost().
>
> IIUC if any of the cost delay parameters has been set individually,
> the autovacuum worker is excluded from the balance algorithm.

Ah, yes! That's right. So it is not a problem. Then I still think
removing wi_cost_delay from the worker info makes sense. wi_cost_delay
is a double and can't easily be accessed atomically the way
wi_cost_limit can be.

Keeping the cost delay local to the backends also makes it clear that
cost delay is not something that should be written to by other backends
or that can differ from worker to worker. Without table options in the
picture, the cost delay should be the same for any worker who has
reloaded the config file.

As for the cost limit safe access issue, maybe we can avoid a LWLock
acquisition for reading wi_cost_limit by using an atomic similar to what
you suggested here for "did_rebalance".

> > I've added in a shared lock for reading from wi_cost_limit in this
> > patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> > vacuum_delay_point(), which is called quite often (per block-ish), so I
> > was trying to think if there is a way we could avoid having to check
> > this shared memory variable on every call to vacuum_delay_point().
> > Rebalances shouldn't happen very often (done by the launcher when a new
> > worker is launched and by workers between vacuuming tables). Maybe we
> > can read from it less frequently?
>
> Yeah, acquiring the lwlock for every call to vacuum_delay_point()
> seems to be harmful. One idea would be to have one sig_atomic_t
> variable in WorkerInfoData and autovac_balance_cost() set it to true
> after rebalancing the worker's cost-limit. The worker can check it
> without locking and update its delay parameters if the flag is true.

Instead of having the atomic indicate whether or not someone (launcher
or another worker) did a rebalance, it would simply store the current
cost limit. Then the worker can normally access it with a simple read.

My rationale is that if we used an atomic to indicate whether or not we
did a rebalance ("did_rebalance"), we would have the same cache
coherency guarantees as if we just used the atomic for the cost limit.
If we read from the "did_rebalance" variable and missed someone having
written to it on another core, we still wouldn't get around to checking
the wi_cost_limit variable in shared memory, so it doesn't matter that
we bothered to keep it in shared memory and use a lock to access it.

I noticed we don't allow wi_cost_limit to ever be less than 0, so we
could store wi_cost_limit in an atomic uint32.

I'm not sure if it is okay to do pg_atomic_read_u32() and
pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
most cases.
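
For illustration, the pair would look something like this (whether the
unlocked write variant is sufficient is exactly what I'm unsure about):

    /* in WorkerInfoData, wi_cost_limit becomes */
    pg_atomic_uint32    wi_cost_limit;

    /* writer, e.g. in autovac_balance_cost() */
    pg_atomic_write_u32(&worker->wi_cost_limit, (uint32) limit);

    /* reader, in AutoVacuumUpdateLimit(), without taking AutovacuumLock */
    VacuumCostLimit = (int) pg_atomic_read_u32(&MyWorkerInfo->wi_cost_limit);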

I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...

If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.

One potential option to avoid taking a shared lock on every call to
AutoVacuumUpdateDelay() is to set a global variable to indicate that we
did update it (since we are the only ones updating it) and then only
take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.
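
Something along these lines (just a sketch; "my_cost_delay_updated" is a
hypothetical backend-local flag, not anything in the attached patch):

    /* set by this worker whenever it writes its own wi_cost_delay */
    static bool my_cost_delay_updated = false;

    void
    AutoVacuumUpdateDelay(void)
    {
        if (!MyWorkerInfo || !my_cost_delay_updated)
            return;

        LWLockAcquire(AutovacuumLock, LW_SHARED);
        VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
        LWLockRelease(AutovacuumLock);

        my_cost_delay_updated = false;
    }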

> ---
>  void
>  AutoVacuumUpdateDelay(void)
>  {
> -        if (MyWorkerInfo)
> +        /*
> +         * We are using autovacuum-related GUCs to update
> VacuumCostDelay, so we
> +         * only want autovacuum workers and autovacuum launcher to do this.
> +         */
> +        if (!(am_autovacuum_worker || am_autovacuum_launcher))
> +                return;
>
> Is there any case where the autovacuum launcher calls
> AutoVacuumUpdateDelay() function?

I had meant to add it to HandleAutoVacLauncherInterrupts() after
reloading the config file (done in attached patch). When using the
global variables for cost delay (instead of wi_cost_delay in worker
info), the autovac launcher also has to do the check in the else branch
of AutoVacuumUpdateDelay()

        VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
            autovacuum_vac_cost_delay : VacuumCostDelay;

to make sure VacuumCostDelay is correct for when it calls
autovac_balance_cost().

This also made me think about whether or not we still need cost_limit_base.
It is used to ensure that autovac_balance_cost() never ends up setting
workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
(or VacuumCostLimit). However, the launcher and all the workers should
know what the value is without cost_limit_base, no?

> ---
> In autovac_balance_cost(), we have,
>
>     int         vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
>                                   autovacuum_vac_cost_limit : VacuumCostLimit);
>     double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
>                                   autovacuum_vac_cost_delay : VacuumCostDelay);
>     :
>     /* not set? nothing to do */
>     if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
>         return;
>
> IIUC if autovacuum_vac_cost_delay is changed to 0 during autovacuums
> running, their vacuum delay parameters are not changed. It's not a bug
> of the patch but I think we can fix it in this patch.

Yes, currently (in master) wi_cost_delay does not get updated anywhere.
In my patch, the global variable we are using for delay is updated but
it is not done in autovac_balance_cost().

> > Also not sure how the patch interacts with failsafe autovac and parallel
> > vacuum.
>
> Good point.
>
> When entering the failsafe mode, we disable the vacuum delays (see
> lazy_check_wraparound_failsafe()). We need to keep disabling the
> vacuum delays even after reloading the config file. One idea is to
> have another global variable indicating we're in the failsafe mode.
> vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Thanks again for the detailed feedback!

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Fri, Mar 10, 2023 at 6:11 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Quotes below are combined from two of Sawada-san's emails.
>
> I've also attached a patch with my suggested current version.
>
> On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
> > > > <melanieplageman@gmail.com> wrote:
> > > > >
> > > > > On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> > > > > In this version I've removed wi_cost_delay from WorkerInfoData. There is
> > > > > no synchronization of cost_delay amongst workers, so there is no reason
> > > > > to keep it in shared memory.
> > > > >
> > > > > One consequence of not updating VacuumCostDelay from wi_cost_delay is
> > > > > that we have to have a way to keep track of whether or not autovacuum
> > > > > table options are in use.
> > > > >
> > > > > This patch does this in a cringeworthy way. I added two global
> > > > > variables, one to track whether or not cost delay table options are in
> > > > > use and the other to store the value of the table option cost delay. I
> > > > > didn't want to use a single variable with a special value to indicate
> > > > > that table option cost delay is in use because
> > > > > autovacuum_vacuum_cost_delay already has special values that mean
> > > > > certain things. My code needs a better solution.
> > > >
> > > > While it's true that wi_cost_delay doesn't need to be shared, it seems
> > > > to make the logic somewhat complex. We need to handle cost_delay in a
> > > > different way from other vacuum-related parameters and we need to make
> > > > sure av[_use]_table_option_cost_delay are set properly. Removing
> > > > wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
> > > > autovacuum worker but it might be worth considering to keep
> > > > wi_cost_delay for simplicity.
> > >
> > > Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
> > > anyway because the launcher doesn't know anything about table options
> > > and so the workers have to keep an updated wi_cost_delay that the
> > > launcher or other autovac workers who are not vacuuming that table can
> > > read from when calculating the new limit in autovac_balance_cost().
> >
> > IIUC if any of the cost delay parameters has been set individually,
> > the autovacuum worker is excluded from the balance algorithm.
>
> Ah, yes! That's right. So it is not a problem. Then I still think
> removing wi_cost_delay from the worker info makes sense. wi_cost_delay
> is a double and can't easily be accessed atomically the way
> wi_cost_limit can be.
>
> Keeping the cost delay local to the backends also makes it clear that
> cost delay is not something that should be written to by other backends
> or that can differ from worker to worker. Without table options in the
> picture, the cost delay should be the same for any worker who has
> reloaded the config file.
>
> As for the cost limit safe access issue, maybe we can avoid a LWLock
> acquisition for reading wi_cost_limit by using an atomic similar to what
> you suggested here for "did_rebalance".
>
> > > I've added in a shared lock for reading from wi_cost_limit in this
> > > patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> > > vacuum_delay_point(), which is called quite often (per block-ish), so I
> > > was trying to think if there is a way we could avoid having to check
> > > this shared memory variable on every call to vacuum_delay_point().
> > > Rebalances shouldn't happen very often (done by the launcher when a new
> > > worker is launched and by workers between vacuuming tables). Maybe we
> > > can read from it less frequently?
> >
> > Yeah, acquiring the lwlock for every call to vacuum_delay_point()
> > seems to be harmful. One idea would be to have one sig_atomic_t
> > variable in WorkerInfoData and autovac_balance_cost() set it to true
> > after rebalancing the worker's cost-limit. The worker can check it
> > without locking and update its delay parameters if the flag is true.
>
> Instead of having the atomic indicate whether or not someone (launcher
> or another worker) did a rebalance, it would simply store the current
> cost limit. Then the worker can normally access it with a simple read.
>
> My rationale is that if we used an atomic to indicate whether or not we
> did a rebalance ("did_rebalance"), we would have the same cache
> coherency guarantees as if we just used the atomic for the cost limit.
> If we read from the "did_rebalance" variable and missed someone having
> written to it on another core, we still wouldn't get around to checking
> the wi_cost_limit variable in shared memory, so it doesn't matter that
> we bothered to keep it in shared memory and use a lock to access it.
>
> I noticed we don't allow wi_cost_limit to ever be less than 0, so we
> could store wi_cost_limit in an atomic uint32.
>
> I'm not sure if it is okay to do pg_atomic_read_u32() and
> pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
> most cases.
>
> I've implemented the atomic cost limit in the attached patch. Though,
> I'm pretty unsure about how I initialized the atomics in
> AutoVacuumShmemInit()...
>
> If the consensus is that it is simply too confusing to take
> wi_cost_delay out of WorkerInfo, we might be able to afford using a
> shared lock to access it because we won't call AutoVacuumUpdateDelay()
> on every invocation of vacuum_delay_point() -- only when we've reloaded
> the config file.

One such implementation is attached.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Quotes below are combined from two of Sawada-san's emails.
>
> I've also attached a patch with my suggested current version.
>
> On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
> > > > <melanieplageman@gmail.com> wrote:
> > > > >
> > > > > On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> > > > > In this version I've removed wi_cost_delay from WorkerInfoData. There is
> > > > > no synchronization of cost_delay amongst workers, so there is no reason
> > > > > to keep it in shared memory.
> > > > >
> > > > > One consequence of not updating VacuumCostDelay from wi_cost_delay is
> > > > > that we have to have a way to keep track of whether or not autovacuum
> > > > > table options are in use.
> > > > >
> > > > > This patch does this in a cringeworthy way. I added two global
> > > > > variables, one to track whether or not cost delay table options are in
> > > > > use and the other to store the value of the table option cost delay. I
> > > > > didn't want to use a single variable with a special value to indicate
> > > > > that table option cost delay is in use because
> > > > > autovacuum_vacuum_cost_delay already has special values that mean
> > > > > certain things. My code needs a better solution.
> > > >
> > > > While it's true that wi_cost_delay doesn't need to be shared, it seems
> > > > to make the logic somewhat complex. We need to handle cost_delay in a
> > > > different way from other vacuum-related parameters and we need to make
> > > > sure av[_use]_table_option_cost_delay are set properly. Removing
> > > > wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
> > > > autovacuum worker but it might be worth considering to keep
> > > > wi_cost_delay for simplicity.
> > >
> > > Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
> > > anyway because the launcher doesn't know anything about table options
> > > and so the workers have to keep an updated wi_cost_delay that the
> > > launcher or other autovac workers who are not vacuuming that table can
> > > read from when calculating the new limit in autovac_balance_cost().
> >
> > IIUC if any of the cost delay parameters has been set individually,
> > the autovacuum worker is excluded from the balance algorithm.
>
> Ah, yes! That's right. So it is not a problem. Then I still think
> removing wi_cost_delay from the worker info makes sense. wi_cost_delay
> is a double and can't easily be accessed atomically the way
> wi_cost_limit can be.
>
> Keeping the cost delay local to the backends also makes it clear that
> cost delay is not something that should be written to by other backends
> or that can differ from worker to worker. Without table options in the
> picture, the cost delay should be the same for any worker who has
> reloaded the config file.

Agreed.

>
> As for the cost limit safe access issue, maybe we can avoid a LWLock
> acquisition for reading wi_cost_limit by using an atomic similar to what
> you suggested here for "did_rebalance".
>
> > > I've added in a shared lock for reading from wi_cost_limit in this
> > > patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> > > vacuum_delay_point(), which is called quite often (per block-ish), so I
> > > was trying to think if there is a way we could avoid having to check
> > > this shared memory variable on every call to vacuum_delay_point().
> > > Rebalances shouldn't happen very often (done by the launcher when a new
> > > worker is launched and by workers between vacuuming tables). Maybe we
> > > can read from it less frequently?
> >
> > Yeah, acquiring the lwlock for every call to vacuum_delay_point()
> > seems to be harmful. One idea would be to have one sig_atomic_t
> > variable in WorkerInfoData and autovac_balance_cost() set it to true
> > after rebalancing the worker's cost-limit. The worker can check it
> > without locking and update its delay parameters if the flag is true.
>
> Instead of having the atomic indicate whether or not someone (launcher
> or another worker) did a rebalance, it would simply store the current
> cost limit. Then the worker can normally access it with a simple read.
>
> My rationale is that if we used an atomic to indicate whether or not we
> did a rebalance ("did_rebalance"), we would have the same cache
> coherency guarantees as if we just used the atomic for the cost limit.
> If we read from the "did_rebalance" variable and missed someone having
> written to it on another core, we still wouldn't get around to checking
> the wi_cost_limit variable in shared memory, so it doesn't matter that
> we bothered to keep it in shared memory and use a lock to access it.
>
> I noticed we don't allow wi_cost_limit to ever be less than 0, so we
> could store wi_cost_limit in an atomic uint32.
>
> I'm not sure if it is okay to do pg_atomic_read_u32() and
> pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
> most cases.

I agree with using pg_atomic_uint32. Given that the comment of
pg_atomic_unlocked_write_u32() says:

 * pg_atomic_compare_exchange_u32.  This should only be used in cases where
 * minor performance regressions due to atomics emulation are unacceptable.

I think pg_atomic_write_u32() is enough for our use case.

>
> I've implemented the atomic cost limit in the attached patch. Though,
> I'm pretty unsure about how I initialized the atomics in
> AutoVacuumShmemInit()...

+
                 /* initialize the WorkerInfo free list */
                 for (i = 0; i < autovacuum_max_workers; i++)
                         dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
                                                         &worker[i].wi_links);
+
+                dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+                        pg_atomic_init_u32(
+                                &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
+                                0);
+

I think we can do like:

    /* initialize the WorkerInfo free list */
    for (i = 0; i < autovacuum_max_workers; i++)
    {
        dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
                                        &worker[i].wi_links);
        pg_atomic_init_u32(&(worker[i].wi_cost_limit), 0);
    }

>
> If the consensus is that it is simply too confusing to take
> wi_cost_delay out of WorkerInfo, we might be able to afford using a
> shared lock to access it because we won't call AutoVacuumUpdateDelay()
> on every invocation of vacuum_delay_point() -- only when we've reloaded
> the config file.
>
> One potential option to avoid taking a shared lock on every call to
> AutoVacuumUpdateDelay() is to set a global variable to indicate that we
> did update it (since we are the only ones updating it) and then only
> take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.
>

If we remove wi_cost_delay from WorkerInfo, probably we don't need to
acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we
access in that function will be only wi_dobalance, but this field is
updated only by its owner autovacuum worker.


> > ---
> >  void
> >  AutoVacuumUpdateDelay(void)
> >  {
> > -        if (MyWorkerInfo)
> > +        /*
> > +         * We are using autovacuum-related GUCs to update
> > VacuumCostDelay, so we
> > +         * only want autovacuum workers and autovacuum launcher to do this.
> > +         */
> > +        if (!(am_autovacuum_worker || am_autovacuum_launcher))
> > +                return;
> >
> > Is there any case where the autovacuum launcher calls
> > AutoVacuumUpdateDelay() function?
>
> I had meant to add it to HandleAutoVacLauncherInterrupts() after
> reloading the config file (done in attached patch). When using the
> global variables for cost delay (instead of wi_cost_delay in worker
> info), the autovac launcher also has to do the check in the else branch
> of AutoVacuumUpdateDelay()
>
>         VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
>             autovacuum_vac_cost_delay : VacuumCostDelay;
>
> to make sure VacuumCostDelay is correct for when it calls
> autovac_balance_cost().

But doesn't the launcher do a similar thing at the beginning of
autovac_balance_cost()?

    double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
                                  autovacuum_vac_cost_delay : VacuumCostDelay);

Related to this point, I think autovac_balance_cost() should use
globally-set cost_limit and cost_delay values to calculate worker's
vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should
come from the config file setting, not table option etc:

    int         vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
                                  autovacuum_vac_cost_limit : VacuumCostLimit);
    double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
                                  autovacuum_vac_cost_delay : VacuumCostDelay);

If my understanding is right, the following change is not right;
AutoVacuumUpdateLimit() updates the VacuumCostLimit based on the value in
MyWorkerInfo:

                 MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+                AutoVacuumUpdateLimit();

                 /* do a balance */
                 autovac_balance_cost();

-                /* set the active cost parameters from the result of that */
-                AutoVacuumUpdateDelay();

Also, even when using the global variables for cost delay, the
launcher doesn't need to check the global variable. It should always
be able to use either autovacuum_vac_cost_delay/limit or
VacuumCostDelay/Limit.

>
> This also made me think about whether or not we still need cost_limit_base.
> It is used to ensure that autovac_balance_cost() never ends up setting
> workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
> (or VacuumCostLimit). However, the launcher and all the workers should
> know what the value is without cost_limit_base, no?

Yeah, the current balancing algorithm looks to respect the cost_limit
value set when starting to vacuum the table. The proportion of the
amount of I/O that a worker can consume is calculated based on the
base value and the new worker's cost_limit value cannot exceed the
base value. Given that we're trying to dynamically tune worker's cost
parameters (delay and limit), this concept seems to need to be
updated.

>
> > > Also not sure how the patch interacts with failsafe autovac and parallel
> > > vacuum.
> >
> > Good point.
> >
> > When entering the failsafe mode, we disable the vacuum delays (see
> > lazy_check_wraparound_failsafe()). We need to keep disabling the
> > vacuum delays even after reloading the config file. One idea is to
> > have another global variable indicating we're in the failsafe mode.
> > vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> > true.
>
> I think we might not need to do this. Other than in
> lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
> two places:
>
> 1) in vacuum() which autovacuum will call per table. And failsafe is
> reset per table as well.
>
> 2) in vacuum_delay_point(), but, since VacuumCostActive will already be
> false when we enter vacuum_delay_point() the next time after
> lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Wed, Mar 15, 2023 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I've implemented the atomic cost limit in the attached patch. Though,
> > I'm pretty unsure about how I initialized the atomics in
> > AutoVacuumShmemInit()...
>
> +
>                  /* initialize the WorkerInfo free list */
>                  for (i = 0; i < autovacuum_max_workers; i++)
>                          dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
>                                                          &worker[i].wi_links);
> +
> +                dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
> +                        pg_atomic_init_u32(
> +                                &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
> +                                0);
> +
>
> I think we can do like:
>
>     /* initialize the WorkerInfo free list */
>     for (i = 0; i < autovacuum_max_workers; i++)
>     {
>         dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
>                                         &worker[i].wi_links);
>         pg_atomic_init_u32(&(worker[i].wi_cost_limit), 0);
>     }

Ah, yes, I was distracted by the variable name "worker" (as opposed to
"workers").

> > If the consensus is that it is simply too confusing to take
> > wi_cost_delay out of WorkerInfo, we might be able to afford using a
> > shared lock to access it because we won't call AutoVacuumUpdateDelay()
> > on every invocation of vacuum_delay_point() -- only when we've reloaded
> > the config file.
> >
> > One potential option to avoid taking a shared lock on every call to
> > AutoVacuumUpdateDelay() is to set a global variable to indicate that we
> > did update it (since we are the only ones updating it) and then only
> > take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.
> >
>
> If we remove wi_cost_delay from WorkerInfo, probably we don't need to
> acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we
> access in that function will be only wi_dobalance, but this field is
> updated only by its owner autovacuum worker.

I realized that we cannot use dobalance to decide whether or not to
update wi_cost_delay because dobalance could be false because of table
option cost limit being set (with no table option cost delay) and we
would still need to update VacuumCostDelay and wi_cost_delay with the
new value of autovacuum_vacuum_cost_delay.

But v5 skirts around this issue altogether.

> > > ---
> > >  void
> > >  AutoVacuumUpdateDelay(void)
> > >  {
> > > -        if (MyWorkerInfo)
> > > +        /*
> > > +         * We are using autovacuum-related GUCs to update
> > > VacuumCostDelay, so we
> > > +         * only want autovacuum workers and autovacuum launcher to do this.
> > > +         */
> > > +        if (!(am_autovacuum_worker || am_autovacuum_launcher))
> > > +                return;
> > >
> > > Is there any case where the autovacuum launcher calls
> > > AutoVacuumUpdateDelay() function?
> >
> > I had meant to add it to HandleAutoVacLauncherInterrupts() after
> > reloading the config file (done in attached patch). When using the
> > global variables for cost delay (instead of wi_cost_delay in worker
> > info), the autovac launcher also has to do the check in the else branch
> > of AutoVacuumUpdateDelay()
> >
> >         VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
> >             autovacuum_vac_cost_delay : VacuumCostDelay;
> >
> > to make sure VacuumCostDelay is correct for when it calls
> > autovac_balance_cost().
>
> But doesn't the launcher do a similar thing at the beginning of
> autovac_balance_cost()?
>
>     double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
>                                   autovacuum_vac_cost_delay : VacuumCostDelay);

Ah, yes. You are right.

> Related to this point, I think autovac_balance_cost() should use
> globally-set cost_limit and cost_delay values to calculate worker's
> vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should
> come from the config file setting, not table option etc:
>
>     int         vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
>                                   autovacuum_vac_cost_limit : VacuumCostLimit);
>     double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
>                                   autovacuum_vac_cost_delay : VacuumCostDelay);
>
> If my understanding is right, the following change is not right;
> AutoVacUpdateLimit() updates the VacuumCostLimit based on the value in
> MyWorkerInfo:
>
>                  MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
> +                AutoVacuumUpdateLimit();
>
>                  /* do a balance */
>                  autovac_balance_cost();
>
> -                /* set the active cost parameters from the result of that */
> -                AutoVacuumUpdateDelay();
>
> Also, even when using the global variables for cost delay, the
> launcher doesn't need to check the global variable. It should always
> be able to use either autovacuum_vac_cost_delay/limit or
> VacuumCostDelay/Limit.

Yes, that is true. But, I actually think we can do something more
radical, which relates to this point as well as the issue with
cost_limit_base below.

> > This also made me think about whether or not we still need cost_limit_base.
> > It is used to ensure that autovac_balance_cost() never ends up setting
> > workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
> > (or VacuumCostLimit). However, the launcher and all the workers should
> > know what the value is without cost_limit_base, no?
>
> Yeah, the current balancing algorithm looks to respect the cost_limit
> value set when starting to vacuum the table. The proportion of the
> amount of I/O that a worker can consume is calculated based on the
> base value and the new worker's cost_limit value cannot exceed the
> base value. Given that we're trying to dynamically tune worker's cost
> parameters (delay and limit), this concept seems to need to be
> updated.

In master, autovacuum workers reload the config file at most once per
table vacuumed. And that is the same time that they update their
wi_cost_limit_base and wi_cost_delay. Thus, when autovac_balance_cost()
is called, there is a good chance that different workers will have
different values for wi_cost_limit_base and wi_cost_delay (and we are
only talking about workers not vacuuming a table with table option
cost-related gucs). So, it made sense that the balancing algorithm tried
to use a ratio to determine what to set the cost limit of each worker
to. It is clamped to the base value, as you say, but it also gives
workers a proportion of the new limit equal to what proportion their base
cost represents of the total cost.

I think all of this doesn't matter anymore now that everyone can reload
the config file often and dynamically change these values.

Thus, in the attached v5, I have removed both wi_cost_limit and wi_cost_delay
from WorkerInfo. I've added a new variable to AutoVacuumShmem called
nworkers_for_balance. Now, autovac_balance_cost() only recalculates this
number and updates it if it has changed. Then, in
AutoVacuumUpdateLimit() workers read from this atomic value and divide
the value of the cost limit gucs by that number to get their own cost limit.
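
In rough outline, the worker-side computation then becomes something like this
(a simplified sketch, not the exact code in the attached v5;
av_table_option_cost_limit stands for the backend-local copy of the table
option mentioned below):

    void
    AutoVacuumUpdateLimit(void)
    {
        int     base_limit;

        if (!MyWorkerInfo)
            return;

        /* a per-table cost limit reloption takes precedence and is not balanced */
        if (av_table_option_cost_limit > 0)
        {
            VacuumCostLimit = av_table_option_cost_limit;
            return;
        }

        /*
         * Base limit from the GUCs (glossing over keeping the original
         * vacuum_cost_limit value around once VacuumCostLimit is overwritten
         * below).
         */
        base_limit = autovacuum_vac_cost_limit > 0 ?
            autovacuum_vac_cost_limit : VacuumCostLimit;

        /* split it evenly among the workers participating in balancing */
        VacuumCostLimit =
            Max(1, base_limit /
                (int) pg_atomic_read_u32(&AutoVacuumShmem->nworkers_for_balance));
    }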

I keep the table option value of cost limit and cost delay in
backend-local memory to reference when updating the worker cost limit.

One nice thing is autovac_balance_cost() only requires an access shared
lock now (though most callers are updating other members before calling
it and still take an exclusive lock).

What do you think?

> > > > Also not sure how the patch interacts with failsafe autovac and parallel
> > > > vacuum.
> > >
> > > Good point.
> > >
> > > When entering the failsafe mode, we disable the vacuum delays (see
> > > lazy_check_wraparound_failsafe()). We need to keep disabling the
> > > vacuum delays even after reloading the config file. One idea is to
> > > have another global variable indicating we're in the failsafe mode.
> > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> > > true.
> >
> > I think we might not need to do this. Other than in
> > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
> > two places:
> >
> > 1) in vacuum() which autovacuum will call per table. And failsafe is
> > reset per table as well.
> >
> > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be
> > false when we enter vacuum_delay_point() the next time after
> > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
>
> Indeed. But does it mean that there is no code path to turn
> vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
> non-0?

Ah yes! Good point. This is true.
I'm not sure how to cheaply allow for re-enabling delays after disabling
them in the middle of a table vacuum.

I don't see a way around checking if we need to reload the config file
on every call to vacuum_delay_point() (currently, we are only doing this
when we have to wait anyway). It seems expensive to do this check every
time. If we do do this, we would update VacuumCostActive when updating
VacuumCostDelay, and we would need a global variable keeping the
failsafe status, as you mentioned.

It could be okay to say that you can only disable cost-based delays in
the middle of vacuuming a table (i.e. you cannot enable them if they are
already disabled until you start vacuuming the next table). Though maybe
it is weird that you can increase the delay but not re-enable it...

On an unrelated note, I was wondering if there were any docs anywhere
that should be updated to go along with this.

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Sat, Mar 18, 2023 at 6:47 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Mar 15, 2023 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > > > Also not sure how the patch interacts with failsafe autovac and parallel
> > > > > vacuum.
> > > >
> > > > Good point.
> > > >
> > > > When entering the failsafe mode, we disable the vacuum delays (see
> > > > lazy_check_wraparound_failsafe()). We need to keep disabling the
> > > > vacuum delays even after reloading the config file. One idea is to
> > > > have another global variable indicating we're in the failsafe mode.
> > > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> > > > true.
> > >
> > > I think we might not need to do this. Other than in
> > > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
> > > two places:
> > >
> > > 1) in vacuum() which autovacuum will call per table. And failsafe is
> > > reset per table as well.
> > >
> > > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be
> > > false when we enter vacuum_delay_point() the next time after
> > > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
> >
> > Indeed. But does it mean that there is no code path to turn
> > vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
> > non-0?
>
> Ah yes! Good point. This is true.
> I'm not sure how to cheaply allow for re-enabling delays after disabling
> them in the middle of a table vacuum.
>
> I don't see a way around checking if we need to reload the config file
> on every call to vacuum_delay_point() (currently, we are only doing this
> when we have to wait anyway). It seems expensive to do this check every
> time. If we do do this, we would update VacuumCostActive when updating
> VacuumCostDelay, and we would need a global variable keeping the
> failsafe status, as you mentioned.
>
> It could be okay to say that you can only disable cost-based delays in
> the middle of vacuuming a table (i.e. you cannot enable them if they are
> already disabled until you start vacuuming the next table). Though maybe
> it is weird that you can increase the delay but not re-enable it...

So, I thought about it some more, and I think it is a bit odd that you
can increase the delay and limit but not re-enable them if they were
disabled. And, perhaps it would be okay to check ConfigReloadPending at
the top of vacuum_delay_point() instead of only after sleeping. It is
just one more branch. We can check if VacuumCostActive is false after
checking if we should reload and doing so if needed and return early.
I've implemented that in attached v6.
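
In outline, the top of vacuum_delay_point() then looks roughly like this (a
sketch of the shape, not the exact v6 hunk):

    if (InterruptPending || VacuumFailsafeActive ||
        (!VacuumCostActive && !ConfigReloadPending))
        return;

    /* reload so that cost limit/delay changes take effect mid-table */
    if (ConfigReloadPending && !analyze_in_outer_xact)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
        AutoVacuumUpdateDelay();
        AutoVacuumUpdateLimit();
    }

    /* nothing more to do if cost-based delays are (still) inactive */
    if (!VacuumCostActive)
        return;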

I added in the global we discussed for VacuumFailsafeActive. If we keep
it, we can probably remove the one in LVRelState -- as it seems
redundant. Let me know what you think.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Mar 15, 2023 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > I've implemented the atomic cost limit in the attached patch. Though,
> > > I'm pretty unsure about how I initialized the atomics in
> > > AutoVacuumShmemInit()...
> >
> > +
> >                  /* initialize the WorkerInfo free list */
> >                  for (i = 0; i < autovacuum_max_workers; i++)
> >                          dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
> >                                                          &worker[i].wi_links);
> > +
> > +                dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
> > +                        pg_atomic_init_u32(
> > +                                &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
> > +                                0);
> > +
> >
> > I think we can do like:
> >
> >     /* initialize the WorkerInfo free list */
> >     for (i = 0; i < autovacuum_max_workers; i++)
> >     {
> >         dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
> >                                         &worker[i].wi_links);
> >         pg_atomic_init_u32(&(worker[i].wi_cost_limit), 0);
> >     }
>
> Ah, yes, I was distracted by the variable name "worker" (as opposed to
> "workers").
>
> > > If the consensus is that it is simply too confusing to take
> > > wi_cost_delay out of WorkerInfo, we might be able to afford using a
> > > shared lock to access it because we won't call AutoVacuumUpdateDelay()
> > > on every invocation of vacuum_delay_point() -- only when we've reloaded
> > > the config file.
> > >
> > > One potential option to avoid taking a shared lock on every call to
> > > AutoVacuumUpdateDelay() is to set a global variable to indicate that we
> > > did update it (since we are the only ones updating it) and then only
> > > take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.
> > >
> >
> > If we remove wi_cost_delay from WorkerInfo, probably we don't need to
> > acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we
> > access in that function will be only wi_dobalance, but this field is
> > updated only by its owner autovacuum worker.
>
> I realized that we cannot use dobalance to decide whether or not to
> update wi_cost_delay because dobalance could be false because of table
> option cost limit being set (with no table option cost delay) and we
> would still need to update VacuumCostDelay and wi_cost_delay with the
> new value of autovacuum_vacuum_cost_delay.
>
> But v5 skirts around this issue altogether.
>
> > > > ---
> > > >  void
> > > >  AutoVacuumUpdateDelay(void)
> > > >  {
> > > > -        if (MyWorkerInfo)
> > > > +        /*
> > > > +         * We are using autovacuum-related GUCs to update
> > > > VacuumCostDelay, so we
> > > > +         * only want autovacuum workers and autovacuum launcher to do this.
> > > > +         */
> > > > +        if (!(am_autovacuum_worker || am_autovacuum_launcher))
> > > > +                return;
> > > >
> > > > Is there any case where the autovacuum launcher calls
> > > > AutoVacuumUpdateDelay() function?
> > >
> > > I had meant to add it to HandleAutoVacLauncherInterrupts() after
> > > reloading the config file (done in attached patch). When using the
> > > global variables for cost delay (instead of wi_cost_delay in worker
> > > info), the autovac launcher also has to do the check in the else branch
> > > of AutoVacuumUpdateDelay()
> > >
> > >         VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
> > >             autovacuum_vac_cost_delay : VacuumCostDelay;
> > >
> > > to make sure VacuumCostDelay is correct for when it calls
> > > autovac_balance_cost().
> >
> > But doesn't the launcher do a similar thing at the beginning of
> > autovac_balance_cost()?
> >
> >     double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
> >                                   autovacuum_vac_cost_delay : VacuumCostDelay);
>
> Ah, yes. You are right.
>
> > Related to this point, I think autovac_balance_cost() should use
> > globally-set cost_limit and cost_delay values to calculate worker's
> > vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should
> > come from the config file setting, not table option etc:
> >
> >     int         vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
> >                                   autovacuum_vac_cost_limit : VacuumCostLimit);
> >     double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
> >                                   autovacuum_vac_cost_delay : VacuumCostDelay);
> >
> > If my understanding is right, the following change is not right;
> > AutoVacUpdateLimit() updates the VacuumCostLimit based on the value in
> > MyWorkerInfo:
> >
> >                  MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
> > +                AutoVacuumUpdateLimit();
> >
> >                  /* do a balance */
> >                  autovac_balance_cost();
> >
> > -                /* set the active cost parameters from the result of that */
> > -                AutoVacuumUpdateDelay();
> >
> > Also, even when using the global variables for cost delay, the
> > launcher doesn't need to check the global variable. It should always
> > be able to use either autovacuum_vac_cost_delay/limit or
> > VacuumCostDelay/Limit.
>
> Yes, that is true. But, I actually think we can do something more
> radical, which relates to this point as well as the issue with
> cost_limit_base below.
>
> > > This also made me think about whether or not we still need cost_limit_base.
> > > It is used to ensure that autovac_balance_cost() never ends up setting
> > > workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
> > > (or VacuumCostLimit). However, the launcher and all the workers should
> > > know what the value is without cost_limit_base, no?
> >
> > Yeah, the current balancing algorithm looks to respect the cost_limit
> > value set when starting to vacuum the table. The proportion of the
> > amount of I/O that a worker can consume is calculated based on the
> > base value and the new worker's cost_limit value cannot exceed the
> > base value. Given that we're trying to dynamically tune worker's cost
> > parameters (delay and limit), this concept seems to need to be
> > updated.
>
> In master, autovacuum workers reload the config file at most once per
> table vacuumed. And that is the same time that they update their
> wi_cost_limit_base and wi_cost_delay. Thus, when autovac_balance_cost()
> is called, there is a good chance that different workers will have
> different values for wi_cost_limit_base and wi_cost_delay (and we are
> only talking about workers not vacuuming a table with table option
> cost-related gucs). So, it made sense that the balancing algorithm tried
> to use a ratio to determine what to set the cost limit of each worker
> to. It is clamped to the base value, as you say, but it also gives
> workers a proportion of the new limit equal to what proportion their base
> cost represents of the total cost.
>
> I think all of this doesn't matter anymore now that everyone can reload
> the config file often and dynamically change these values.
>
> Thus, in the attached v5, I have removed both wi_cost_limit and wi_cost_delay
> from WorkerInfo. I've added a new variable to AutoVacuumShmem called
> nworkers_for_balance. Now, autovac_balance_cost() only recalculates this
> number and updates it if it has changed. Then, in
> AutoVacuumUpdateLimit() workers read from this atomic value and divide
> the value of the cost limit gucs by that number to get their own cost limit.
>
> I keep the table option value of cost limit and cost delay in
> backend-local memory to reference when updating the worker cost limit.
>
> One nice thing is autovac_balance_cost() only requires an access shared
> lock now (though most callers are updating other members before calling
> it and still take an exclusive lock).
>
> What do you think?

I think this is a good idea.

Do we need to calculate nworkers_for_balance by iterating over the
running worker list? I guess autovacuum workers could instead
increment/decrement it at the beginning and end of vacuuming.

>
> > > > > Also not sure how the patch interacts with failsafe autovac and parallel
> > > > > vacuum.
> > > >
> > > > Good point.
> > > >
> > > > When entering the failsafe mode, we disable the vacuum delays (see
> > > > lazy_check_wraparound_failsafe()). We need to keep disabling the
> > > > vacuum delays even after reloading the config file. One idea is to
> > > > have another global variable indicating we're in the failsafe mode.
> > > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> > > > true.
> > >
> > > I think we might not need to do this. Other than in
> > > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
> > > two places:
> > >
> > > 1) in vacuum() which autovacuum will call per table. And failsafe is
> > > reset per table as well.
> > >
> > > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be
> > > false when we enter vacuum_delay_point() the next time after
> > > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
> >
> > Indeed. But does it mean that there is no code path to turn
> > vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
> > non-0?
>
> Ah yes! Good point. This is true.
> I'm not sure how to cheaply allow for re-enabling delays after disabling
> them in the middle of a table vacuum.
>
> I don't see a way around checking if we need to reload the config file
> on every call to vacuum_delay_point() (currently, we are only doing this
> when we have to wait anyway). It seems expensive to do this check every
> time. If we do do this, we would update VacuumCostActive when updating
> VacuumCostDelay, and we would need a global variable keeping the
> failsafe status, as you mentioned.
>
> It could be okay to say that you can only disable cost-based delays in
> the middle of vacuuming a table (i.e. you cannot enable them if they are
> already disabled until you start vacuuming the next table). Though maybe
> it is weird that you can increase the delay but not re-enable it...

On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> So, I thought about it some more, and I think it is a bit odd that you
> can increase the delay and limit but not re-enable them if they were
> disabled. And, perhaps it would be okay to check ConfigReloadPending at
> the top of vacuum_delay_point() instead of only after sleeping. It is
> just one more branch. We can check if VacuumCostActive is false after
> checking if we should reload and doing so if needed and return early.
> I've implemented that in attached v6.
>
> I added in the global we discussed for VacuumFailsafeActive. If we keep
> it, we can probably remove the one in LVRelState -- as it seems
> redundant. Let me know what you think.

I think the following change is related:

-        if (!VacuumCostActive || InterruptPending)
+        if (InterruptPending || VacuumFailsafeActive ||
+                (!VacuumCostActive && !ConfigReloadPending))
                 return;

+        /*
+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.

>
> On an unrelated note, I was wondering if there were any docs anywhere
> that should be updated to go along with this.

The current patch improves the internal mechanism of (re)balancing
vacuum cost but doesn't change user-visible behavior. So far I don't
see anything in the docs that would need to be updated.

>
> And, I was wondering if it was worth trying to split up the part that
> reloads the config file and all of the autovacuum stuff. The reloading
> of the config file by itself won't actually result in autovacuum workers
> having updated cost delays because of them overwriting it with
> wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

Looking back at the original concern you mentioned[1]:

speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum).

does it make sense to have autovac_balance_cost() update workers'
wi_cost_delay too? The autovacuum launcher already reloads the config
file and does the rebalance, so autovac_balance_cost() could update the
cost_delay as well, which might be a minimal change to address your
concern. This wouldn't help manual VACUUM, but since vacuum delay is
disabled by default there it won't be a big problem. For manual VACUUM
we would still need to reload the config file in vacuum_delay_point(),
as that part of your patch already does. Overhauling the rebalance
mechanism could then be a separate patch to improve it further.

Regards,

[1] https://www.postgresql.org/message-id/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 23 Mar 2023, at 07:08, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:

> It makes sense to me that we need to reload the config file even when
> vacuum-delay is disabled. But I think it's not convenient for users
> that we don't reload the configuration file once the failsafe is
> triggered. I think users might want to change some GUCs such as
> log_autovacuum_min_duration.

I agree with this.

>> On an unrelated note, I was wondering if there were any docs anywhere
>> that should be updated to go along with this.
>
> The current patch improves the internal mechanism of (re)balancing
> vacuum-cost but doesn't change user-visible behavior. I don't have any
> idea so far that we should update somewhere in the doc.

I had a look as well and can't really spot anywhere where the current behavior
is detailed, so there is little to update.  On top of that, I also don't think
it's worth adding this to the docs.

>> And, I was wondering if it was worth trying to split up the part that
>> reloads the config file and all of the autovacuum stuff. The reloading
>> of the config file by itself won't actually result in autovacuum workers
>> having updated cost delays because of them overwriting it with
>> wi_cost_delay, but it will allow VACUUM to have those updated values.
>
> It makes sense to me to have changes for overhauling the rebalance
> mechanism in a separate patch.

It would for sure be worth considering,

+bool       VacuumFailsafeActive = false;
This needs documentation, how it's used and how it relates to failsafe_active
in LVRelState (which it might replace(?), but until then).

+   pg_atomic_uint32 nworkers_for_balance;
This needs a short oneline documentation update to the struct comment.


-   double      wi_cost_delay;
-   int         wi_cost_limit;
-   int         wi_cost_limit_base;
This change makes the below comment in do_autovacuum in need of an update:
  /*
   * Remove my info from shared memory.  We could, but intentionally
   * don't, clear wi_cost_limit and friends --- this is on the
   * assumption that we probably have more to do with similar cost
   * settings, so we don't want to give up our share of I/O for a very
   * short interval and thereby thrash the global balance.
   */


+   if (av_table_option_cost_delay >= 0)
+       VacuumCostDelay = av_table_option_cost_delay;
+   else
+       VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+           autovacuum_vac_cost_delay : VacuumCostDelay;
While it's a matter of personal preference, I for one would like it if we
reduced the number of ternary operators in the vacuum code, especially those
mixed into if statements.  The vacuum code is full of this already though, so
this is less of an objection (as it follows existing style) than an
observation.
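
For illustration, the same assignment written without the nested ternary could
read:

    if (av_table_option_cost_delay >= 0)
        VacuumCostDelay = av_table_option_cost_delay;
    else if (autovacuum_vac_cost_delay >= 0)
        VacuumCostDelay = autovacuum_vac_cost_delay;
    /* else: keep the current VacuumCostDelay */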


+    * note: in cost_limit, zero also means use value from elsewhere, because
+    * zero is not a valid value.
...
+       int         vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+       autovacuum_vac_cost_limit : VacuumCostLimit;
Not mentioning the fact that a magic value in a GUC means it's using the value
from another GUC (which is not great IMHO), it seems we are using zero as well
as -1 as that magic value here?  (not introduced in this patch.)  The docs
AFAICT only specify -1 as that value though.  Am I missing something or is the
code and documentation slightly out of sync?

I need another few readthroughs to figure out if VacuumFailsafeActive does what
I think it does, and should be doing, but in general I think this is a good
idea and a patch in good condition close to being committable.

--
Daniel Gustafsson


Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> Do we need to calculate the number of workers running with
> nworkers_for_balance by iterating over the running worker list? I
> guess autovacuum workers can increment/decrement it at the beginning
> and end of vacuum.

I don't think we can do that because if a worker crashes, we have no way
of knowing if it had incremented or decremented the number, so we can't
adjust for it.

> > > > > > Also not sure how the patch interacts with failsafe autovac and parallel
> > > > > > vacuum.
> > > > >
> > > > > Good point.
> > > > >
> > > > > When entering the failsafe mode, we disable the vacuum delays (see
> > > > > lazy_check_wraparound_failsafe()). We need to keep disabling the
> > > > > vacuum delays even after reloading the config file. One idea is to
> > > > > have another global variable indicating we're in the failsafe mode.
> > > > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> > > > > true.
> > > >
> > > > I think we might not need to do this. Other than in
> > > > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
> > > > two places:
> > > >
> > > > 1) in vacuum() which autovacuum will call per table. And failsafe is
> > > > reset per table as well.
> > > >
> > > > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be
> > > > false when we enter vacuum_delay_point() the next time after
> > > > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
> > >
> > > Indeed. But does it mean that there is no code path to turn
> > > vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
> > > non-0?
> >
> > Ah yes! Good point. This is true.
> > I'm not sure how to cheaply allow for re-enabling delays after disabling
> > them in the middle of a table vacuum.
> >
> > I don't see a way around checking if we need to reload the config file
> > on every call to vacuum_delay_point() (currently, we are only doing this
> > when we have to wait anyway). It seems expensive to do this check every
> > time. If we do do this, we would update VacuumCostActive when updating
> > VacuumCostDelay, and we would need a global variable keeping the
> > failsafe status, as you mentioned.
> >
> > It could be okay to say that you can only disable cost-based delays in
> > the middle of vacuuming a table (i.e. you cannot enable them if they are
> > already disabled until you start vacuuming the next table). Though maybe
> > it is weird that you can increase the delay but not re-enable it...
>
> On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > So, I thought about it some more, and I think it is a bit odd that you
> > can increase the delay and limit but not re-enable them if they were
> > disabled. And, perhaps it would be okay to check ConfigReloadPending at
> > the top of vacuum_delay_point() instead of only after sleeping. It is
> > just one more branch. We can check if VacuumCostActive is false after
> > checking if we should reload and doing so if needed and return early.
> > I've implemented that in attached v6.
> >
> > I added in the global we discussed for VacuumFailsafeActive. If we keep
> > it, we can probably remove the one in LVRelState -- as it seems
> > redundant. Let me know what you think.
>
> I think the following change is related:
>
> -        if (!VacuumCostActive || InterruptPending)
> +        if (InterruptPending || VacuumFailsafeActive ||
> +                (!VacuumCostActive && !ConfigReloadPending))
>                  return;
>
> +        /*
> +         * Reload the configuration file if requested. This allows changes to
> +         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
> +         * take effect while a table is being vacuumed or analyzed.
> +         */
> +        if (ConfigReloadPending && !analyze_in_outer_xact)
> +        {
> +                ConfigReloadPending = false;
> +                ProcessConfigFile(PGC_SIGHUP);
> +                AutoVacuumUpdateDelay();
> +                AutoVacuumUpdateLimit();
> +        }
>
> It makes sense to me that we need to reload the config file even when
> vacuum-delay is disabled. But I think it's not convenient for users
> that we don't reload the configuration file once the failsafe is
> triggered. I think users might want to change some GUCs such as
> log_autovacuum_min_duration.

Ah, okay. Attached v7 has this change (it reloads even if failsafe is
active).

> > And, I was wondering if it was worth trying to split up the part that
> > reloads the config file and all of the autovacuum stuff. The reloading
> > of the config file by itself won't actually result in autovacuum workers
> > having updated cost delays because of them overwriting it with
> > wi_cost_delay, but it will allow VACUUM to have those updated values.
>
> It makes sense to me to have changes for overhauling the rebalance
> mechanism in a separate patch.
>
> Looking back at the original concern you mentioned[1]:
>
> speed up long-running vacuum of a large table by
> decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> config file is only reloaded between tables (for autovacuum) or after
> the statement (for explicit vacuum).
>
> does it make sense to have autovac_balance_cost() update workers'
> wi_cost_delay too? Autovacuum launcher already reloads the config file
> and does the rebalance. So I thought autovac_balance_cost() can update
> the cost_delay as well, and this might be a minimal change to deal
> with your concern. This doesn't have the effect for manual VACUUM but
> since vacuum delay is disabled by default it won't be a big problem.
> As for manual VACUUMs, we would need to reload the config file in
> vacuum_delay_point() as the part of your patch does. Overhauling the
> rebalance mechanism would be another patch to improve it further.

So, we can't do this without acquiring an access shared lock on every
call to vacuum_delay_point() because cost delay is a double.
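
For concreteness, a rough sketch of what that per-call refresh would have
to look like (using the existing MyWorkerInfo and AutovacuumLock names;
the point is that this lock/unlock would land on a very hot path):

    /* hypothetical per-call refresh in vacuum_delay_point() */
    if (MyWorkerInfo != NULL)
    {
        /* wi_cost_delay is a double, so it cannot be read atomically */
        LWLockAcquire(AutovacuumLock, LW_SHARED);
        VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
        LWLockRelease(AutovacuumLock);
    }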

I will work on a patchset with separate commits for reloading the config
file, though (with autovac not benefitting in the first commit).

On Thu, Mar 23, 2023 at 12:24 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> +bool       VacuumFailsafeActive = false;
> This needs documentation, how it's used and how it relates to failsafe_active
> in LVRelState (which it might replace(?), but until then).

Thanks! I've removed LVRelState->failsafe_active.

I've also separated the VacuumFailsafeActive change into its own commit.
I will say that that commit message needs some work.

> +   pg_atomic_uint32 nworkers_for_balance;
> This needs a short oneline documentation update to the struct comment.

Done. I also prefixed it with av to match the other members. I am thinking
that this variable name could be better. I want to convey that it is the
number of workers amongst whom we must split the cost limit, so I
considered av_limit_sharers or something like that.

>
> -   double      wi_cost_delay;
> -   int         wi_cost_limit;
> -   int         wi_cost_limit_base;
> This change makes the below comment in do_autovacuum in need of an update:
>   /*
>    * Remove my info from shared memory.  We could, but intentionally
>    * don't, clear wi_cost_limit and friends --- this is on the
>    * assumption that we probably have more to do with similar cost
>    * settings, so we don't want to give up our share of I/O for a very
>    * short interval and thereby thrash the global balance.
>    */

Updated to mention wi_dobalance instead.
On the topic of wi_dobalance, should we bother making it an atomic flag
instead? We would avoid taking a lock a few times, though probably not
frequently enough to matter. I was wondering if making it atomically
accessible would be less confusing than acquiring a lock only to set
one member in do_autovacuum() (and otherwise it is only read). I think
if I had to make it an atomic flag, I would reverse the logic and make
it wi_skip_balance or something like that.
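
If we did go the atomic route, I imagine it would look roughly like this
(just a sketch with the reversed name; note that
pg_atomic_unlocked_test_flag() returns true when the flag is clear, and the
flag would need pg_atomic_init_flag() at shmem setup):

    /* in WorkerInfoData, replacing the bool */
    pg_atomic_flag  wi_skip_balance;

    /* writer (only this worker, in do_autovacuum) */
    if (skip)
        pg_atomic_test_set_flag(&MyWorkerInfo->wi_skip_balance);
    else
        pg_atomic_clear_flag(&MyWorkerInfo->wi_skip_balance);

    /* readers (launcher and other workers), no AutovacuumLock needed */
    if (pg_atomic_unlocked_test_flag(&worker->wi_skip_balance))
        nworkers_for_balance++;     /* flag clear: include this worker */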

> +   if (av_table_option_cost_delay >= 0)
> +       VacuumCostDelay = av_table_option_cost_delay;
> +   else
> +       VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
> +           autovacuum_vac_cost_delay : VacuumCostDelay;
> While it's a matter of personal preference, I for one would like it if we reduced
> the number of ternary operators in the vacuum code, especially those mixed into
> if statements.  The vacuum code is full of this already though so this is
> less of an objection (as it follows style) than an observation.

I agree. This one was better served as an "else if" anyway -- updated!

>
> +    * note: in cost_limit, zero also means use value from elsewhere, because
> +    * zero is not a valid value.
> ...
> +       int         vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
> +       autovacuum_vac_cost_limit : VacuumCostLimit;
> Not mentioning the fact that a magic value in a GUC means it's using the value
> from another GUC (which is not great IMHO), it seems we are using zero as well
> as -1 as that magic value here?  (not introduced in this patch.) The docs does
> AFAICT only specify -1 as that value though.  Am I missing something or is the
> code and documentation slightly out of sync?

I copied that comment from elsewhere, but, yes it is a weird situation.
So, you can set autovacuum_vacuum_cost_limit to 0, -1 or a
positive number. You can only set vacuum_cost_limit to a positive value.
The documentation mentions that setting autovacuum_vacuum_cost_limit to
-1, the default, will have it use vacuum_cost_limit. However, it says
nothing about what setting it to 0 does. In the code, everything assumes
that if autovacuum_vacuum_cost_limit is 0 or -1, vacuum_cost_limit is used.

This is in contrast to autovacuum_vacuum_cost_delay, for which 0 means
to disable it -- so setting autovacuum_vacuum_cost_delay to 0 will
specifically not fall back to vacuum_cost_delay.

I think the problem is that 0 is not a valid cost limit (i.e. it has no
meaning like infinity/no limit), so we basically don't want to allow the
cost limit to be set to 0, but GUC values have to be a range with a max
and a min, so we can't just exclude 0 if we want to allow -1 (as far as
I know). I think it would be nice to be able to specify multiple valid
ranges for GUCs to the GUC machinery.
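
For reference, the guc_tables.c entry looks roughly like this (quoting from
memory), which is why 0 inevitably lands inside the allowed range:

    {
        {"autovacuum_vacuum_cost_limit", PGC_SIGHUP, AUTOVACUUM,
            gettext_noop("Vacuum cost amount available before napping, for autovacuum."),
            NULL
        },
        &autovacuum_vac_cost_limit,
        -1, -1, 10000,          /* boot value, min, max: one contiguous range */
        NULL, NULL, NULL
    },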

So, to answer your question, yes, the code and docs are a bit
out-of-sync.

> I need another few readthroughs to figure out of VacuumFailsafeActive does what
> I think it does, and should be doing, but in general I think this is a good
> idea and a patch in good condition close to being committable.

I will take a pass at splitting up the main commit into two. However, I
have attached a new version with the other specific updates discussed in
this thread. Feel free to provide review on this version in the meantime.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Fri, Mar 24, 2023 at 9:27 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> > Do we need to calculate the number of workers running with
> > nworkers_for_balance by iterating over the running worker list? I
> > guess autovacuum workers can increment/decrement it at the beginning
> > and end of vacuum.
>
> I don't think we can do that because if a worker crashes, we have no way
> of knowing if it had incremented or decremented the number, so we can't
> adjust for it.

What kind of crash are you concerned about? If a worker raises an
ERROR, we can catch it in PG_CATCH() block. If it's a FATAL, we can do
that in FreeWorkerInfo(). A PANIC error ends up crashing the entire
server.

>
> > > > > > > Also not sure how the patch interacts with failsafe autovac and parallel
> > > > > > > vacuum.
> > > > > >
> > > > > > Good point.
> > > > > >
> > > > > > When entering the failsafe mode, we disable the vacuum delays (see
> > > > > > lazy_check_wraparound_failsafe()). We need to keep disabling the
> > > > > > vacuum delays even after reloading the config file. One idea is to
> > > > > > have another global variable indicating we're in the failsafe mode.
> > > > > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is
> > > > > > true.
> > > > >
> > > > > I think we might not need to do this. Other than in
> > > > > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
> > > > > two places:
> > > > >
> > > > > 1) in vacuum() which autovacuum will call per table. And failsafe is
> > > > > reset per table as well.
> > > > >
> > > > > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be
> > > > > false when we enter vacuum_delay_point() the next time after
> > > > > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
> > > >
> > > > Indeed. But does it mean that there is no code path to turn
> > > > vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
> > > > non-0?
> > >
> > > Ah yes! Good point. This is true.
> > > I'm not sure how to cheaply allow for re-enabling delays after disabling
> > > them in the middle of a table vacuum.
> > >
> > > I don't see a way around checking if we need to reload the config file
> > > on every call to vacuum_delay_point() (currently, we are only doing this
> > > when we have to wait anyway). It seems expensive to do this check every
> > > time. If we do do this, we would update VacuumCostActive when updating
> > > VacuumCostDelay, and we would need a global variable keeping the
> > > failsafe status, as you mentioned.
> > >
> > > It could be okay to say that you can only disable cost-based delays in
> > > the middle of vacuuming a table (i.e. you cannot enable them if they are
> > > already disabled until you start vacuuming the next table). Though maybe
> > > it is weird that you can increase the delay but not re-enable it...
> >
> > On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > So, I thought about it some more, and I think it is a bit odd that you
> > > can increase the delay and limit but not re-enable them if they were
> > > disabled. And, perhaps it would be okay to check ConfigReloadPending at
> > > the top of vacuum_delay_point() instead of only after sleeping. It is
> > > just one more branch. We can check if VacuumCostActive is false after
> > > checking if we should reload and doing so if needed and return early.
> > > I've implemented that in attached v6.
> > >
> > > I added in the global we discussed for VacuumFailsafeActive. If we keep
> > > it, we can probably remove the one in LVRelState -- as it seems
> > > redundant. Let me know what you think.
> >
> > I think the following change is related:
> >
> > -        if (!VacuumCostActive || InterruptPending)
> > +        if (InterruptPending || VacuumFailsafeActive ||
> > +                (!VacuumCostActive && !ConfigReloadPending))
> >                  return;
> >
> > +        /*
> > +         * Reload the configuration file if requested. This allows changes to
> > +         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
> > +         * take effect while a table is being vacuumed or analyzed.
> > +         */
> > +        if (ConfigReloadPending && !analyze_in_outer_xact)
> > +        {
> > +                ConfigReloadPending = false;
> > +                ProcessConfigFile(PGC_SIGHUP);
> > +                AutoVacuumUpdateDelay();
> > +                AutoVacuumUpdateLimit();
> > +        }
> >
> > It makes sense to me that we need to reload the config file even when
> > vacuum-delay is disabled. But I think it's not convenient for users
> > that we don't reload the configuration file once the failsafe is
> > triggered. I think users might want to change some GUCs such as
> > log_autovacuum_min_duration.
>
> Ah, okay. Attached v7 has this change (it reloads even if failsafe is
> active).
>
> > > And, I was wondering if it was worth trying to split up the part that
> > > reloads the config file and all of the autovacuum stuff. The reloading
> > > of the config file by itself won't actually result in autovacuum workers
> > > having updated cost delays because of them overwriting it with
> > > wi_cost_delay, but it will allow VACUUM to have those updated values.
> >
> > It makes sense to me to have changes for overhauling the rebalance
> > mechanism in a separate patch.
> >
> > Looking back at the original concern you mentioned[1]:
> >
> > speed up long-running vacuum of a large table by
> > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > config file is only reloaded between tables (for autovacuum) or after
> > the statement (for explicit vacuum).
> >
> > does it make sense to have autovac_balance_cost() update workers'
> > wi_cost_delay too? Autovacuum launcher already reloads the config file
> > and does the rebalance. So I thought autovac_balance_cost() can update
> > the cost_delay as well, and this might be a minimal change to deal
> > with your concern. This doesn't have the effect for manual VACUUM but
> > since vacuum delay is disabled by default it won't be a big problem.
> > As for manual VACUUMs, we would need to reload the config file in
> > vacuum_delay_point() as the part of your patch does. Overhauling the
> > rebalance mechanism would be another patch to improve it further.
>
> So, we can't do this without acquiring an access shared lock on every
> call to vacuum_delay_point() because cost delay is a double.
>
> I will work on a patchset with separate commits for reloading the config
> file, though (with autovac not benefitting in the first commit).
>
> On Thu, Mar 23, 2023 at 12:24 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > +bool       VacuumFailsafeActive = false;
> > This needs documentation, how it's used and how it relates to failsafe_active
> > in LVRelState (which it might replace(?), but until then).
>
> Thanks! I've removed LVRelState->failsafe_active.
>
> I've also separated the VacuumFailsafeActive change into its own commit.

@@ -492,6 +493,7 @@ vacuum(List *relations, VacuumParams *params,

                 in_vacuum = true;
                 VacuumCostActive = (VacuumCostDelay > 0);
+                VacuumFailsafeActive = false;
                 VacuumCostBalance = 0;
                 VacuumPageHit = 0;
                 VacuumPageMiss = 0;

I think we need to reset VacuumFailsafeActive also in PG_FINALLY()
block in vacuum().
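
That is, roughly (sketching against the existing PG_FINALLY() block there):

    PG_FINALLY();
    {
        in_vacuum = false;
        VacuumCostActive = false;
    +   VacuumFailsafeActive = false;
    }
    PG_END_TRY();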

One comment on 0002 patch:

+        /*
+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

I think we need comments on why we don't reload the config file if
we're analyzing a table in a user transaction.

>
> > I need another few readthroughs to figure out of VacuumFailsafeActive does what
> > I think it does, and should be doing, but in general I think this is a good
> > idea and a patch in good condition close to being committable.

Another approach would be to make VacuumCostActive a ternary value:
on, off, and never. When we trigger the failsafe mode we switch it to
never, meaning that it never becomes active even after reloading the
config file. A good point is that we don't need to add a new global
variable, but I'm not sure it's better than the current approach.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Fri, Mar 24, 2023 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Mar 24, 2023 at 9:27 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> > > Do we need to calculate the number of workers running with
> > > nworkers_for_balance by iterating over the running worker list? I
> > > guess autovacuum workers can increment/decrement it at the beginning
> > > and end of vacuum.
> >
> > I don't think we can do that because if a worker crashes, we have no way
> > of knowing if it had incremented or decremented the number, so we can't
> > adjust for it.
>
> What kind of crash are you concerned about? If a worker raises an
> ERROR, we can catch it in PG_CATCH() block. If it's a FATAL, we can do
> that in FreeWorkerInfo(). A PANIC error ends up crashing the entire
> server.

Yes, but what about a worker that segfaults? Since table AMs can define
relation_vacuum(), this seems like a real possibility.

I'll address your other code feedback in the next version.

I realized nworkers_for_balance should be initialized to 0 and not 1 --
1 is misleading since there are often 0 autovac workers. We just never
want to use nworkers_for_balance when it is 0. But, workers put a floor
of 1 on the number when they divide limit/nworkers_for_balance (since
they know there must be at least one worker right now since they are a
worker). I thought about whether or not they should call
autovac_balance_cost() if they find that nworkers_for_balance is 0 when
updating their own limit, but I'm not sure.

> > > I need another few readthroughs to figure out of VacuumFailsafeActive does what
> > > I think it does, and should be doing, but in general I think this is a good
> > > idea and a patch in good condition close to being committable.
>
> Another approach would be to make VacuumCostActive a ternary value:
> on, off, and never. When we trigger the failsafe mode we switch it to
> never, meaning that it never becomes active even after reloading the
> config file. A good point is that we don't need to add a new global
> variable, but I'm not sure it's better than the current approach.

Hmm, this is interesting. I don't love the word "never" since it kind of
implies a duration longer than the current table being vacuumed. But we
could find a different word or just document it well. For clarity, we
might want to call it failsafe_mode or something.

I wonder if the primary drawback to converting
LVRelState->failsafe_active to a global VacuumFailsafeActive is just the
general rule of limiting scope to the minimum needed.

- Melanie



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Mar 23, 2023 at 8:27 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > And, I was wondering if it was worth trying to split up the part that
> > > reloads the config file and all of the autovacuum stuff. The reloading
> > > of the config file by itself won't actually result in autovacuum workers
> > > having updated cost delays because of them overwriting it with
> > > wi_cost_delay, but it will allow VACUUM to have those updated values.
> >
> > It makes sense to me to have changes for overhauling the rebalance
> > mechanism in a separate patch.
> >
> > Looking back at the original concern you mentioned[1]:
> >
> > speed up long-running vacuum of a large table by
> > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > config file is only reloaded between tables (for autovacuum) or after
> > the statement (for explicit vacuum).
> >
> > does it make sense to have autovac_balance_cost() update workers'
> > wi_cost_delay too? Autovacuum launcher already reloads the config file
> > and does the rebalance. So I thought autovac_balance_cost() can update
> > the cost_delay as well, and this might be a minimal change to deal
> > with your concern. This doesn't have the effect for manual VACUUM but
> > since vacuum delay is disabled by default it won't be a big problem.
> > As for manual VACUUMs, we would need to reload the config file in
> > vacuum_delay_point() as the part of your patch does. Overhauling the
> > rebalance mechanism would be another patch to improve it further.
>
> So, we can't do this without acquiring an access shared lock on every
> call to vacuum_delay_point() because cost delay is a double.
>
> I will work on a patchset with separate commits for reloading the config
> file, though (with autovac not benefitting in the first commit).

So, I realized we could actually do as you say and have autovac workers
update their wi_cost_delay and keep the balance changes in a separate
commit. I've done this in attached v8.

Workers take the exclusive lock to update their wi_cost_delay and
wi_cost_limit only when there is a config reload. So, there is one
commit that implements this behavior and a separate commit to revise the
worker rebalancing.

Note that we must have the workers also update wi_cost_limit_base and
then call autovac_balance_cost() when they reload the config file
(instead of waiting for launcher to call autovac_balance_cost()) to
avoid potentially calculating the sleep with a new value of cost delay
and an old value of cost limit.

In the commit which revises the worker rebalancing, I'm still wondering
if wi_dobalance should be an atomic flag -- probably not worth it,
right?

On Fri, Mar 24, 2023 at 1:27 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I realized nworkers_for_balance should be initialized to 0 and not 1 --
> 1 is misleading since there are often 0 autovac workers. We just never
> want to use nworkers_for_balance when it is 0. But, workers put a floor
> of 1 on the number when they divide limit/nworkers_for_balance (since
> they know there must be at least one worker right now since they are a
> worker). I thought about whether or not they should call
> autovac_balance_cost() if they find that nworkers_for_balance is 0 when
> updating their own limit, but I'm not sure.

I've gone ahead and updated this. I haven't made the workers call
autovac_balance_cost() if they find that nworkers_for_balance is 0 when
they try and use it when updating their limit because I'm not sure if
this can happen. I would be interested in input here.

I'm also still interested in feedback on the variable name
av_nworkers_for_balance.

> On Fri, Mar 24, 2023 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > I need another few readthroughs to figure out of VacuumFailsafeActive does what
> > > > I think it does, and should be doing, but in general I think this is a good
> > > > idea and a patch in good condition close to being committable.
> >
> > Another approach would be to make VacuumCostActive a ternary value:
> > on, off, and never. When we trigger the failsafe mode we switch it to
> > never, meaning that it never becomes active even after reloading the
> > config file. A good point is that we don't need to add a new global
> > variable, but I'm not sure it's better than the current approach.
>
> Hmm, this is interesting. I don't love the word "never" since it kind of
> implies a duration longer than the current table being vacuumed. But we
> could find a different word or just document it well. For clarity, we
> might want to call it failsafe_mode or something.
>
> I wonder if the primary drawback to converting
> LVRelState->failsafe_active to a global VacuumFailsafeActive is just the
> general rule of limiting scope to the minimum needed.

Okay, so I've changed my mind about this. I like having a ternary for
VacuumCostActive and keeping failsafe_active in LVRelState. What I
didn't like was having non-vacuum code have to care about the
distinction between failsafe + inactive and just inactive. To handle
this, I converted VacuumCostActive to VacuumCostInactive since there are
two inactive cases (inactive and failsafe and plain inactive) and only
one active case. Then, I defined VacuumCostInactive as an int but use
enum values for it in vacuum code to distinguish between failsafe +
inactive and just inactive (I call it VACUUM_COST_INACTIVE_AND_LOCKED
and VACUUM_COST_INACTIVE_AND_UNLOCKED). Non-vacuum code only needs to
check if VacuumCostInactive is 0 like if (!VacuumCostInactive). I'm
happy with the result, and I think it employs only well-defined C
behavior.
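
Spelled out, the shape is roughly this (simplified; the enum type name and
the explicit active value are just for illustration):

    typedef enum
    {
        VACUUM_COST_ACTIVE = 0,             /* zero, so !VacuumCostInactive tests "active" */
        VACUUM_COST_INACTIVE_AND_UNLOCKED,  /* off, but may be re-enabled on reload */
        VACUUM_COST_INACTIVE_AND_LOCKED     /* off because of failsafe; stays off */
    } VacuumCostStatus;

    extern int  VacuumCostInactive;

    /* non-vacuum code stays simple: */
    if (!VacuumCostInactive)
        VacuumCostBalance += VacuumCostPageHit;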

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Sat, Mar 25, 2023 at 3:03 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Mar 23, 2023 at 8:27 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > And, I was wondering if it was worth trying to split up the part that
> > > > reloads the config file and all of the autovacuum stuff. The reloading
> > > > of the config file by itself won't actually result in autovacuum workers
> > > > having updated cost delays because of them overwriting it with
> > > > wi_cost_delay, but it will allow VACUUM to have those updated values.
> > >
> > > It makes sense to me to have changes for overhauling the rebalance
> > > mechanism in a separate patch.
> > >
> > > Looking back at the original concern you mentioned[1]:
> > >
> > > speed up long-running vacuum of a large table by
> > > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > > config file is only reloaded between tables (for autovacuum) or after
> > > the statement (for explicit vacuum).
> > >
> > > does it make sense to have autovac_balance_cost() update workers'
> > > wi_cost_delay too? Autovacuum launcher already reloads the config file
> > > and does the rebalance. So I thought autovac_balance_cost() can update
> > > the cost_delay as well, and this might be a minimal change to deal
> > > with your concern. This doesn't have the effect for manual VACUUM but
> > > since vacuum delay is disabled by default it won't be a big problem.
> > > As for manual VACUUMs, we would need to reload the config file in
> > > vacuum_delay_point() as the part of your patch does. Overhauling the
> > > rebalance mechanism would be another patch to improve it further.
> >
> > So, we can't do this without acquiring an access shared lock on every
> > call to vacuum_delay_point() because cost delay is a double.
> >
> > I will work on a patchset with separate commits for reloading the config
> > file, though (with autovac not benefitting in the first commit).
>
> So, I realized we could actually do as you say and have autovac workers
> update their wi_cost_delay and keep the balance changes in a separate
> commit. I've done this in attached v8.
>
> Workers take the exclusive lock to update their wi_cost_delay and
> wi_cost_limit only when there is a config reload. So, there is one
> commit that implements this behavior and a separate commit to revise the
> worker rebalancing.

So, I've attached an alternate version of the patchset which takes the
approach of having one commit which only enables cost-based delay GUC
refresh for VACUUM and another commit which enables it for autovacuum
and makes the changes to balancing variables.

I still think the commit which has workers updating their own
wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
else emulating our bad behavior and reading from wi_cost_delay without a
lock and on no one else deciding to ever write to wi_cost_delay (even
though it is in shared memory [this is the same as master]). It is only
safe because our process is the only one (right now) writing to
wi_cost_delay, so when we read from it without a lock, we know it isn't
being written to. And everyone else takes a lock when reading from
wi_cost_delay right now. So, it seems...not great.

This approach also introduces a function that is only around for
one commit until the next commit obsoletes it, which seems a bit silly.

Basically, I think it is probably better to just have one commit
enabling guc refresh for VACUUM and then another which correctly
implements what is needed for autovacuum to do the same.
Attached v9 does this.

I've provided both complete versions of both approaches (v9 and v8).

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Kyotaro Horiguchi
Date:
At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in 
> So, I've attached an alternate version of the patchset which takes the
> approach of having one commit which only enables cost-based delay GUC
> refresh for VACUUM and another commit which enables it for autovacuum
> and makes the changes to balancing variables.
> 
> I still think the commit which has workers updating their own
> wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
> else emulating our bad behavior and reading from wi_cost_delay without a
> lock and on no one else deciding to ever write to wi_cost_delay (even
> though it is in shared memory [this is the same as master]). It is only
> safe because our process is the only one (right now) writing to
> wi_cost_delay, so when we read from it without a lock, we know it isn't
> being written to. And everyone else takes a lock when reading from
> wi_cost_delay right now. So, it seems...not great.
> 
> This approach also introduces a function that is only around for
> one commit until the next commit obsoletes it, which seems a bit silly.

(I'm not sure what this refers to, though..) I don't think it's silly
if a later patch removes something that the preceding patches
introduced, as long as that contributes to readability.  Ultimately,
they will be merged together on committing.

> Basically, I think it is probably better to just have one commit
> enabling guc refresh for VACUUM and then another which correctly
> implements what is needed for autovacuum to do the same.
> Attached v9 does this.
> 
> I've provided both complete versions of both approaches (v9 and v8).

I took a look at v9 and have a few comments.

0001:

I don't believe it is necessary, as mentioned in the commit
message. It appears that we are resetting it at the appropriate times.

0002:

I felt a bit uneasy on this. It seems somewhat complex (and makes the
succeeding patches complex), has confusing names, and doesn't seem
self-contained. I think it'd be simpler to add a global boolean
(maybe VacuumCostActiveForceDisable or such) that forces
VacuumCostActive to be false and set VacuumCostActive using a setter
function that follows the boolean.


0003:

+     * Reload the configuration file if requested. This allows changes to
+     * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+     * being vacuumed or analyzed. Analyze should not reload configuration
+     * file if it is in an outer transaction, as GUC values shouldn't be
+     * allowed to refer to some uncommitted state (e.g. database objects
+     * created in this transaction).

I'm not sure GUC reload is or should be related to transactions. For
instance, work_mem can be changed by a reload during a transaction
unless it has been set in the current transaction. I don't think we
need to deliberately suppress changes in variables caused by reloads
during transactions only for analyze. If analyze doesn't like changes
to certain GUC variables, their values should be snapshotted before
starting the process.


0004:
-    double        at_vacuum_cost_delay;
-    int            at_vacuum_cost_limit;
+    double        at_table_option_vac_cost_delay;
+    int            at_table_option_vac_cost_limit;

We call that options "relopt(ion)". I don't think there's any reason
to use different names.


     dlist_head    av_runningWorkers;
     WorkerInfo    av_startingWorker;
     AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+    pg_atomic_uint32 av_nworkers_for_balance;

The name of the new member doesn't seem to follow the surrounding
convention. (However, I don't think the member is needed. See below.)

-        /*
-         * Remember the prevailing values of the vacuum cost GUCs.  We have to
-         * restore these at the bottom of the loop, else we'll compute wrong
-         * values in the next iteration of autovac_balance_cost().
-         */
-        stdVacuumCostDelay = VacuumCostDelay;
-        stdVacuumCostLimit = VacuumCostLimit;
+        av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+        av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;

I think this requires a comment.


+        /* There is at least 1 autovac worker (this worker). */
+        int            nworkers_for_balance = Max(pg_atomic_read_u32(
+                                &AutoVacuumShmem->av_nworkers_for_balance), 1);

I think it *must* be greater than 0. However, to begin with, I don't
think we need that variable to be shared. I don't believe it matters
if we count involved workers every time we calculate the delay.

+/*
+ * autovac_balance_cost
+ *        Recalculate the number of workers to consider, given table options and
+ *        the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode.

The function name doesn't seem to align with what it does. However, I
mentioned above that it might be unnecessary.



+AutoVacuumUpdateLimit(void)
 
If I'm not missing anything, this function does something quite
different from the original autovac_balance_cost().  The original
function distributes the total cost based on the GUC variables among
workers proportionally according to each worker's cost
parameters. However, this function distributes the total cost
equally.


+        int            vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+        autovacuum_vac_cost_limit : VacuumCostLimit;
...
+        int            balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
...
+        VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
     }

This seems to repeatedly divide VacuumCostLimit by
nworkers_for_balance. I'm not sure, but this function might only be
called after a reload. If that's the case, I don't think it's safe
coding, even if it works.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> > So, I've attached an alternate version of the patchset which takes the
> > approach of having one commit which only enables cost-based delay GUC
> > refresh for VACUUM and another commit which enables it for autovacuum
> > and makes the changes to balancing variables.
> >
> > I still think the commit which has workers updating their own
> > wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
> > else emulating our bad behavior and reading from wi_cost_delay without a
> > lock and on no one else deciding to ever write to wi_cost_delay (even
> > though it is in shared memory [this is the same as master]). It is only
> > safe because our process is the only one (right now) writing to
> > wi_cost_delay, so when we read from it without a lock, we know it isn't
> > being written to. And everyone else takes a lock when reading from
> > wi_cost_delay right now. So, it seems...not great.
> >
> > This approach also introduces a function that is only around for
> > one commit until the next commit obsoletes it, which seems a bit silly.
>
> (I'm not sure what this refers to, though..) I don't think it's silly
> if a later patch removes something that the preceding patches
> introduced, as long as that contributes to readability.  Ultimately,
> they will be merged together on committing.

I was under the impression that reviewers thought config reload and
worker balance changes should be committed in separate commits.

Either way, the ephemeral function is not my primary concern. I felt
more uncomfortable with increasing how often we update a double in
shared memory which is read without acquiring a lock.

> > Basically, I think it is probably better to just have one commit
> > enabling guc refresh for VACUUM and then another which correctly
> > implements what is needed for autovacuum to do the same.
> > Attached v9 does this.
> >
> > I've provided both complete versions of both approaches (v9 and v8).
>
> I took a look at v9 and have a few comments.
>
> 0001:
>
> I don't believe it is necessary, as mentioned in the commit
> message. It appears that we are resetting it at the appropriate times.

VacuumCostBalance must be zeroed out when we disable vacuum cost.
Previously, we did not reenable VacuumCostActive once it was disabled,
but now that we do, I think it is good practice to always zero out
VacuumCostBalance when we disable vacuum cost. I will squash this commit
into the one introducing VacuumCostInactive, though.

>
> 0002:
>
> I felt a bit uneasy on this. It seems somewhat complex (and makes the
> succeeding patches complex),

Even if we introduced a second global variable to indicate that failsafe
mode has been engaged, we would still require the additional checks
of VacuumCostInactive.

> has confusing names,

I would be happy to rename the values of the enum to make them less
confusing. Are you thinking "force" instead of "locked"?
maybe:
VACUUM_COST_FORCE_INACTIVE and
VACUUM_COST_INACTIVE
?

> and doesn't seem self-contained.

By changing the variable from VacuumCostActive to VacuumCostInactive, I
have kept all non-vacuum code from having to distinguish between it
being inactive due to failsafe mode or due to user settings.

> I think it'd be simpler to add a global boolean (maybe
> VacuumCostActiveForceDisable or such) that forces VacuumCostActive to
> be false and set VacuumCostActive using a setter function that follows
> the boolean.

I think maintaining an additional global variable is more brittle than
including the information in a single variable.

> 0003:
>
> +        * Reload the configuration file if requested. This allows changes to
> +        * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
> +        * being vacuumed or analyzed. Analyze should not reload configuration
> +        * file if it is in an outer transaction, as GUC values shouldn't be
> +        * allowed to refer to some uncommitted state (e.g. database objects
> +        * created in this transaction).
>
> I'm not sure GUC reload is or should be related to transactions. For
> instance, work_mem can be changed by a reload during a transaction
> unless it has been set in the current transaction. I don't think we
> need to deliberately suppress changes in variables caused by reloads
> during transactions only for analyze. If analyze doesn't like changes
> to certain GUC variables, their values should be snapshotted before
> starting the process.

Currently, we only reload the config file in top-level statements. We
don't reload the configuration file from within a nested transaction
command. BEGIN starts a transaction but not a transaction command. So
BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
to just forbid reloading when it is not a top-level transaction command.
I have updated the comment to reflect this.

> 0004:
> -       double          at_vacuum_cost_delay;
> -       int                     at_vacuum_cost_limit;
> +       double          at_table_option_vac_cost_delay;
> +       int                     at_table_option_vac_cost_limit;
>
> We call that options "relopt(ion)". I don't think there's any reason
> to use different names.

I've updated the names.

>         dlist_head      av_runningWorkers;
>         WorkerInfo      av_startingWorker;
>         AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
> +       pg_atomic_uint32 av_nworkers_for_balance;
>
> The name of the new member doesn't seem to follow the surrounding
> convention. (However, I don't think the member is needed. See below.)

I've updated the name to fit the convention better.

> -               /*
> -                * Remember the prevailing values of the vacuum cost GUCs.  We have to
> -                * restore these at the bottom of the loop, else we'll compute wrong
> -                * values in the next iteration of autovac_balance_cost().
> -                */
> -               stdVacuumCostDelay = VacuumCostDelay;
> -               stdVacuumCostLimit = VacuumCostLimit;
> +               av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
> +               av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;
>
> I think this requires a comment.

I've added one.

>
> +               /* There is at least 1 autovac worker (this worker). */
> +               int                     nworkers_for_balance = Max(pg_atomic_read_u32(
> +                                                               &AutoVacuumShmem->av_nworkers_for_balance), 1);
>
> I think it *must* be greater than 0. However, to begin with, I don't
> think we need that variable to be shared. I don't believe it matters
> if we count involved workers every time we calculate the delay.

We are not calculating the delay but the cost limit. The cost limit must
be balanced across all of the workers currently actively vacuuming
tables without cost-related table options.

There shouldn't be a way for this to be zero, since this worker calls
autovac_balance_cost() before it starts vacuuming the table. I wanted to
rule out any possibility of a divide by 0 issue. I have changed it to an
assert instead.

> +/*
> + * autovac_balance_cost
> + *             Recalculate the number of workers to consider, given table options and
> + *             the current number of active workers.
> + *
> + * Caller must hold the AutovacuumLock in at least shared mode.
>
> The function name doesn't seem to align with what it does. However, I
> mentioned above that it might be unnecessary.

This is the same name as the function had previously. However, I think
it does make sense to rename it. The cost limit must be balanced across
the workers. This function calculated how many workers the cost limit
should be balanced across. I renamed it to
autovac_recalculate_workers_for_balance()

> +AutoVacuumUpdateLimit(void)
>
> If I'm not missing anything, this function does something quite
> different from the original autovac_balance_cost().  The original
> function distributes the total cost based on the GUC variables among
> workers proportionally according to each worker's cost
> parameters. However, this function distributes the total cost
> equally.

Yes, as I mentioned in the commit message, because all the workers now
have no reason to have different cost parameters (due to reloading the
config file on almost every page), there is no reason to use ratios.
Workers vacuuming a table with no cost-related table options simply need
to divide the limit equally amongst themselves because they all will
have the same limit and delay values.

>
> +               int                     vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
> +               autovacuum_vac_cost_limit : VacuumCostLimit;
> ...
> +               int                     balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
> ...
> +               VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
>         }
>
> This seems to repeatedly divide VacuumCostLimit by
> nworkers_for_balance. I'm not sure, but this function might only be
> called after a reload. If that's the case, I don't think it's safe
> coding, even if it works.

Good point about repeatedly dividing VacuumCostLimit by
nworkers_for_balance. I've added a variable to keep track of the base
cost limit and separated the functionality of updating the limit into
two parts -- one AutoVacuumUpdateLimit() which is only meant to be
called after reload and references VacuumCostLimit to set the
av_base_cost_limit and another, AutoVacuumBalanceLimit(), which only
overrides VacuumCostLimit but uses av_base_cost_limit.

I've noted in the comments that AutoVacuumBalanceLimit() should be
called to adjust to a potential change in nworkers_for_balance
(currently every time after we sleep in vacuum_delay_point()) and
AutoVacuumUpdateLimit() should only be called once after a config
reload, as it references VacuumCostLimit.
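
Roughly, the split looks like this (a simplified sketch; the real functions
also deal with the per-table options):

    static int  av_base_cost_limit = 0;

    /* Only call after a config file reload: snapshot the GUC-derived limit. */
    void
    AutoVacuumUpdateLimit(void)
    {
        av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
            autovacuum_vac_cost_limit : VacuumCostLimit;
    }

    /* Safe to call repeatedly, e.g. after each sleep in vacuum_delay_point(). */
    void
    AutoVacuumBalanceLimit(void)
    {
        int     nworkers = (int) pg_atomic_read_u32(
                                &AutoVacuumShmem->av_nworkers_for_balance);

        Assert(nworkers > 0);       /* this worker is counted */
        VacuumCostLimit = Max(av_base_cost_limit / nworkers, 1);
    }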

I will note that this problem also exists in master, as
autovac_balance_cost references VacuumCostLimit in order to set worker
cost limits and then AutoVacuumUpdateDelay() overrides VacuumCostLimit
with the value calculated in autovac_balance_cost() from
VacuumCostLimit.

v10 attached with mentioned updates.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Kyotaro Horiguchi
Date:
At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in 
> On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> > > So, I've attached an alternate version of the patchset which takes the
> > > approach of having one commit which only enables cost-based delay GUC
> > > refresh for VACUUM and another commit which enables it for autovacuum
> > > and makes the changes to balancing variables.
> > >
> > > I still think the commit which has workers updating their own
> > > wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
> > > else emulating our bad behavior and reading from wi_cost_delay without a
> > > lock and on no one else deciding to ever write to wi_cost_delay (even
> > > though it is in shared memory [this is the same as master]). It is only
> > > safe because our process is the only one (right now) writing to
> > > wi_cost_delay, so when we read from it without a lock, we know it isn't
> > > being written to. And everyone else takes a lock when reading from
> > > wi_cost_delay right now. So, it seems...not great.
> > >
> > > This approach also introduces a function that is only around for
> > > one commit until the next commit obsoletes it, which seems a bit silly.
> >
> > (I'm not sure what this refers to, though..) I don't think it's silly
> > if a later patch removes something that the preceding patches
> > introduced, as long as that contributes to readability.  Ultimately,
> > they will be merged together on committing.
> 
> I was under the impression that reviewers thought config reload and
> worker balance changes should be committed in separate commits.
> 
> Either way, the ephemeral function is not my primary concern. I felt
> more uncomfortable with increasing how often we update a double in
> shared memory which is read without acquiring a lock.
> 
> > > Basically, I think it is probably better to just have one commit
> > > enabling guc refresh for VACUUM and then another which correctly
> > > implements what is needed for autovacuum to do the same.
> > > Attached v9 does this.
> > >
> > > I've provided both complete versions of both approaches (v9 and v8).
> >
> > I took a look at v9 and have a few comments.
> >
> > 0001:
> >
> > I don't believe it is necessary, as mentioned in the commit
> > message. It appears that we are resetting it at the appropriate times.
> 
> VacuumCostBalance must be zeroed out when we disable vacuum cost.
> Previously, we did not reenable VacuumCostActive once it was disabled,
> but now that we do, I think it is good practice to always zero out
> VacuumCostBalance when we disable vacuum cost. I will squash this commit
> into the one introducing VacuumCostInactive, though.

> >
> > 0002:
> >
> > I felt a bit uneasy on this. It seems somewhat complex (and makes the
> > succeeding patches complex),
> 
> Even if we introduced a second global variable to indicate that failsafe
> mode has been engaged, we would still require the additional checks
> of VacuumCostInactive.
>
> > has confusing names,
> 
> I would be happy to rename the values of the enum to make them less
> confusing. Are you thinking "force" instead of "locked"?
> maybe:
> VACUUM_COST_FORCE_INACTIVE and
> VACUUM_COST_INACTIVE
> ?
> 
> > and doesn't seem self-contained.
> 
> By changing the variable from VacuumCostActive to VacuumCostInactive, I
> have kept all non-vacuum code from having to distinguish between it
> being inactive due to failsafe mode or due to user settings.

My concern is that VacuumCostActive is logic-inverted and turned into
a ternary variable in a subtle way. The expression
"!VacuumCostInactive" is quite confusing. (I sometimes feel the same
way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
it with another macro like "lsn != InvalidXLogRecPtr"). Additionally,
the constraint in this patch will be implemented as open code.  So I
wanted to suggest something like the attached. The main idea is to use
a wrapper function to enforce the restriction, and by doing so, we
eliminate the need to make the variable into a ternary without a good
reason.
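
Roughly like this (the function names are only for illustration; the
attached does it in a bit more detail):

    static bool VacuumCostActiveForceDisable = false;

    void
    SetVacuumCostActive(bool active)
    {
        /* once force-disabled (failsafe), it stays off for this table */
        VacuumCostActive = active && !VacuumCostActiveForceDisable;
    }

    void
    ForceDisableVacuumCostActive(void)
    {
        VacuumCostActiveForceDisable = true;
        VacuumCostActive = false;
    }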

> > I think it'd be simpler to add a global boolean (maybe
> > VacuumCostActiveForceDisable or such) that forces VacuumCostActive to
> > be false and set VacuumCostActive using a setter function that follows
> > the boolean.
> 
> I think maintaining an additional global variable is more brittle than
> including the information in a single variable.
> 
> > 0003:
> >
> > +        * Reload the configuration file if requested. This allows changes to
> > +        * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
> > +        * being vacuumed or analyzed. Analyze should not reload configuration
> > +        * file if it is in an outer transaction, as GUC values shouldn't be
> > +        * allowed to refer to some uncommitted state (e.g. database objects
> > +        * created in this transaction).
> >
> > I'm not sure GUC reload is or should be related to transactions. For
> > instance, work_mem can be changed by a reload during a transaction
> > unless it has been set in the current transaction. I don't think we
> > need to deliberately suppress changes in variables caused by reloads
> > during transactions only for analyze. If analyze doesn't like changes
> > to certain GUC variables, their values should be snapshotted before
> > starting the process.
> 
> Currently, we only reload the config file in top-level statements. We
> don't reload the configuration file from within a nested transaction
> command. BEGIN starts a transaction but not a transaction command. So
> BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
> to just forbid reloading when it is not a top-level transaction command.
> I have updated the comment to reflect this.

I feel it's a bit fragile. We may not be able to manage the reload
timing perfectly. I think we might accidentally add a reload
timing. In that case, the assumption could break. In most cases, I
think we use snapshotting in various ways to avoid unintended variable
changes. (And I believe the analyze code also does that.)

> > +               /* There is at least 1 autovac worker (this worker). */
> > +               int                     nworkers_for_balance = Max(pg_atomic_read_u32(
> > +                                                               &AutoVacuumShmem->av_nworkers_for_balance), 1);
> >
> > I think it *must* be greater than 0. However, to begin with, I don't
> > think we need that variable to be shared. I don't believe it matters
> > if we count involved workers every time we calculate the delay.
> 
> We are not calculating the delay but the cost limit. The cost limit must

Ah, right, it's limit, but my main point still stands.

> be balanced across all of the workers currently actively vacuuming
> tables without cost-related table options.

The purpose of the old autovac_balance_cost() is to distribute the
cost among all involved tables, proportionally based on each worker's
cost specification.  Adjusting the limit just for tables affected by
reloads disrupts the cost balance.

> > If I'm not missing anything, this function does something quite
> > different from the original autovac_balance_cost().  The original
> > function distributes the total cost based on the GUC variables among
> > workers proportionally according to each worker's cost
> > parameters. However, this function distributes the total cost
> > equally.
> 
> Yes, as I mentioned in the commit message, because all the workers now
> have no reason to have different cost parameters (due to reloading the
> config file on almost every page), there is no reason to use ratios.
> Workers vacuuming a table with no cost-related table options simply need
> to divide the limit equally amongst themselves because they all will
> have the same limit and delay values.

I'm not sure about the assumption in the commit message. For instance,
if the total cost limit drops significantly, it's possible that the
workers left out of this calculation might end up using all the
reduced cost. Wouldn't this imply that all workers should recompute
their individual limits?

> >
> > +               int                     vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
> > +               autovacuum_vac_cost_limit : VacuumCostLimit;
> > ...
> > +               int                     balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
> > ...
> > +               VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
> >         }
> >
> > This seems to repeatedly divide VacuumCostLimit by
> > nworkers_for_balance. I'm not sure, but this function might only be
> > called after a reload. If that's the case, I don't think it's safe
> > coding, even if it works.
> 
> Good point about repeatedly dividing VacuumCostLimit by
> nworkers_for_balance. I've added a variable to keep track of the base
> cost limit and separated the functionality of updating the limit into
> two parts -- one AutoVacuumUpdateLimit() which is only meant to be
> called after reload and references VacuumCostLimit to set the
> av_base_cost_limit and another, AutoVacuumBalanceLimit(), which only
> overrides VacuumCostLimit but uses av_base_cost_limit.

Sorry, but will check this later.

> I've noted in the comments that AutoVacuumBalanceLimit() should be
> called to adjust to a potential change in nworkers_for_balance
> (currently every time after we sleep in vacuum_delay_point()) and
> AutoVacuumUpdateLimit() should only be called once after a config
> reload, as it references VacuumCostLimit.
> 
> I will note that this problem also exists in master, as
> autovac_balance_cost references VacuumCostLimit in order to set worker
> cost limits and then AutoVacuumUpdateDelay() overrides VacuumCostLimit
> with the value calculated in autovac_balance_cost() from
> VacuumCostLimit.
> 
> v10 attached with mentioned updates.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

Re: Should vacuum process config file reload more often

From
Kyotaro Horiguchi
Date:
At Wed, 29 Mar 2023 12:09:08 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> timing perfectly. I think we might accidentally add a reload
> timing. In that case, the assumption could break. In most cases, I
> think we use snapshotting in various ways to avoid unintended variable
> changes. (And I believe the analyze code also does that.)

Okay, I was missing the following code.

autovacuum.c:2893
        /*
         * If any of the cost delay parameters has been set individually for
         * this table, disable the balancing algorithm.
         */
        tab->at_dobalance =
            !(avopts && (avopts->vacuum_cost_limit > 0 ||
                         avopts->vacuum_cost_delay > 0));

So, sorry for the noise. I'll review it with this taken into consideration.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Should vacuum process config file reload more often

From
Kyotaro Horiguchi
Date:
At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> autovacuum.c:2893
>         /*
>          * If any of the cost delay parameters has been set individually for
>          * this table, disable the balancing algorithm.
>          */
>         tab->at_dobalance =
>             !(avopts && (avopts->vacuum_cost_limit > 0 ||
>                          avopts->vacuum_cost_delay > 0));
> 
> So, sorry for the noise. I'll review it taking this into consideration.

Then I found that the code is quite confusing as it is.

For the tables that don't have cost_delay and cost_limit specified
individually, at_vacuum_cost_limit and _delay store the system-global
values determined by GUCs. wi_cost_delay, _limit and _limit_base store
the same values as them. As a result, I concluded that
autovac_balance_cost() does exactly what Melanie's patch does, except
that nworkers_for_balance is not stored in shared memory.

I discovered that commit 1021bd6a89 brought in do_balance.

>    Since the mechanism is already complicated, just disable it for those
>    cases rather than trying to make it cope.  There are undesirable

After reading this, I get why the code is so complex.  It is a remnant
of when balancing was done with tables that had individually specified
cost parameters. And I found the following description in the doc.

https://www.postgresql.org/docs/devel/routine-vacuuming.html
> When multiple workers are running, the autovacuum cost delay
> parameters (see Section 20.4.4) are “balanced” among all the running
> workers, so that the total I/O impact on the system is the same
> regardless of the number of workers actually running. However, any
> workers processing tables whose per-table
> autovacuum_vacuum_cost_delay or autovacuum_vacuum_cost_limit storage
> parameters have been set are not considered in the balancing
> algorithm.

The initial balancing mechanism was brought in by e2a186b03c back in
2007. The balancing code has had that unnecessary complexity ever
since.

Since I can't think of a better idea than Melanie's proposal for
handling this code, I'll keep reviewing it with that approach in mind.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Should vacuum process config file reload more often

From
Kyotaro Horiguchi
Date:
At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> So, sorry for the noise. I'll review it taking this into consideration.

0003:

It's not this patch's fault, but I don't like the fact that the
variables used for GUCs, VacuumCostDelay and VacuumCostLimit, are
updated outside the GUC mechanism. Also I don't like the incorrect
sorting of variables, where some working variables are referred to as
GUC parameters or vice versa.

Although it's somewhat unrelated to the goal of this patch, I think we
should tidy up the code before proceeding. Shouldn't we separate
the actual parameters from the GUC base variables, and sort out all
the related variables? (Something like the attached, on top of your
patch.)


I have some comments on 0003 as-is.

+        tab->at_relopt_vac_cost_limit = avopts ?
+            avopts->vacuum_cost_limit : 0;
+        tab->at_relopt_vac_cost_delay = avopts ?
+            avopts->vacuum_cost_delay : -1;

The value is not used when do_balance is false, so I don't see a
specific reason for these variables to be different when avopts is
null.

+autovac_recalculate_workers_for_balance(void)
+{
+    dlist_iter    iter;
+    int            orig_nworkers_for_balance;
+    int            nworkers_for_balance = 0;
+
+    if (autovacuum_vac_cost_delay == 0 ||
+        (autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
         return;
+    if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+        return;
+

I'm not quite sure how these conditions relate to the need to count
workers that share the global I/O cost. (Though I still believe this
function might not be necessary.)

+    if (av_relopt_cost_limit > 0)
+        VacuumCostLimit = av_relopt_cost_limit;
+    else
+    {
+        av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+            autovacuum_vac_cost_limit : VacuumCostLimit;
+
+        AutoVacuumBalanceLimit();

I think each worker should use MyWorkerInfo->wi_dobalance to identify
whether the worker needs to use balanced cost values.


+void
+AutoVacuumBalanceLimit(void)

I'm not sure this function needs to be a separate function.

(Sorry, timed out..)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
Thanks for the detailed review!

On Tue, Mar 28, 2023 at 11:09 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> > On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > >
> > > At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> > >
> > > 0002:
> > >
> > > I felt a bit uneasy on this. It seems somewhat complex (and makes the
> > > succeeding patches complex),
> >
> > Even if we introduced a second global variable to indicate that failsafe
> > mode has been engaged, we would still require the additional checks
> > of VacuumCostInactive.
> >
> > > has confusing names,
> >
> > I would be happy to rename the values of the enum to make them less
> > confusing. Are you thinking "force" instead of "locked"?
> > maybe:
> > VACUUM_COST_FORCE_INACTIVE and
> > VACUUM_COST_INACTIVE
> > ?
> >
> > > and doesn't seem like self-contained.
> >
> > By changing the variable from VacuumCostActive to VacuumCostInactive, I
> > have kept all non-vacuum code from having to distinguish between it
> > being inactive due to failsafe mode or due to user settings.
>
> My concern is that VacuumCostActive is logic-inverted and turned into
> a ternary variable in a subtle way. The expression
> "!VacuumCostInactive" is quite confusing. (I sometimes feel the same
> way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
> it with another macro like "lsn != InvalidXLogRecPtr"). Additionally,
> the constraint in this patch will be implemented as open code.  So I
> wanted to suggest something like the attached. The main idea is to use
> a wrapper function to enforce the restriction, and by doing so, we
> eliminated the need to make the variable into a ternary without a good
> reason.

So, the rationale for making it a ternary is that the variable is the
combination of two pieces of information which has only 3 valid
states:
failsafe inactive + cost active = cost active
failsafe inactive + cost inactive = cost inactive
failsafe active + cost inactive = cost inactive and locked
the fourth is invalid
failsafe active + cost active = invalid
That is harder to enforce with two variables.
Also, the two pieces of information are not meaningful individually.
So, I thought it made sense to make a single variable.

Your suggested patch introduces an additional variable which shadows
LVRelState->failsafe_active but doesn't actually get set/reset at all of
the correct places. If we did introduce a second global variable, I
don't think we should also keep LVRelState->failsafe_active, as keeping
them in sync will be difficult.

As for the double negative (!VacuumCostInactive), I agree that it is not
ideal, however, if we use a ternary and keep VacuumCostActive, there is
no way for non-vacuum code to treat it as a boolean.
With the ternary VacuumCostInactive, only vacuum code has to know about
the distinction between inactive+failsafe active and inactive+failsafe
inactive.

As for the setter function, I think that having a function to set
VacuumCostActive based on failsafe_active is actually doing more harm
than good. Only vacuum code has to know about the distinction as it is,
so we aren't really saving any trouble (there would really only be two
callers of the suggested function). And, since the function hides
whether or not VacuumCostActive was actually set to the passed-in value,
we can't easily do other necessary maintenance -- like zero out
VacuumCostBalance if we disabled vacuum cost.

> > > 0003:
> > >
> > > +        * Reload the configuration file if requested. This allows changes to
> > > +        * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
> > > +        * being vacuumed or analyzed. Analyze should not reload configuration
> > > +        * file if it is in an outer transaction, as GUC values shouldn't be
> > > +        * allowed to refer to some uncommitted state (e.g. database objects
> > > +        * created in this transaction).
> > >
> > > I'm not sure GUC reload is or should be related to transactions. For
> > > instance, work_mem can be changed by a reload during a transaction
> > > unless it has been set in the current transaction. I don't think we
> > > need to deliberately suppress changes in variables caused by reloads
> > > during transactions only for analyze. If analyze doesn't like changes
> > > to certain GUC variables, their values should be snapshotted before
> > > starting the process.
> >
> > Currently, we only reload the config file in top-level statements. We
> > don't reload the configuration file from within a nested transaction
> > command. BEGIN starts a transaction but not a transaction command. So
> > BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
> > to just forbid reloading when it is not a top-level transaction command.
> > I have updated the comment to reflect this.
>
> I feel it's a bit fragile. We may not be able to manage the reload
> timing perfectly. I think we might accidentally add a reload
> timing. In that case, the assumption could break. In most cases, I
> think we use snapshotting in various ways to avoid unintended variable
> changes. (And I believe the analyze code also does that.)

I'm not sure I fully understand the problem you are thinking of. What do
you mean about managing the reload timing? Are you suggesting there is a
problem with excluding analyze in an outer transaction from doing the
reload or with doing the reload during vacuum and analyze when they are
top-level statements?

And, by snapshotting do you mean how vacuum_rel() and do_analyze_rel() do
NewGUCNestLevel() so that they can then do AtEOXact_GUC() and rollback
guc changes done during that operation?
How are you envisioning that being used here?

On Wed, Mar 29, 2023 at 2:00 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > autovacuum.c:2893
> >               /*
> >                * If any of the cost delay parameters has been set individually for
> >                * this table, disable the balancing algorithm.
> >                */
> >               tab->at_dobalance =
> >                       !(avopts && (avopts->vacuum_cost_limit > 0 ||
> >                                                avopts->vacuum_cost_delay > 0));
> >
> > So, sorry for the noise. I'll review it taking this into consideration.
>
> Then I found that the code is quite confusing as it is.
>
> For the tables that don't have cost_delay and cost_limit specified
> individually, at_vacuum_cost_limit and _delay store the system-global
> values determined by GUCs. wi_cost_delay, _limit and _limit_base store
> the same values as them. As a result, I concluded that
> autovac_balance_cost() does exactly what Melanie's patch does, except
> that nworkers_for_balance is not stored in shared memory.
>
> I discovered that commit 1021bd6a89 brought in do_balance.
>
> >    Since the mechanism is already complicated, just disable it for those
> >    cases rather than trying to make it cope.  There are undesirable
>
> After reading this, I get why the code is so complex.  It is a remnant
> of when balancing was done with tables that had individually specified
> cost parameters. And I found the following description in the doc.
>
> https://www.postgresql.org/docs/devel/routine-vacuuming.html
> > When multiple workers are running, the autovacuum cost delay
> > parameters (see Section 20.4.4) are “balanced” among all the running
> > workers, so that the total I/O impact on the system is the same
> > regardless of the number of workers actually running. However, any
> > workers processing tables whose per-table
> > autovacuum_vacuum_cost_delay or autovacuum_vacuum_cost_limit storage
> > parameters have been set are not considered in the balancing
> > algorithm.
>
> The initial balancing mechanism was brought in by e2a186b03c back in
> 2007. The balancing code has had that unnecessary complexity ever
> since.
>
> Since I can't think of a better idea than Melanie's proposal for
> handling this code, I'll keep reviewing it with that approach in mind.

Thanks for doing this archaeology. I didn't know the history of dobalance
and hadn't looked into 1021bd6a89.
I was a bit confused by why dobalance was false even if only table
option cost delay is set and not table option cost limit.

I think we can retain this behavior for now, but it may be worth
re-examining in the future.

On Wed, Mar 29, 2023 at 4:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > So, sorry for the noise. I'll review it taking this into consideration.
>
> 0003:
>
> It's not this patch's fault, but I don't like the fact that the
> variables used for GUCs, VacuumCostDelay and VacuumCostLimit, are
> updated outside the GUC mechanism. Also I don't like the incorrect
> sorting of variables, where some working variables are referred to as
> GUC parameters or vice versa.
>
> Although it's somewhat unrelated to the goal of this patch, I think we
> should tidy up the code before proceeding. Shouldn't we separate
> the actual parameters from the GUC base variables, and sort out all
> the related variables? (Something like the attached, on top of your
> patch.)

So, I agree we should separate the parameters used in the code from the
GUC variables -- since there are multiple users with different needs
(autovac workers, parallel vac workers, and vacuum). However, I was
hesitant to tackle that here.

I'm not sure how these changes will impact extensions that rely on
these vacuum parameters and their direct relationship to the guc values.

In your patch, you didn't update the parameter with the guc value of
vacuum_cost_limit and vacuum_cost_delay, but were we to do so, we would
need to make sure it was updated every time after a config reload. This
isn't hard to do in the current code, but I'm not sure how we can ensure
that future callers of ProcessConfigFile() in vacuum code always update
these values afterward. Perhaps we could add some after_reload hook,
though that does seem like a larger project.

> I have some comments on 0003 as-is.
>
> +               tab->at_relopt_vac_cost_limit = avopts ?
> +                       avopts->vacuum_cost_limit : 0;
> +               tab->at_relopt_vac_cost_delay = avopts ?
> +                       avopts->vacuum_cost_delay : -1;
>
> The value is not used when do_balance is false, so I don't see a
> specific reason for these variables to be different when avopts is
> null.

Actually we need to set these to 0 and -1, because we set
av_relopt_cost_limit and av_relopt_cost_delay with them and those values
are checked regardless of wi_dobalance.

We need to do this because we want to use the correct value to override
VacuumCostLimit and VacuumCostDelay. wi_dobalance may be false because
we have a table option cost delay but we have no table option cost
limit. When we override VacuumCostDelay, we want to use the table option
value but when we override VacuumCostLimit, we want to use the regular
value. We need these initialized to values that will allow us to do
that.
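
To spell out the asymmetry with a minimal sketch (invented helper names, not
the patch code): the delay and the limit are overridden independently, so each
needs its own "not set for this table" sentinel, -1 for the delay and 0 for
the limit.

/* -1 means "no per-table delay"; 0 means "no per-table limit" */
static double
effective_cost_delay(double relopt_delay, double av_delay, double guc_delay)
{
    if (relopt_delay >= 0)
        return relopt_delay;
    return (av_delay >= 0) ? av_delay : guc_delay;
}

static int
effective_cost_limit(int relopt_limit, int av_limit, int guc_limit)
{
    if (relopt_limit > 0)
        return relopt_limit;
    return (av_limit > 0) ? av_limit : guc_limit;
}

For a table that sets only a per-table delay, the first function returns the
relopt value while the second falls through to the autovacuum/GUC value, which
is exactly the case described above.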

> +autovac_recalculate_workers_for_balance(void)
> +{
> +       dlist_iter      iter;
> +       int                     orig_nworkers_for_balance;
> +       int                     nworkers_for_balance = 0;
> +
> +       if (autovacuum_vac_cost_delay == 0 ||
> +               (autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
>                 return;
> +       if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
> +               return;
> +
>
> I'm not quite sure how these conditions relate to the need to count
> workers that share the global I/O cost.

Ah, this is a good point: we should still keep this number up-to-date
even if the costs are disabled at the time we check it, in case
cost-based delays are re-enabled later before we recalculate this
number.  I had this code originally because autovac_balance_cost() would
exit early if cost-based delays were disabled -- but this only worked
because they couldn't be re-enabled while vacuuming a table and
autovac_balance_cost() was always called in between vacuuming tables.

I've removed these lines.

And perhaps there is an argument for calling
autovac_recalculate_workers_for_balance() in vacuum_delay_point() after
reloading the config file...
I have not done so in the attached version.

> (Though I still believe this function might not be necessary.)

I don't see how we can do without this function. We need an up-to-date
count of the number of autovacuum workers vacuuming tables which do not
have vacuum cost-related table options.


> +       if (av_relopt_cost_limit > 0)
> +               VacuumCostLimit = av_relopt_cost_limit;
> +       else
> +       {
> +               av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
> +                       autovacuum_vac_cost_limit : VacuumCostLimit;
> +
> +               AutoVacuumBalanceLimit();
>
> I think each worker should use MyWorkerInfo->wi_dobalance to identify
> whether the worker needs to use balanced cost values.

Ah, there is a bug here. I have fixed it by making wi_dobalance an
atomic flag so that we can check it before calling
AutoVacuumBalanceLimit() (without taking a lock).

I don't see other (non-test code) callers using atomic flags, so I can't
tell if we need to loop to ensure that pg_atomic_test_set_flag() returns
true.

> +void
> +AutoVacuumBalanceLimit(void)
>
> I'm not sure this function needs to be a separate function.

We need to call it more often than we can call AutoVacuumUpdateLimit(),
so the logic needs to be separate. Are you suggesting we inline the
logic in the two places it is needed?

v11 attached with updates mentioned above.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
Hi,

Thank you for updating the patches.

On Thu, Mar 30, 2023 at 5:01 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Thanks for the detailed review!
>
> On Tue, Mar 28, 2023 at 11:09 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> > > On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > > >
> > > > At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> > > >
> > > > 0002:
> > > >
> > > > I felt a bit uneasy on this. It seems somewhat complex (and makes the
> > > > succeeding patches complex),
> > >
> > > Even if we introduced a second global variable to indicate that failsafe
> > > mode has been engaged, we would still require the additional checks
> > > of VacuumCostInactive.
> > >
> > > > has confusing names,
> > >
> > > I would be happy to rename the values of the enum to make them less
> > > confusing. Are you thinking "force" instead of "locked"?
> > > maybe:
> > > VACUUM_COST_FORCE_INACTIVE and
> > > VACUUM_COST_INACTIVE
> > > ?
> > >
> > > > and doesn't seem like self-contained.
> > >
> > > By changing the variable from VacuumCostActive to VacuumCostInactive, I
> > > have kept all non-vacuum code from having to distinguish between it
> > > being inactive due to failsafe mode or due to user settings.
> >
> > My concern is that VacuumCostActive is logic-inverted and turned into
> > a ternary variable in a subtle way. The expression
> > "!VacuumCostInactive" is quite confusing. (I sometimes feel the same
> > way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
> > it with another macro like "lsn != InvalidXLogRecPtr"). Additionally,
> > the constraint in this patch will be implemented as open code.  So I
> > wanted to suggest something like the attached. The main idea is to use
> > a wrapper function to enforce the restriction, and by doing so, we
> > eliminated the need to make the variable into a ternary without a good
> > reason.
>
> So, the rationale for making it a ternary is that the variable is the
> combination of two pieces of information which has only 3 valid
> states:
> failsafe inactive + cost active = cost active
> failsafe inactive + cost inactive = cost inactive
> failsafe active + cost inactive = cost inactive and locked
> the fourth is invalid
> failsafe active + cost active = invalid
> That is harder to enforce with two variables.
> Also, the two pieces of information are not meaningful individually.
> So, I thought it made sense to make a single variable.
>
> Your suggested patch introduces an additional variable which shadows
> LVRelState->failsafe_active but doesn't actually get set/reset at all of
> the correct places. If we did introduce a second global variable, I
> don't think we should also keep LVRelState->failsafe_active, as keeping
> them in sync will be difficult.
>
> As for the double negative (!VacuumCostInactive), I agree that it is not
> ideal, however, if we use a ternary and keep VacuumCostActive, there is
> no way for non-vacuum code to treat it as a boolean.
> With the ternary VacuumCostInactive, only vacuum code has to know about
> the distinction between inactive+failsafe active and inactive+failsafe
> inactive.

As another idea, why don't we use macros for that? For example,
suppose VacuumCostStatus is like:

typedef enum VacuumCostStatus
{
    VACUUM_COST_INACTIVE_LOCKED = 0,
    VACUUM_COST_INACTIVE,
    VACUUM_COST_ACTIVE,
} VacuumCostStatus;
VacuumCostStatus VacuumCost;

non-vacuum code can use the following macros:

#define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) //
or we can use !VacuumCostActive() instead.

Or is there any reason why we need to keep VacuumCostActive and treat
it as a boolean?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> As another idea, why don't we use macros for that? For example,
> suppose VacuumCostStatus is like:
>
> typedef enum VacuumCostStatus
> {
>    VACUUM_COST_INACTIVE_LOCKED = 0,
>    VACUUM_COST_INACTIVE,
>    VACUUM_COST_ACTIVE,
> } VacuumCostStatus;
> VacuumCostStatus VacuumCost;
>
> non-vacuum code can use the following macros:
>
> #define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
> #define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) //
> or we can use !VacuumCostActive() instead.

I'm in favor of something along these lines.  A variable with a name that
implies a boolean value (active/inactive) but actually contains a tri-value is
easily misunderstood.  A VacuumCostState tri-value variable (or a better name)
with a set of convenient macros for extracting the boolean active/inactive that
most of the code needs to be concerned with would make for more readable code I
think.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Mar 30, 2023 at 3:26 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > As another idea, why don't we use macros for that? For example,
> > suppose VacuumCostStatus is like:
> >
> > typedef enum VacuumCostStatus
> > {
> >    VACUUM_COST_INACTIVE_LOCKED = 0,
> >    VACUUM_COST_INACTIVE,
> >    VACUUM_COST_ACTIVE,
> > } VacuumCostStatus;
> > VacuumCostStatus VacuumCost;
> >
> > non-vacuum code can use the following macros:
> >
> > #define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
> > #define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) //
> > or we can use !VacuumCostActive() instead.
>
> I'm in favor of something along these lines.  A variable with a name that
> implies a boolean value (active/inactive) but actually contains a tri-value is
> easily misunderstood.  A VacuumCostState tri-value variable (or a better name)
> with a set of convenient macros for extracting the boolean active/inactive that
> most of the code needs to be concerned with would make for more readable code I
> think.

The macros are very error-prone. I was just implementing this idea and
mistakenly tried to set the macro instead of the variable in multiple
places. Avoiding this involves another set of macros, and, in the end, I
think the complexity is much worse. Given the reviewers' uniform dislike
of VacuumCostInactive, I favor going back to two variables
(VacuumCostActive + VacuumFailsafeActive) and moving
LVRelState->failsafe_active to the global VacuumFailsafeActive.

I will reimplement this in the next version.

On the subject of globals, the next version will implement
Horiguchi-san's proposal to separate GUC variables from the globals used
in the code (quoted below). It should hopefully reduce the complexity of
this patchset.

> Although it's somewhat unrelated to the goal of this patch, I think we
> should tidy up the code before proceeding. Shouldn't we separate
> the actual parameters from the GUC base variables, and sort out all
> the related variables? (Something like the attached, on top of your
> patch.)
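
To sketch the direction of that separation (the shapes below are assumptions
for illustration, not the actual patch): the GUCs stay pure configuration
variables, and the values consulted by vacuum code are re-derived from them
(or from the worker's WorkerInfo) by VacuumUpdateCosts() after every reload.

#include <stdbool.h>

int     vacuum_cost_limit = 200;    /* GUC */
double  vacuum_cost_delay = 0;      /* GUC */

int     VacuumCostLimit = 0;        /* working values used by vacuum code */
double  VacuumCostDelay = -1;
int     VacuumCostBalance = 0;
bool    VacuumCostActive = false;
bool    VacuumFailsafeActive = false;

void
VacuumUpdateCosts(void)
{
    /* an autovacuum worker would take these from MyWorkerInfo instead */
    VacuumCostLimit = vacuum_cost_limit;
    VacuumCostDelay = vacuum_cost_delay;

    /* failsafe mode or a zero delay disables cost accounting entirely */
    if (VacuumFailsafeActive || VacuumCostDelay <= 0)
    {
        VacuumCostActive = false;
        VacuumCostBalance = 0;
    }
    else
        VacuumCostActive = true;
}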

- Melanie



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Fri, Mar 31, 2023 at 10:31 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Mar 30, 2023 at 3:26 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > > On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > As another idea, why don't we use macros for that? For example,
> > > suppose VacuumCostStatus is like:
> > >
> > > typedef enum VacuumCostStatus
> > > {
> > >    VACUUM_COST_INACTIVE_LOCKED = 0,
> > >    VACUUM_COST_INACTIVE,
> > >    VACUUM_COST_ACTIVE,
> > > } VacuumCostStatus;
> > > VacuumCostStatus VacuumCost;
> > >
> > > non-vacuum code can use the following macros:
> > >
> > > #define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
> > > #define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) //
> > > or we can use !VacuumCostActive() instead.
> >
> > I'm in favor of something along these lines.  A variable with a name that
> > implies a boolean value (active/inactive) but actually contains a tri-value is
> > easily misunderstood.  A VacuumCostState tri-value variable (or a better name)
> > with a set of convenient macros for extracting the boolean active/inactive that
> > most of the code needs to be concerned with would make for more readable code I
> > think.
>
> The macros are very error-prone. I was just implementing this idea and
> mistakenly tried to set the macro instead of the variable in multiple
> places. Avoiding this involves another set of macros, and, in the end, I
> think the complexity is much worse. Given the reviewers' uniform dislike
> of VacuumCostInactive, I favor going back to two variables
> (VacuumCostActive + VacuumFailsafeActive) and moving
> LVRelState->failsafe_active to the global VacuumFailsafeActive.
>
> I will reimplement this in the next version.
>
> On the subject of globals, the next version will implement
> Horiguchi-san's proposal to separate GUC variables from the globals used
> in the code (quoted below). It should hopefully reduce the complexity of
> this patchset.
>
> > Although it's somewhat unrelated to the goal of this patch, I think we
> > should tidy up the code before proceeding. Shouldn't we separate
> > the actual parameters from the GUC base variables, and sort out all
> > the related variables? (Something like the attached, on top of your
> > patch.)

Attached is v12. It has a number of updates, including a commit to
separate VacuumCostLimit and VacuumCostDelay from the gucs
vacuum_cost_limit and vacuum_cost_delay, and a return to
VacuumCostActive.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Sat, Apr 1, 2023 at 4:09 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Mar 31, 2023 at 10:31 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Thu, Mar 30, 2023 at 3:26 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > >
> > > > On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > As another idea, why don't we use macros for that? For example,
> > > > suppose VacuumCostStatus is like:
> > > >
> > > > typedef enum VacuumCostStatus
> > > > {
> > > >    VACUUM_COST_INACTIVE_LOCKED = 0,
> > > >    VACUUM_COST_INACTIVE,
> > > >    VACUUM_COST_ACTIVE,
> > > > } VacuumCostStatus;
> > > > VacuumCostStatus VacuumCost;
> > > >
> > > > non-vacuum code can use the following macros:
> > > >
> > > > #define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
> > > > #define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) //
> > > > or we can use !VacuumCostActive() instead.
> > >
> > > I'm in favor of something along these lines.  A variable with a name that
> > > implies a boolean value (active/inactive) but actually contains a tri-value is
> > > easily misunderstood.  A VacuumCostState tri-value variable (or a better name)
> > > with a set of convenient macros for extracting the boolean active/inactive that
> > > most of the code needs to be concerned with would make for more readable code I
> > > think.
> >
> > The macros are very error-prone. I was just implementing this idea and
> > mistakenly tried to set the macro instead of the variable in multiple
> > places. Avoiding this involves another set of macros, and, in the end, I
> > think the complexity is much worse. Given the reviewers' uniform dislike
> > of VacuumCostInactive, I favor going back to two variables
> > (VacuumCostActive + VacuumFailsafeActive) and moving
> > LVRelState->failsafe_active to the global VacuumFailsafeActive.
> >
> > I will reimplement this in the next version.

Thank you for updating the patches. Here are comments for 0001, 0002,
and 0003 patches:

 0001:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
         Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
         Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
                    params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;

If we go with the idea of using VacuumCostActive +
VacuumFailsafeActive, we need to make sure that both are cleared at
the end of the vacuum per table. Since the patch clears it only here,
it remains true even after vacuum() if we trigger the failsafe mode
for the last table in the table list.

In addition to that,  to ensure that also in an error case, I think we
need to clear it also in PG_FINALLY() block in vacuum().

---
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;

+extern bool VacuumFailsafeActive;

Do we need PGDLLIMPORT for VacuumFailSafeActive?

0002:

@@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
         return offsetof(VacDeadItems, items) + sizeof(ItemPointerData) * max_items;
 }

+
 /*
  *     vac_tid_reaped() -- is a particular tid deletable?
  *

Unnecessary new line. There are some other unnecessary new lines in this patch.

---
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;

 extern bool VacuumFailsafeActive;
+extern int     VacuumCostLimit;
+extern double VacuumCostDelay;

and

@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Do we need PGDLLIMPORT too?

---
@@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
        }
 }

+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {

Isn't it better to define VacuumUpdateCosts() in vacuum.c rather than
autovacuum.c as this is now a common code for both vacuum and
autovacuum?

0003:

@@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
         {
                 ListCell   *cur;

-                VacuumUpdateCosts();
                 in_vacuum = true;
-                VacuumCostActive = (VacuumCostDelay > 0);
+                VacuumFailsafeActive = false;
+                VacuumUpdateCosts();

Hmm, if we initialize VacuumFailsafeActive here, should it be included
in 0001 patch?

---
+        if (VacuumCostDelay > 0)
+                VacuumCostActive = true;
+        else
+        {
+                VacuumCostActive = false;
+                VacuumCostBalance = 0;
+        }

I agree to update VacuumCostActive in VacuumUpdateCosts(). But if we
do that I think this change should be included in 0002 patch.

---
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                VacuumUpdateCosts();
+        }

Since analyze_in_outer_xact is false by default, we reload the config
file in vacuum_delay_point() by default. We need to note that
vacuum_delay_point() could be called via other paths, for example
gin_cleanup_pending_list() and ambulkdelete() called by
validate_index(). So it seems to me that we should do the opposite; we
have another global variable, say vacuum_can_reload_config, which is
false by default, and is set to true only when vacuum() allows it. In
vacuum_delay_point(), we reload the config file iff
(ConfigReloadPending && vacuum_can_reload_config).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Sun, Apr 2, 2023 at 10:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Thank you for updating the patches. Here are comments for 0001, 0002,
> and 0003 patches:

Thanks for the review!

v13 attached with requested updates.

>  0001:
>
> @@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
>          Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
>          Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
>                     params->truncate != VACOPTVALUE_AUTO);
> -        vacrel->failsafe_active = false;
> +        VacuumFailsafeActive = false;
>
> If we go with the idea of using VacuumCostActive +
> VacuumFailsafeActive, we need to make sure that both are cleared at
> the end of the vacuum per table. Since the patch clears it only here,
> it remains true even after vacuum() if we trigger the failsafe mode
> for the last table in the table list.
>
> In addition to that,  to ensure that also in an error case, I think we
> need to clear it also in PG_FINALLY() block in vacuum().

So, in 0001, I tried to keep it exactly the same as
LVRelState->failsafe_active except for it being a global. We don't
actually use VacuumFailsafeActive in this commit except in vacuumlazy.c,
which does its own management of the value (it resets it to false at the
top of heap_vacuum_rel()).

In the later commit which references VacuumFailsafeActive outside of
vacuumlazy.c, I had reset it in PG_FINALLY(). I hadn't reset it in the
relation list loop in vacuum(). Autovacuum calls vacuum() for each
relation. However, you are right that for VACUUM with a list of
relations for a table access method other than heap, once set to true,
if the table AM forgets to reset the value to false at the end of
vacuuming the relation, it would stay true.

I've set it to false now at the bottom of the loop through relations in
vacuum().

> ---
> @@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
>  extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
>  extern PGDLLIMPORT int VacuumCostBalanceLocal;
>
> +extern bool VacuumFailsafeActive;
>
> Do we need PGDLLIMPORT for VacuumFailSafeActive?

I didn't add one because I thought extensions and other code probably
shouldn't access this variable. I thought PGDLLIMPORT was only needed
for extensions built on Windows to access variables.

> 0002:
>
> @@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
>          return offsetof(VacDeadItems, items) +
> sizeof(ItemPointerData) * max_items;
>  }
>
> +
>  /*
>   *     vac_tid_reaped() -- is a particular tid deletable?
>   *
>
> Unnecessary new line. There are some other unnecessary new lines in this patch.

Thanks! I think I got them all.

> ---
> @@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
>  extern PGDLLIMPORT int VacuumCostBalanceLocal;
>
>  extern bool VacuumFailsafeActive;
> +extern int     VacuumCostLimit;
> +extern double VacuumCostDelay;
>
> and
>
> @@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
>  extern PGDLLIMPORT int VacuumCostPageHit;
>  extern PGDLLIMPORT int VacuumCostPageMiss;
>  extern PGDLLIMPORT int VacuumCostPageDirty;
> -extern PGDLLIMPORT int VacuumCostLimit;
> -extern PGDLLIMPORT double VacuumCostDelay;
>
> Do we need PGDLLIMPORT too?

I was on the fence about this. I annotated the new guc variables
vacuum_cost_delay and vacuum_cost_limit with PGDLLIMPORT, but I did not
annotate the variables used in vacuum code (VacuumCostLimit/Delay). I
think whether or not this is the right choice depends on two things:
whether or not my understanding of PGDLLIMPORT is correct and, if it is,
whether or not we want extensions to be able to access
VacuumCostLimit/Delay or if just access to the guc variables is
sufficient/desirable.

> ---
> @@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
>         }
>  }
>
> +
>  /*
> - * Update the cost-based delay parameters, so that multiple workers consume
> - * each a fraction of the total available I/O.
> + * Update vacuum cost-based delay-related parameters for autovacuum workers and
> + * backends executing VACUUM or ANALYZE using the value of relevant gucs and
> + * global state. This must be called during setup for vacuum and after every
> + * config reload to ensure up-to-date values.
>   */
>  void
> -AutoVacuumUpdateDelay(void)
> +VacuumUpdateCosts(void)
>
> Isn't it better to define VacuumUpdateCosts() in vacuum.c rather than
> autovacuum.c as this is now a common code for both vacuum and
> autovacuum?

We can't access members of WorkerInfoData from inside vacuum.c

> 0003:
>
> @@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
>          {
>                  ListCell   *cur;
>
> -                VacuumUpdateCosts();
>                  in_vacuum = true;
> -                VacuumCostActive = (VacuumCostDelay > 0);
> +                VacuumFailsafeActive = false;
> +                VacuumUpdateCosts();
>
> Hmm, if we initialize VacuumFailsafeActive here, should it be included
> in 0001 patch?

See comment above. This is the first patch where we use or reference it
outside of vacuumlazy.c

> ---
> +        if (VacuumCostDelay > 0)
> +                VacuumCostActive = true;
> +        else
> +        {
> +                VacuumCostActive = false;
> +                VacuumCostBalance = 0;
> +        }
>
> I agree to update VacuumCostActive in VacuumUpdateCosts(). But if we
> do that I think this change should be included in 0002 patch.

I'm a bit hesitant to do this because in 0002 VacuumCostActive cannot
change status while vacuuming a table or even between tables for VACUUM
when a list of relations is specified (except for being disabled by
failsafe mode). Adding it to VacuumUpdateCosts() in 0003 makes it clear
that it could change while vacuuming a table, so we must update it.

I previously had 0002 introduce AutoVacuumUpdateLimit(), which only
updated VacuumCostLimit with wi_cost_limit for autovacuum workers and
then called that in vacuum_delay_point() (instead of
AutoVacuumUpdateDelay() or VacuumUpdateCosts()). I abandoned that idea
in favor of the simplicity of having VacuumUpdateCosts() just update
those variables for everyone, since it could be reused in 0003.

Now, I'm thinking the previous method might be more clear?
Or is what I have okay?

> ---
> +        if (ConfigReloadPending && !analyze_in_outer_xact)
> +        {
> +                ConfigReloadPending = false;
> +                ProcessConfigFile(PGC_SIGHUP);
> +                VacuumUpdateCosts();
> +        }
>
> Since analyze_in_outer_xact is false by default, we reload the config
> file in vacuum_delay_point() by default. We need to note that
> vacuum_delay_point() could be called via other paths, for example
> gin_cleanup_pending_list() and ambulkdelete() called by
> validate_index(). So it seems to me that we should do the opposite; we
> have another global variable, say vacuum_can_reload_config, which is
> false by default, and is set to true only when vacuum() allows it. In
> vacuum_delay_point(), we reload the config file iff
> (ConfigReloadPending && vacuum_can_reload_config).

Wow, great point. Thanks for catching this. I've made the update you
suggested. I also set vacuum_can_reload_config to false in PG_FINALLY()
in vacuum().
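
For readers following along, the resulting check in vacuum_delay_point() looks
roughly like this (a sketch of the behaviour described above, not the final
committed hunk):

    /*
     * Reload the config file only when vacuum() has opted in.  vacuum() sets
     * vacuum_can_reload_config before its per-relation loop and clears it in
     * PG_FINALLY(), so other callers of vacuum_delay_point(), such as
     * gin_cleanup_pending_list() or ambulkdelete() called from
     * validate_index(), keep the existing reload-at-start-of-command behaviour.
     */
    if (ConfigReloadPending && vacuum_can_reload_config)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
        VacuumUpdateCosts();
    }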

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Tom Lane
Date:
Melanie Plageman <melanieplageman@gmail.com> writes:
> v13 attached with requested updates.

I'm afraid I'd not been paying any attention to this discussion,
but better late than never.  I'm okay with letting autovacuum
processes reload config files more often than now.  However,
I object to allowing ProcessConfigFile to be called from within
commands in a normal user backend.  The existing semantics are
that user backends respond to SIGHUP only at the start of processing
a user command, and I'm uncomfortable with suddenly deciding that
that can work differently if the command happens to be VACUUM.
It seems unprincipled and perhaps actively unsafe.

            regards, tom lane



Re: Should vacuum process config file reload more often

From
Andres Freund
Date:
Hi,

On 2023-04-03 14:43:14 -0400, Tom Lane wrote:
> Melanie Plageman <melanieplageman@gmail.com> writes:
> > v13 attached with requested updates.
> 
> I'm afraid I'd not been paying any attention to this discussion,
> but better late than never.  I'm okay with letting autovacuum
> processes reload config files more often than now.  However,
> I object to allowing ProcessConfigFile to be called from within
> commands in a normal user backend.  The existing semantics are
> that user backends respond to SIGHUP only at the start of processing
> a user command, and I'm uncomfortable with suddenly deciding that
> that can work differently if the command happens to be VACUUM.
> It seems unprincipled and perhaps actively unsafe.

I think it should be ok in commands like VACUUM that already internally start
their own transactions, and thus require to be run outside of a transaction
and at the toplevel. I share your concerns about allowing config reload in
arbitrary places. While we might want to go there, it would require a lot more
analysis.

Greetings,

Andres Freund



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Mon, Apr 3, 2023 at 3:08 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-04-03 14:43:14 -0400, Tom Lane wrote:
> > Melanie Plageman <melanieplageman@gmail.com> writes:
> > > v13 attached with requested updates.
> >
> > I'm afraid I'd not been paying any attention to this discussion,
> > but better late than never.  I'm okay with letting autovacuum
> > processes reload config files more often than now.  However,
> > I object to allowing ProcessConfigFile to be called from within
> > commands in a normal user backend.  The existing semantics are
> > that user backends respond to SIGHUP only at the start of processing
> > a user command, and I'm uncomfortable with suddenly deciding that
> > that can work differently if the command happens to be VACUUM.
> > It seems unprincipled and perhaps actively unsafe.
>
> I think it should be ok in commands like VACUUM that already internally start
> their own transactions, and thus require to be run outside of a transaction
> and at the toplevel. I share your concerns about allowing config reload in
> arbitrary places. While we might want to go there, it would require a lot more
> analysis.

As an alternative for your consideration, attached v14 set implements
the config file reload for autovacuum only (in 0003) and then enables it
for VACUUM and ANALYZE not in a nested transaction command (in 0004).

Previously I had the commits in the reverse order for ease of review (to
separate changes to worker limit balancing logic from config reload
code).

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Tue, Apr 4, 2023 at 1:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Sun, Apr 2, 2023 at 10:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Thank you for updating the patches. Here are comments for 0001, 0002,
> > and 0003 patches:
>
> Thanks for the review!
>
> v13 attached with requested updates.
>
> >  0001:
> >
> > @@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
> >          Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
> >          Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
> >                     params->truncate != VACOPTVALUE_AUTO);
> > -        vacrel->failsafe_active = false;
> > +        VacuumFailsafeActive = false;
> >
> > If we go with the idea of using VacuumCostActive +
> > VacuumFailsafeActive, we need to make sure that both are cleared at
> > the end of the vacuum per table. Since the patch clears it only here,
> > it remains true even after vacuum() if we trigger the failsafe mode
> > for the last table in the table list.
> >
> > In addition to that,  to ensure that also in an error case, I think we
> > need to clear it also in PG_FINALLY() block in vacuum().
>
> So, in 0001, I tried to keep it exactly the same as
> LVRelState->failsafe_active except for it being a global. We don't
> actually use VacuumFailsafeActive in this commit except in vacuumlazy.c,
> which does its own management of the value (it resets it to false at the
> top of heap_vacuum_rel()).
>
> In the later commit which references VacuumFailsafeActive outside of
> vacuumlazy.c, I had reset it in PG_FINALLY(). I hadn't reset it in the
> relation list loop in vacuum(). Autovacuum calls vacuum() for each
> relation. However, you are right that for VACUUM with a list of
> relations for a table access method other than heap, once set to true,
> if the table AM forgets to reset the value to false at the end of
> vacuuming the relation, it would stay true.
>
> I've set it to false now at the bottom of the loop through relations in
> vacuum().

Agreed. Probably we can merge 0001 into 0003 but I leave it to
committers. The 0001 patch mostly looks good to me except for one
point:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
         Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
         Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
                    params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;
         vacrel->consider_bypass_optimization = true;
         vacrel->do_index_vacuuming = true;

Looking at the 0003 patch, we set VacuumFailsafeActive to false per table:

+                        /*
+                         * Ensure VacuumFailsafeActive has been reset before vacuuming the
+                         * next relation relation.
+                         */
+                        VacuumFailsafeActive = false;

Given that we ensure it's reset before vacuuming the next table, do we
need to reset it in heap_vacuum_rel?

(there is a typo; s/relation relation/relation/)

>
> > ---
> > @@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
> >  extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
> >  extern PGDLLIMPORT int VacuumCostBalanceLocal;
> >
> > +extern bool VacuumFailsafeActive;
> >
> > Do we need PGDLLIMPORT for VacuumFailSafeActive?
>
> I didn't add one because I thought extensions and other code probably
> shouldn't access this variable. I thought PGDLLIMPORT was only needed
> for extensions built on windows to access variables.

Agreed.

>
> > 0002:
> >
> >          return offsetof(VacDeadItems, items) + sizeof(ItemPointerData) * max_items;
> > sizeof(ItemPointerData) * max_items;
> >  }
> >
> > +
> >  /*
> >   *     vac_tid_reaped() -- is a particular tid deletable?
> >   *
> >
> > Unnecessary new line. There are some other unnecessary new lines in this patch.
>
> Thanks! I think I got them all.
>
> > ---
> > @@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
> >  extern PGDLLIMPORT int VacuumCostBalanceLocal;
> >
> >  extern bool VacuumFailsafeActive;
> > +extern int     VacuumCostLimit;
> > +extern double VacuumCostDelay;
> >
> > and
> >
> > @@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
> >  extern PGDLLIMPORT int VacuumCostPageHit;
> >  extern PGDLLIMPORT int VacuumCostPageMiss;
> >  extern PGDLLIMPORT int VacuumCostPageDirty;
> > -extern PGDLLIMPORT int VacuumCostLimit;
> > -extern PGDLLIMPORT double VacuumCostDelay;
> >
> > Do we need PGDLLIMPORT too?
>
> I was on the fence about this. I annotated the new guc variables
> vacuum_cost_delay and vacuum_cost_limit with PGDLLIMPORT, but I did not
> annotate the variables used in vacuum code (VacuumCostLimit/Delay). I
> think whether or not this is the right choice depends on two things:
> whether or not my understanding of PGDLLIMPORT is correct and, if it is,
> whether or not we want extensions to be able to access
> VacuumCostLimit/Delay or if just access to the guc variables is
> sufficient/desirable.

I guess it would be better to keep both accessible for backward
compatibility. Extensions are able to access both GUC values and
values that are actually used for vacuum delays (as we used to use the
same variables).

>
> > ---
> > @@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
> >         }
> >  }
> >
> > +
> >  /*
> > - * Update the cost-based delay parameters, so that multiple workers consume
> > - * each a fraction of the total available I/O.
> > + * Update vacuum cost-based delay-related parameters for autovacuum workers and
> > + * backends executing VACUUM or ANALYZE using the value of relevant gucs and
> > + * global state. This must be called during setup for vacuum and after every
> > + * config reload to ensure up-to-date values.
> >   */
> >  void
> > -AutoVacuumUpdateDelay(void)
> > +VacuumUpdateCosts(void)
> >
> > Isn't it better to define VacuumUpdateCosts() in vacuum.c rather than
> > autovacuum.c as this is now a common code for both vacuum and
> > autovacuum?
>
> We can't access members of WorkerInfoData from inside vacuum.c

Oops, you're right.

>
> > 0003:
> >
> > @@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
> >          {
> >                  ListCell   *cur;
> >
> > -                VacuumUpdateCosts();
> >                  in_vacuum = true;
> > -                VacuumCostActive = (VacuumCostDelay > 0);
> > +                VacuumFailsafeActive = false;
> > +                VacuumUpdateCosts();
> >
> > Hmm, if we initialize VacuumFailsafeActive here, should it be included
> > in 0001 patch?
>
> See comment above. This is the first patch where we use or reference it
> outside of vacuumlazy.c
>
> > ---
> > +        if (VacuumCostDelay > 0)
> > +                VacuumCostActive = true;
> > +        else
> > +        {
> > +                VacuumCostActive = false;
> > +                VacuumCostBalance = 0;
> > +        }
> >
> > I agree to update VacuumCostActive in VacuumUpdateCosts(). But if we
> > do that I think this change should be included in 0002 patch.
>
> I'm a bit hesitant to do this because in 0002 VacuumCostActive cannot
> change status while vacuuming a table or even between tables for VACUUM
> when a list of relations is specified (except for being disabled by
> failsafe mode). Adding it to VacuumUpdateCosts() in 0003 makes it clear
> that it could change while vacuuming a table, so we must update it.
>

Agreed.

> I previously had 0002 introduce AutoVacuumUpdateLimit(), which only
> updated VacuumCostLimit with wi_cost_limit for autovacuum workers and
> then called that in vacuum_delay_point() (instead of
> AutoVacuumUpdateDelay() or VacuumUpdateCosts()). I abandoned that idea
> in favor of the simplicity of having VacuumUpdateCosts() just update
> those variables for everyone, since it could be reused in 0003.
>
> Now, I'm thinking the previous method might be more clear?
> Or is what I have okay?

I'm fine with the current one.

>
> > ---
> > +        if (ConfigReloadPending && !analyze_in_outer_xact)
> > +        {
> > +                ConfigReloadPending = false;
> > +                ProcessConfigFile(PGC_SIGHUP);
> > +                VacuumUpdateCosts();
> > +        }
> >
> > Since analyze_in_outer_xact is false by default, we reload the config
> > file in vacuum_delay_point() by default. We need to note that
> > vacuum_delay_point() could be called via other paths, for example
> > gin_cleanup_pending_list() and ambulkdelete() called by
> > validate_index(). So it seems to me that we should do the opposite; we
> > have another global variable, say vacuum_can_reload_config, which is
> > false by default, and is set to true only when vacuum() allows it. In
> > vacuum_delay_point(), we reload the config file iff
> > (ConfigReloadPending && vacuum_can_reload_config).
>
> Wow, great point. Thanks for catching this. I've made the update you
> suggested. I also set vacuum_can_reload_config to false in PG_FINALLY()
> in vacuum().

Here are some review comments for 0002-0004 patches:

0002:
-        if (MyWorkerInfo)
+        if (am_autovacuum_launcher)
+                return;
+
+        if (am_autovacuum_worker)
         {
-                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
                 VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+        }

Isn't it a bit safer to check MyWorkerInfo instead of
am_autovacuum_worker? Also, I don't think there is any reason why we
want to exclude only the autovacuum launcher.

---
+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?

How about using the default value of normal backends, 200 and 0?

0003:

@@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
  */
 int                    VacuumCostLimit = 0;
 double         VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;

In vacuum.c, we use snake case for GUC parameters and camel case for
other global variables, so it seems better to rename it
VacuumCanReloadConfig. Sorry, that's my fault.

0004:

+                if (tab->at_dobalance)
+                        pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+                else

The comment of pg_atomic_test_set_flag() says that it returns false if
the flag has not successfully been set:

 * pg_atomic_test_set_flag - TAS()
 *
 * Returns true if the flag has successfully been set, false otherwise.
 *
 * Acquire (including read barrier) semantics.

But IIUC we don't need to worry about that as only one process updates
the flag, right? It might be a good idea to add some comments on why we
don't need to check the return value.
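
Something along these lines might do as the suggested comment (a sketch; the
if/else shape is as quoted from the patch above):

        /*
         * Only this worker ever sets or clears its own wi_dobalance flag, so
         * pg_atomic_test_set_flag() cannot lose a race against another
         * setter; a false return can only mean the flag was already set by
         * ourselves for a previous table, and either way the flag ends up
         * set, so the return value can be ignored.
         */
        if (tab->at_dobalance)
            (void) pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
        else
            pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);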

---
-                if (worker->wi_proc != NULL)
-                        elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-                                 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-                                 worker->wi_dobalance ? "yes" : "no",
-                                 worker->wi_cost_limit, worker->wi_cost_limit_base,
-                                 worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 4 Apr 2023, at 00:35, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Mon, Apr 3, 2023 at 3:08 PM Andres Freund <andres@anarazel.de> wrote:
>> On 2023-04-03 14:43:14 -0400, Tom Lane wrote:
>>> Melanie Plageman <melanieplageman@gmail.com> writes:
>>>> v13 attached with requested updates.
>>>
>>> I'm afraid I'd not been paying any attention to this discussion,
>>> but better late than never.  I'm okay with letting autovacuum
>>> processes reload config files more often than now.  However,
>>> I object to allowing ProcessConfigFile to be called from within
>>> commands in a normal user backend.  The existing semantics are
>>> that user backends respond to SIGHUP only at the start of processing
>>> a user command, and I'm uncomfortable with suddenly deciding that
>>> that can work differently if the command happens to be VACUUM.
>>> It seems unprincipled and perhaps actively unsafe.
>>
>> I think it should be ok in commands like VACUUM that already internally start
>> their own transactions, and thus require to be run outside of a transaction
>> and at the toplevel. I share your concerns about allowing config reload in
>> arbitrary places. While we might want to go there, it would require a lot more
>> analysis.

Thinking more on this I'm leaning towards going with allowing more frequent
reloads in autovacuum, and saving the same for VACUUM for more careful study.
The general case is probably fine but I'm not convinced that there aren't error
cases which can present unpleasant scenarios.

Regarding the autovacuum part of this patch I think we are down to the final
details and I think it's doable to finish this in time for 16.

> As an alternative for your consideration, attached v14 set implements
> the config file reload for autovacuum only (in 0003) and then enables it
> for VACUUM and ANALYZE not in a nested transaction command (in 0004).
>
> Previously I had the commits in the reverse order for ease of review (to
> separate changes to worker limit balancing logic from config reload
> code).

A few comments on top of already submitted reviews, will do another pass over
this later today.

+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.

This comment should be expanded to document who we expect to inspect this
variable in order to decide on cost-based vacuum.

Moving the failsafe switch into a global context means we face the risk of an
extension changing it independently of the GUCs that control it (or the code
relying on it) such that these are out of sync.  External code messing up
internal state is not new and of course outside of our control, but it's worth
at least considering.  There isn't too much we can do here, but perhaps expand
this comment to include a "do not change this" note?

+extern bool VacuumFailsafeActive;

While I agree with upthread review comments that extensions shouldn't poke at
this, not decorating it with PGDLLEXPORT adds little protection and only causes
inconsistencies in symbol exports across platforms.  We only explicitly hide
symbols in shared libraries IIRC.

+extern int VacuumCostLimit;
+extern double VacuumCostDelay;
 ...
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Same with these, I don't think this is according to our default visibility.
Moreover, I'm not sure it's a good idea to perform this rename.  This will keep
VacuumCostLimit and VacuumCostDelay exported, but change their meaning.  Any
external code referring to these thinking they are backing the GUCs will still
compile, but may be broken in subtle ways.  Is there a reason for not keeping
the current GUC variables and instead add net new ones?


+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int                    VacuumCostLimit = 0;
+double         VacuumCostDelay = -1;

I think the important part is to make sure they are never accessed without
VacuumUpdateCosts having been called first.  I think that's the case here, but
it's not entirely clear.  Do you see a codepath where that could happen?  If
they are initialized to a sentinel value we also need to check for that, so
initializing to the defaults from the corresponding GUCs seems better.

+* Update VacuumCostLimit with the correct value for an autovacuum worker, given

Trivial whitespace error in function comment.


+static double av_relopt_cost_delay = -1;
+static int av_relopt_cost_limit = 0;

These need a comment IMO, ideally one that explain why they are initialized to
those values.


+       /* There is at least 1 autovac worker (this worker). */
+       Assert(nworkers_for_balance > 0);

Is there a scenario where this is expected to fail?  If so I think this should
be handled and not just an Assert.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> The 0001 patch mostly looks good to me except for one
> point:
>
> @@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
>          Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
>          Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
>                     params->truncate != VACOPTVALUE_AUTO);
> -        vacrel->failsafe_active = false;
> +        VacuumFailsafeActive = false;
>          vacrel->consider_bypass_optimization = true;
>          vacrel->do_index_vacuuming = true;
>
> Looking at the 0003 patch, we set VacuumFailsafeActive to false per table:
>
> +                        /*
> +                         * Ensure VacuumFailsafeActive has been reset
> before vacuuming the
> +                         * next relation relation.
> +                         */
> +                        VacuumFailsafeActive = false;
>
> Given that we ensure it's reset before vacuuming the next table, do we
> need to reset it in heap_vacuum_rel?

I've changed the one in heap_vacuum_rel() to an assert.

> (there is a typo; s/relation relation/relation/)

Thanks! fixed.

> > > 0002:
> > >
> > > @@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
> > >          return offsetof(VacDeadItems, items) +
> > > sizeof(ItemPointerData) * max_items;
> > >  }
> > >
> > > +
> > >  /*
> > >   *     vac_tid_reaped() -- is a particular tid deletable?
> > >   *
> > >
> > > Unnecessary new line. There are some other unnecessary new lines in this patch.
> >
> > Thanks! I think I got them all.
> >
> > > ---
> > > @@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
> > >  extern PGDLLIMPORT int VacuumCostBalanceLocal;
> > >
> > >  extern bool VacuumFailsafeActive;
> > > +extern int     VacuumCostLimit;
> > > +extern double VacuumCostDelay;
> > >
> > > and
> > >
> > > @@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
> > >  extern PGDLLIMPORT int VacuumCostPageHit;
> > >  extern PGDLLIMPORT int VacuumCostPageMiss;
> > >  extern PGDLLIMPORT int VacuumCostPageDirty;
> > > -extern PGDLLIMPORT int VacuumCostLimit;
> > > -extern PGDLLIMPORT double VacuumCostDelay;
> > >
> > > Do we need PGDLLIMPORT too?
> >
> > I was on the fence about this. I annotated the new guc variables
> > vacuum_cost_delay and vacuum_cost_limit with PGDLLIMPORT, but I did not
> > annotate the variables used in vacuum code (VacuumCostLimit/Delay). I
> > think whether or not this is the right choice depends on two things:
> > whether or not my understanding of PGDLLIMPORT is correct and, if it is,
> > whether or not we want extensions to be able to access
> > VacuumCostLimit/Delay or if just access to the guc variables is
> > sufficient/desirable.
>
> I guess it would be better to keep both accessible for backward
> compatibility. Extensions are able to access both GUC values and
> values that are actually used for vacuum delays (as we used to use the
> same variables).

> Here are some review comments for 0002-0004 patches:
>
> 0002:
> -        if (MyWorkerInfo)
> +        if (am_autovacuum_launcher)
> +                return;
> +
> +        if (am_autovacuum_worker)
>          {
> -                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
>                  VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
> +                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
> +        }
>
> Isn't it a bit safer to check MyWorkerInfo instead of
> am_autovacuum_worker?

Ah, since we access it. I've made the change.

> Also, I don't think there is any reason why we want to exclude only
> the autovacuum launcher.

My rationale is that the launcher is the only other process type which
might reasonably be executing this code besides autovac workers, client
backends doing VACUUM/ANALYZE, and parallel vacuum workers. Is it
confusing to have the launcher have VacuumCostLimit and VacuumCostDelay
set to the guc values for explicit VACUUM and ANALYZE -- even if the
launcher doesn't use these variables?

I've removed the check, because I do agree with you that it may be
unnecessarily confusing in the code.

> ---
> + * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
> + * invalid values?
>
> How about using the default value of normal backends, 200 and 0?

I've gone with this suggestion

> 0003:
>
> @@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
>   */
>  int                    VacuumCostLimit = 0;
>  double         VacuumCostDelay = -1;
> +static bool vacuum_can_reload_config = false;
>
> In vacuum.c, we use snake case for GUC parameters and camel case for
> other global variables, so it seems better to rename it
> VacuumCanReloadConfig. Sorry, that's my fault.

I have renamed it.

> 0004:
>
> +                if (tab->at_dobalance)
> +                        pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
> +                else
>
> The comment of pg_atomic_test_set_flag() says that it returns false if
> the flag has not successfully been set:
>
>  * pg_atomic_test_set_flag - TAS()
>  *
>  * Returns true if the flag has successfully been set, false otherwise.
>  *
>  * Acquire (including read barrier) semantics.
>
> But IIUC we don't need to worry about that as only one process updates
> the flag, right? It might be a good idea to add some comments why we
> don't need to check the return value.

I have added this comment.

> ---
> -                if (worker->wi_proc != NULL)
> -                        elog(DEBUG2, "autovac_balance_cost(pid=%d
> db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d,
> cost_delay=%g)",
> -                                 worker->wi_proc->pid,
> worker->wi_dboid, worker->wi_tableoid,
> -                                 worker->wi_dobalance ? "yes" : "no",
> -                                 worker->wi_cost_limit,
> worker->wi_cost_limit_base,
> -                                 worker->wi_cost_delay);
>
> I think it's better to keep this kind of log in some form for
> debugging. For example, we can show these values of autovacuum workers
> in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed, not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?


On Tue, Apr 4, 2023 at 9:36 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>

Thanks for the review!

> > On 4 Apr 2023, at 00:35, Melanie Plageman <melanieplageman@gmail.com> wrote:
> >
> > On Mon, Apr 3, 2023 at 3:08 PM Andres Freund <andres@anarazel.de> wrote:
> >> On 2023-04-03 14:43:14 -0400, Tom Lane wrote:
> >>> Melanie Plageman <melanieplageman@gmail.com> writes:
> >>>> v13 attached with requested updates.
> >>>
> >>> I'm afraid I'd not been paying any attention to this discussion,
> >>> but better late than never.  I'm okay with letting autovacuum
> >>> processes reload config files more often than now.  However,
> >>> I object to allowing ProcessConfigFile to be called from within
> >>> commands in a normal user backend.  The existing semantics are
> >>> that user backends respond to SIGHUP only at the start of processing
> >>> a user command, and I'm uncomfortable with suddenly deciding that
> >>> that can work differently if the command happens to be VACUUM.
> >>> It seems unprincipled and perhaps actively unsafe.
> >>
> >> I think it should be ok in commands like VACUUM that already internally start
> >> their own transactions, and thus require to be run outside of a transaction
> >> and at the toplevel. I share your concerns about allowing config reload in
> >> arbitrary places. While we might want to go there, it would require a lot more
> >> analysis.
>
> Thinking more on this I'm leaning towards going with allowing more frequent
> reloads in autovacuum, and saving the same for VACUUM for more careful study.
> The general case is probably fine but I'm not convinced that there aren't error
> cases which can present unpleasant scenarios.

In attached v15, I've dropped support for VACUUM and non-nested ANALYZE.
It is like a 5 line change and could be added back at any time.

> > As an alternative for your consideration, attached v14 set implements
> > the config file reload for autovacuum only (in 0003) and then enables it
> > for VACUUM and ANALYZE not in a nested transaction command (in 0004).
> >
> > Previously I had the commits in the reverse order for ease of review (to
> > separate changes to worker limit balancing logic from config reload
> > code).
>
> A few comments on top of already submitted reviews, will do another pass over
> this later today.
>
> + * VacuumFailsafeActive is a defined as a global so that we can determine
> + * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
>
> This comment should be expanded to document who we expect to inspect this
> variable in order to decide on cost-based vacuum.
>
> Moving the failsafe switch into a global context means we face the risk of an
> extension changing it independently of the GUCs that control it (or the code
> relying on it) such that these are out of sync.  External code messing up
> internal state is not new and of course outside of our control, but it's worth
> at least considering.  There isn't too much we can do here, but perhaps expand
> this comment to include a "do not change this" note?

I've updated the comment to mention how table AM-agnostic VACUUM code
uses it and to say that table AMs can set it if they want that behavior.

> +extern bool VacuumFailsafeActive;
>
> While I agree with upthread review comments that extensions shouldn't poke at
> this, not decorating it with PGDLLEXPORT adds little protection and only causes
> inconsistencies in symbol exports across platforms.  We only explicitly hide
> symbols in shared libraries IIRC.

I've updated this.

> +extern int VacuumCostLimit;
> +extern double VacuumCostDelay;
>  ...
> -extern PGDLLIMPORT int VacuumCostLimit;
> -extern PGDLLIMPORT double VacuumCostDelay;
>
> Same with these, I don't think this is according to our default visibility.
> Moreover, I'm not sure it's a good idea to perform this rename.  This will keep
> VacuumCostLimit and VacuumCostDelay exported, but change their meaning.  Any
> external code referring to these thinking they are backing the GUCs will still
> compile, but may be broken in subtle ways.  Is there a reason for not keeping
> the current GUC variables and instead add net new ones?

When VacuumCostLimit was the same variable in the code and for the GUC
vacuum_cost_limit, every time we reload the config file, VacuumCostLimit
is overwritten. Autovacuum workers have to overwrite this value with the
appropriate one for themselves given the balancing logic and the value
of autovacuum_vacuum_cost_limit. However, the problem is, because you
can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
fall back to vacuum_cost_limit, we have to reference the value of
VacuumCostLimit when calculating the new autovacuum worker's cost limit
after a config reload.

But, you have to be sure you *only* do this after a config reload when
the value of VacuumCostLimit is fresh and unmodified or you risk
dividing the value of VacuumCostLimit over and over. That means it is
unsafe to call functions updating the cost limit more than once.

This orchestration wasn't as difficult when we only reloaded the config
file once every table. We were careful about it and also kept the
original "base" cost limit around from table_recheck_autovac(). However,
once we started reloading the config file more often, this no longer
works.

By separating the variables modified when the gucs are set and the ones
used in the code, we can make sure we always have the original value the
guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
whenever we need to reference it.
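
As a sketch of the hazard (illustrative only, not the patch's code;
autovacuum_vac_cost_limit and vacuum_cost_limit are the GUC-backing
variables, and nworkers_for_balance stands in for however the number of
participating workers is obtained):

    /* Unsafe: deriving the balanced limit from the variable the code uses. */
    VacuumCostLimit = VacuumCostLimit / nworkers_for_balance;

    /* Safe: always recompute from the unmodified GUC-backing values. */
    int     limit = autovacuum_vac_cost_limit > 0 ?
        autovacuum_vac_cost_limit : vacuum_cost_limit;
    VacuumCostLimit = Max(limit / nworkers_for_balance, 1);

The first form keeps dividing an already-divided value if it is run again
without a fresh config reload; the second can be rerun any number of times.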

That being said, perhaps we should document what extensions should do?
Do you think they will want to use the variables backing the gucs or to
be able to overwrite the variables being used in the code?

Oh, also I've annotated these with PGDLLIMPORT too.

> + * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
> + * invalid values?
> + */
> +int                    VacuumCostLimit = 0;
> +double         VacuumCostDelay = -1;
>
> I think the important part is to make sure they are never accessed without
> VacuumUpdateCosts having been called first.  I think that's the case here, but
> it's not entirely clear.  Do you see a codepath where that could happen?  If
> they are initialized to a sentinel value we also need to check for that, so
> initializing to the defaults from the corresponding GUCs seems better.

I don't see a case where autovacuum could access these without calling
VacuumUpdateCosts() first. I think the other callers of
vacuum_delay_point() are the issue (gist/gin/hash/etc).

It might need a bit more thought.

My concern was that these variables correspond to multiple GUCs each
depending on the backend type, and those backends have different
defaults (e.g. the default cost delay for autovac workers is different than
that for a client backend doing VACUUM).

However, what I have done in this version is initialize them to the
defaults for a client backend executing VACUUM or ANALYZE, since I am
fairly confident that autovacuum will not use them without calling
VacuumUpdateCosts().
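
Concretely, that amounts to something like the following sketch, with the
values mirroring the vacuum_cost_limit and vacuum_cost_delay defaults for a
client backend:

    /*
     * Defaults for a client backend running VACUUM/ANALYZE; autovacuum
     * workers are expected to call VacuumUpdateCosts() before using these.
     */
    int         VacuumCostLimit = 200;
    double      VacuumCostDelay = 0;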

>
> +* Update VacuumCostLimit with the correct value for an autovacuum worker, given
>
> Trivial whitespace error in function comment.

Fixed.

> +static double av_relopt_cost_delay = -1;
> +static int av_relopt_cost_limit = 0;
>
> These need a comment IMO, ideally one that explain why they are initialized to
> those values.

I've added a comment.

> +       /* There is at least 1 autovac worker (this worker). */
> +       Assert(nworkers_for_balance > 0);
>
> Is there a scenario where this is expected to fail?  If so I think this should
> be handled and not just an Assert.

No, this isn't expected to happen because an autovacuum worker would
have called autovac_recalculate_workers_for_balance() before calling
VacuumUpdateCosts() (which calls AutoVacuumUpdateLimit()) in
do_autovacuum(). But, if someone were to move around or add a call to
VacuumUpdateCosts() there is a chance it could happen.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > ---
> > -                if (worker->wi_proc != NULL)
> > -                        elog(DEBUG2, "autovac_balance_cost(pid=%d
> > db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d,
> > cost_delay=%g)",
> > -                                 worker->wi_proc->pid,
> > worker->wi_dboid, worker->wi_tableoid,
> > -                                 worker->wi_dobalance ? "yes" : "no",
> > -                                 worker->wi_cost_limit,
> > worker->wi_cost_limit_base,
> > -                                 worker->wi_cost_delay);
> >
> > I think it's better to keep this kind of log in some form for
> > debugging. For example, we can show these values of autovacuum workers
> > in VacuumUpdateCosts().
>
> I added a message to do_autovacuum() after calling VacuumUpdateCosts()
> in the loop vacuuming each table. That means it will happen once per
> table. It's not ideal that I had to move the call to VacuumUpdateCosts()
> behind the shared lock in that loop so that we could access the pid and
> such in the logging message after updating the cost and delay, but it is
> probably okay. Though noone is going to be changing those at this
> point, it still seemed better to access them under the lock.
>
> This does mean we won't log anything when we do change the values of
> VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
> adding some code to do that in VacuumUpdateCosts() (only when the value
> has changed not on every call to VacuumUpdateCosts())? Or perhaps we
> could add it in the config reload branch that is already in
> vacuum_delay_point()?

Previously, we used to show the pid in the log since a worker/launcher
set other workers' delay costs. But now that the worker sets its delay
costs, we don't need to show the pid in the log. Also, I think it's
useful for debugging and investigating the system if we log it when
changing the values. The log I imagined to add was like:

@@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
            VacuumCostDelay = vacuum_cost_delay;

        AutoVacuumUpdateLimit();
+
+       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+            VacuumCostLimit, VacuumCostDelay,
+            VacuumCostDelay > 0 ? "yes" : "no",
+            VacuumFailsafeActive ? "yes" : "no");
    }
    else
    {

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Kyotaro Horiguchi
Date:
Hi.

About 0001:

+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings. Only VACUUM code should inspect this variable and only table
+ * access methods should set it. In Table AM-agnostic VACUUM code, this
+ * variable controls whether or not to allow cost-based delays. Table AMs are
+ * free to use it if they desire this behavior.
+ */
+bool        VacuumFailsafeActive = false;

If I understand this correctly, there seems to be an issue. The
AM-agnostic VACUUM code is setting it and no table AMs actually do
that.


0003:
+
+            /*
+             * Ensure VacuumFailsafeActive has been reset before vacuuming the
+             * next relation.
+             */
+            VacuumFailsafeActive = false;
         }
     }
     PG_FINALLY();
     {
         in_vacuum = false;
         VacuumCostActive = false;
+        VacuumFailsafeActive = false;
+        VacuumCostBalance = 0;

There is no need to reset VacuumFailsafeActive in the PG_TRY() block.


+    /*
+     * Reload the configuration file if requested. This allows changes to
+     * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+     * effect while a table is being vacuumed or analyzed.
+     */
+    if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+    {
+        ConfigReloadPending = false;
+        ProcessConfigFile(PGC_SIGHUP);
+        VacuumUpdateCosts();
+    }

I believe we should prevent unnecessary reloading when
VacuumFailsafeActive is true.


+        AutoVacuumUpdateLimit();

I'm not entirely sure, but it might be better to name this
AutoVacuumUpdateCostLimit().


+    pg_atomic_flag wi_dobalance;
...
+        /*
+         * We only expect this worker to ever set the flag, so don't bother
+         * checking the return value. We shouldn't have to retry.
+         */
+        if (tab->at_dobalance)
+            pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+        else
+            pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);

        LWLockAcquire(AutovacuumLock, LW_SHARED);

        autovac_recalculate_workers_for_balance();

I don't see the need for using atomic here. The code is executed
infrequently and we already take a lock while counting do_balance
workers. So sticking with the old locking method (taking LW_EXCLUSIVE
then set wi_dobalance then do balance) should be fine.


+void
+AutoVacuumUpdateLimit(void)
...
+    if (av_relopt_cost_limit > 0)
+        VacuumCostLimit = av_relopt_cost_limit;
+    else

I think we should use wi_dobalance to decide if we need to do balance
or not. We don't need to take a lock to do that since only the process
updates it.


/*
          * Remove my info from shared memory.  We could, but intentionally
-         * don't, clear wi_cost_limit and friends --- this is on the
-         * assumption that we probably have more to do with similar cost
-         * settings, so we don't want to give up our share of I/O for a very
-         * short interval and thereby thrash the global balance.
+         * don't, unset wi_dobalance on the assumption that we are more likely
+         * than not to vacuum a table with no table options next, so we don't
+         * want to give up our share of I/O for a very short interval and
+         * thereby thrash the global balance.
          */
         LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
         MyWorkerInfo->wi_tableoid = InvalidOid;

The comment mentions wi_dobalance, but it doesn't appear here..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 4 Apr 2023, at 22:04, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>> Also, I don't think there is any reason why we want to exclude only
>> the autovacuum launcher.
>
> My rationale is that the launcher is the only other process type which
> might reasonably be executing this code besides autovac workers, client
> backends doing VACUUM/ANALYZE, and parallel vacuum workers. Is it
> confusing to have the launcher have VacuumCostLimit and VacuumCostDelay
> set to the guc values for explicit VACUUM and ANALYZE -- even if the
> launcher doesn't use these variables?
>
> I've removed the check, because I do agree with you that it may be
> unnecessarily confusing in the code.

+1

> On Tue, Apr 4, 2023 at 9:36 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>>> On 4 Apr 2023, at 00:35, Melanie Plageman <melanieplageman@gmail.com> wrote:

>> Thinking more on this I'm leaning towards going with allowing more frequent
>> reloads in autovacuum, and saving the same for VACUUM for more careful study.
>> The general case is probably fine but I'm not convinced that there aren't error
>> cases which can present unpleasant scenarios.
>
> In attached v15, I've dropped support for VACUUM and non-nested ANALYZE.
> It is like a 5 line change and could be added back at any time.

I think thats the best option for now.

>> +extern int VacuumCostLimit;
>> +extern double VacuumCostDelay;
>> ...
>> -extern PGDLLIMPORT int VacuumCostLimit;
>> -extern PGDLLIMPORT double VacuumCostDelay;
>>
>> Same with these, I don't think this is according to our default visibility.
>> Moreover, I'm not sure it's a good idea to perform this rename.  This will keep
>> VacuumCostLimit and VacuumCostDelay exported, but change their meaning.  Any
>> external code referring to these thinking they are backing the GUCs will still
>> compile, but may be broken in subtle ways.  Is there a reason for not keeping
>> the current GUC variables and instead add net new ones?
>
> When VacuumCostLimit was the same variable in the code and for the GUC
> vacuum_cost_limit, everytime we reload the config file, VacuumCostLimit
> is overwritten. Autovacuum workers have to overwrite this value with the
> appropriate one for themselves given the balancing logic and the value
> of autovacuum_vacuum_cost_limit. However, the problem is, because you
> can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
> fall back to vacuum_cost_limit, we have to reference the value of
> VacuumCostLimit when calculating the new autovacuum worker's cost limit
> after a config reload.
>
> But, you have to be sure you *only* do this after a config reload when
> the value of VacuumCostLimit is fresh and unmodified or you risk
> dividing the value of VacuumCostLimit over and over. That means it is
> unsafe to call functions updating the cost limit more than once.
>
> This orchestration wasn't as difficult when we only reloaded the config
> file once every table. We were careful about it and also kept the
> original "base" cost limit around from table_recheck_autovac(). However,
> once we started reloading the config file more often, this no longer
> works.
>
> By separating the variables modified when the gucs are set and the ones
> used the code, we can make sure we always have the original value the
> guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
> whenever we need to reference it.
>
> That being said, perhaps we should document what extensions should do?
> Do you think they will want to use the variables backing the gucs or to
> be able to overwrite the variables being used in the code?

I think I wasn't clear in my comment, sorry.  I don't have a problem with
introducing a new variable to split the balanced value from the GUC value.
What I don't think we should do is repurpose an exported symbol into doing a
new thing.  In the case at hand I think VacuumCostLimit and VacuumCostDelay
should remain the backing variables for the GUCs, with vacuum_cost_limit and
vacuum_cost_delay carrying the balanced values.  So the inverse of what is in
the patch now.

The risk of these symbols being used in extensions might be very low but on
principle it seems unwise to alter a symbol and risk subtle breakage.

> Oh, also I've annotated these with PGDLLIMPORT too.
>
>> + * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
>> + * invalid values?
>> + */
>> +int                    VacuumCostLimit = 0;
>> +double         VacuumCostDelay = -1;
>>
>> I think the important part is to make sure they are never accessed without
>> VacuumUpdateCosts having been called first.  I think that's the case here, but
>> it's not entirely clear.  Do you see a codepath where that could happen?  If
>> they are initialized to a sentinel value we also need to check for that, so
>> initializing to the defaults from the corresponding GUCs seems better.
>
> I don't see a case where autovacuum could access these without calling
> VacuumUpdateCosts() first. I think the other callers of
> vacuum_delay_point() are the issue (gist/gin/hash/etc).
>
> It might need a bit more thought.
>
> My concern was that these variables correspond to multiple GUCs each
> depending on the backend type, and those backends have different
> defaults (e.g. autovac workers default cost delay is different than
> client backend doing vacuum cost delay).
>
> However, what I have done in this version is initialize them to the
> defaults for a client backend executing VACUUM or ANALYZE, since I am
> fairly confident that autovacuum will not use them without calling
> VacuumUpdateCosts().

Another question along these lines: we only call AutoVacuumUpdateLimit() in
case there is a sleep in vacuum_delay_point():

+       /*
+        * Balance and update limit values for autovacuum workers. We must
+        * always do this in case the autovacuum launcher or another
+        * autovacuum worker has recalculated the number of workers across
+        * which we must balance the limit. This is done by the launcher when
+        * launching a new worker and by workers before vacuuming each table.
+        */
+       AutoVacuumUpdateLimit();

Shouldn't we always call that in case we had a config reload, or am I being
thick?

>> +static double av_relopt_cost_delay = -1;
>> +static int av_relopt_cost_limit = 0;

Sorry, I didn't catch this earlier, shouldn't this be -1 to match the default
value of autovacuum_vacuum_cost_limit?

>> These need a comment IMO, ideally one that explain why they are initialized to
>> those values.
>
> I've added a comment.

+ * Variables to save the cost-related table options for the current relation

The "table options" nomenclature is right now only used for FDW foreign table
options, I think we should use "storage parameters" or "relation options" here.

>> +       /* There is at least 1 autovac worker (this worker). */
>> +       Assert(nworkers_for_balance > 0);
>>
>> Is there a scenario where this is expected to fail?  If so I think this should
>> be handled and not just an Assert.
>
> No, this isn't expected to happen because an autovacuum worker would
> have called autovac_recalculate_workers_for_balance() before calling
> VacuumUpdateCosts() (which calls AutoVacuumUpdateLimit()) in
> do_autovacuum(). But, if someone were to move around or add a call to
> VacuumUpdateCosts() there is a chance it could happen.

Thinking more on this I'm tempted to recommend that we promote this to an
elog(), mainly due to the latter.  An accidental call to VacuumUpdateCosts()
doesn't seem entirely unlikely to happen.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
Thanks all for the reviews.

v16 attached. I put it together rather quickly, so there might be a few
spurious whitespaces or similar. There is one rather annoying pgindent
outlier that I have to figure out what to do about as well.

The remaining functional TODOs that I know of are:

- Resolve what to do about names of GUC and vacuum variables for cost
  limit and cost delay (since it may affect extensions)

- Figure out what to do about the logging message which accesses dboid
  and tableoid (lock/no lock, where to put it, etc)

- I see several places in docs which reference the balancing algorithm
  for autovac workers. I did not read them in great detail, but we may
  want to review them to see if any require updates.

- Consider whether or not the initial two commits should just be
  squashed with the third commit

- Anything else reviewers are still unhappy with


On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > ---
> > > -                if (worker->wi_proc != NULL)
> > > -                        elog(DEBUG2, "autovac_balance_cost(pid=%d
> > > db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d,
> > > cost_delay=%g)",
> > > -                                 worker->wi_proc->pid,
> > > worker->wi_dboid, worker->wi_tableoid,
> > > -                                 worker->wi_dobalance ? "yes" : "no",
> > > -                                 worker->wi_cost_limit,
> > > worker->wi_cost_limit_base,
> > > -                                 worker->wi_cost_delay);
> > >
> > > I think it's better to keep this kind of log in some form for
> > > debugging. For example, we can show these values of autovacuum workers
> > > in VacuumUpdateCosts().
> >
> > I added a message to do_autovacuum() after calling VacuumUpdateCosts()
> > in the loop vacuuming each table. That means it will happen once per
> > table. It's not ideal that I had to move the call to VacuumUpdateCosts()
> > behind the shared lock in that loop so that we could access the pid and
> > such in the logging message after updating the cost and delay, but it is
> > probably okay. Though noone is going to be changing those at this
> > point, it still seemed better to access them under the lock.
> >
> > This does mean we won't log anything when we do change the values of
> > VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
> > adding some code to do that in VacuumUpdateCosts() (only when the value
> > has changed not on every call to VacuumUpdateCosts())? Or perhaps we
> > could add it in the config reload branch that is already in
> > vacuum_delay_point()?
>
> Previously, we used to show the pid in the log since a worker/launcher
> set other workers' delay costs. But now that the worker sets its delay
> costs, we don't need to show the pid in the log. Also, I think it's
> useful for debugging and investigating the system if we log it when
> changing the values. The log I imagined to add was like:
>
> @@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
>             VacuumCostDelay = vacuum_cost_delay;
>
>         AutoVacuumUpdateLimit();
> +
> +       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u,
> dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
> +            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
> +            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance)
> ? "no" : "yes",
> +            VacuumCostLimit, VacuumCostDelay,
> +            VacuumCostDelay > 0 ? "yes" : "no",
> +            VacuumFailsafeActive ? "yes" : "no");
>     }
>     else
>     {

Makes sense. I've updated the log message to roughly what you suggested.
I also realized it does make sense to call it in
VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
this. I haven't taken the lock though and can't decide if I must, since
the message accesses dboid and tableoid -- those are not going to change at this
point, but I still don't know if I can access them lock-free...
Perhaps there is a way to condition it on the log level?

If I have to take a lock, then I don't know if we should put these in
VacuumUpdateCosts()...

On Wed, Apr 5, 2023 at 3:16 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> About 0001:
>
> + * VacuumFailsafeActive is a defined as a global so that we can determine
> + * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
> + * If failsafe mode has been engaged, we will not re-enable cost-based delay
> + * for the table until after vacuuming has completed, regardless of other
> + * settings. Only VACUUM code should inspect this variable and only table
> + * access methods should set it. In Table AM-agnostic VACUUM code, this
> + * variable controls whether or not to allow cost-based delays. Table AMs are
> + * free to use it if they desire this behavior.
> + */
> +bool           VacuumFailsafeActive = false;
>
> If I understand this correctly, there seems to be an issue. The
> AM-agnostic VACUUM code is setting it and no table AMs actually do
> that.

No, it is not set in table AM-agnostic VACUUM code. I meant it is
used/read from/inspected in table AM-agnostic VACUUM code. Table AMs can
set it if they want to avoid cost-based delays being re-enabled. It is
only set to true in heap-specific code and is initialized to false and
reset back to false in table AM-agnostic code between each relation
being vacuumed. I updated the comment to reflect this. Let me know if
you think it is clear.

> 0003:
> +
> +                       /*
> +                        * Ensure VacuumFailsafeActive has been reset before vacuuming the
> +                        * next relation.
> +                        */
> +                       VacuumFailsafeActive = false;
>                 }
>         }
>         PG_FINALLY();
>         {
>                 in_vacuum = false;
>                 VacuumCostActive = false;
> +               VacuumFailsafeActive = false;
> +               VacuumCostBalance = 0;
>
> There is no need to reset VacuumFailsafeActive in the PG_TRY() block.

I think that is true -- since it is initialized to false and reset to
false after vacuuming every relation. However, I am leaning toward
keeping it because I haven't thought through every codepath and
determined if there is ever a way where it could be true here.

> +       /*
> +        * Reload the configuration file if requested. This allows changes to
> +        * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
> +        * effect while a table is being vacuumed or analyzed.
> +        */
> +       if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
> +       {
> +               ConfigReloadPending = false;
> +               ProcessConfigFile(PGC_SIGHUP);
> +               VacuumUpdateCosts();
> +       }
>
> I believe we should prevent unnecessary reloading when
> VacuumFailsafeActive is true.

This is in conflict with feedback from two of the other reviewers:
Sawada-san:

> +         * Reload the configuration file if requested. This allows changes to
> +         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
> +         * take effect while a table is being vacuumed or analyzed.
> +         */
> +        if (ConfigReloadPending && !analyze_in_outer_xact)
> +        {
> +                ConfigReloadPending = false;
> +                ProcessConfigFile(PGC_SIGHUP);
> +                AutoVacuumUpdateDelay();
> +                AutoVacuumUpdateLimit();
> +        }
>
> It makes sense to me that we need to reload the config file even when
> vacuum-delay is disabled. But I think it's not convenient for users
> that we don't reload the configuration file once the failsafe is
> triggered. I think users might want to change some GUCs such as
> log_autovacuum_min_duration.

and Daniel in response to this:

> > It makes sense to me that we need to reload the config file even when
> > vacuum-delay is disabled. But I think it's not convenient for users
> > that we don't reload the configuration file once the failsafe is
> > triggered. I think users might want to change some GUCs such as
> > log_autovacuum_min_duration.
>
> I agree with this.

> +               AutoVacuumUpdateLimit();
>
> I'm not entirely sure, but it might be better to name this
> AutoVacuumUpdateCostLimit().

I have made this change.

> +       pg_atomic_flag wi_dobalance;
> ...
> +               /*
> +                * We only expect this worker to ever set the flag, so don't bother
> +                * checking the return value. We shouldn't have to retry.
> +                */
> +               if (tab->at_dobalance)
> +                       pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
> +               else
> +                       pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
>
>                 LWLockAcquire(AutovacuumLock, LW_SHARED);
>
>                 autovac_recalculate_workers_for_balance();
>
> I don't see the need for using atomic here. The code is executed
> infrequently and we already take a lock while counting do_balance
> workers. So sticking with the old locking method (taking LW_EXCLUSIVE
> then set wi_dobalance then do balance) should be fine.

We access wi_dobalance on every call to AutoVacuumUpdateLimit() which is
executed in vacuum_delay_point(). I do not think we can justify taking a
shared lock in a function that is called so frequently.

> +void
> +AutoVacuumUpdateLimit(void)
> ...
> +       if (av_relopt_cost_limit > 0)
> +               VacuumCostLimit = av_relopt_cost_limit;
> +       else
>
> I think we should use wi_dobalance to decide if we need to do balance
> or not. We don't need to take a lock to do that since only the process
> updates it.

We do do that below in the "else" branch before balancing. But we certainly
don't need to balance if the cost limit relopt is set, and checking it
first saves an access to the atomic variable. I think the atomic is a
relatively cheap way of avoiding this whole locking question.
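
To spell out the ordering I have in mind, the cost limit update looks
roughly like this (a sketch pieced together from the fragments quoted in
this thread, not the patch itself; autovacuum_vac_cost_limit and
vacuum_cost_limit are the GUC-backing variables, and nworkers_for_balance
stands in for the shared worker count):

    void
    AutoVacuumUpdateCostLimit(void)
    {
        if (av_relopt_cost_limit > 0)
            VacuumCostLimit = av_relopt_cost_limit;  /* relopt wins, no balancing */
        else
        {
            int     limit = autovacuum_vac_cost_limit > 0 ?
                autovacuum_vac_cost_limit : vacuum_cost_limit;

            /* Lock-free read: only this worker ever updates its own flag. */
            if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
                VacuumCostLimit = limit;             /* flag clear: no balancing */
            else
                VacuumCostLimit = Max(limit / nworkers_for_balance, 1);
        }
    }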

> /*
>                  * Remove my info from shared memory.  We could, but intentionally
> -                * don't, clear wi_cost_limit and friends --- this is on the
> -                * assumption that we probably have more to do with similar cost
> -                * settings, so we don't want to give up our share of I/O for a very
> -                * short interval and thereby thrash the global balance.
> +                * don't, unset wi_dobalance on the assumption that we are more likely
> +                * than not to vacuum a table with no table options next, so we don't
> +                * want to give up our share of I/O for a very short interval and
> +                * thereby thrash the global balance.
>                  */
>                 LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
>                 MyWorkerInfo->wi_tableoid = InvalidOid;
>
> The comment mentions wi_dobalance, but it doesn't appear here..

The point of the comment is that we don't do anything with wi_dobalance
here. It is explaining why it doesn't appear. The previous comment
mentioned not doing anything with wi_cost_delay and wi_cost_limit which
also didn't appear here.

On Wed, Apr 5, 2023 at 9:10 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 4 Apr 2023, at 22:04, Melanie Plageman <melanieplageman@gmail.com> wrote:
> >> +extern int VacuumCostLimit;
> >> +extern double VacuumCostDelay;
> >> ...
> >> -extern PGDLLIMPORT int VacuumCostLimit;
> >> -extern PGDLLIMPORT double VacuumCostDelay;
> >>
> >> Same with these, I don't think this is according to our default visibility.
> >> Moreover, I'm not sure it's a good idea to perform this rename.  This will keep
> >> VacuumCostLimit and VacuumCostDelay exported, but change their meaning.  Any
> >> external code referring to these thinking they are backing the GUCs will still
> >> compile, but may be broken in subtle ways.  Is there a reason for not keeping
> >> the current GUC variables and instead add net new ones?
> >
> > When VacuumCostLimit was the same variable in the code and for the GUC
> > vacuum_cost_limit, everytime we reload the config file, VacuumCostLimit
> > is overwritten. Autovacuum workers have to overwrite this value with the
> > appropriate one for themselves given the balancing logic and the value
> > of autovacuum_vacuum_cost_limit. However, the problem is, because you
> > can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
> > fall back to vacuum_cost_limit, we have to reference the value of
> > VacuumCostLimit when calculating the new autovacuum worker's cost limit
> > after a config reload.
> >
> > But, you have to be sure you *only* do this after a config reload when
> > the value of VacuumCostLimit is fresh and unmodified or you risk
> > dividing the value of VacuumCostLimit over and over. That means it is
> > unsafe to call functions updating the cost limit more than once.
> >
> > This orchestration wasn't as difficult when we only reloaded the config
> > file once every table. We were careful about it and also kept the
> > original "base" cost limit around from table_recheck_autovac(). However,
> > once we started reloading the config file more often, this no longer
> > works.
> >
> > By separating the variables modified when the gucs are set and the ones
> > used the code, we can make sure we always have the original value the
> > guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
> > whenever we need to reference it.
> >
> > That being said, perhaps we should document what extensions should do?
> > Do you think they will want to use the variables backing the gucs or to
> > be able to overwrite the variables being used in the code?
>
> I think I wasn't clear in my comment, sorry.  I don't have a problem with
> introducing a new variable to split the balanced value from the GUC value.
> What I don't think we should do is repurpose an exported symbol into doing a
> new thing.  In the case at hand I think VacuumCostLimit and VacuumCostDelay
> should remain the backing variables for the GUCs, with vacuum_cost_limit and
> vacuum_cost_delay carrying the balanced values.  So the inverse of what is in
> the patch now.
>
> The risk of these symbols being used in extensions might be very low but on
> principle it seems unwise to alter a symbol and risk subtle breakage.

I totally see what you are saying. The only complication is that all of
the other variables used in vacuum code are camel case and the GUCs
follow snake case -- as pointed out in a previous review comment by
Sawada-san:

> @@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
>   */
>  int                    VacuumCostLimit = 0;
>  double         VacuumCostDelay = -1;
> +static bool vacuum_can_reload_config = false;
>
> In vacuum.c, we use snake case for GUC parameters and camel case for
> other global variables, so it seems better to rename it
> VacuumCanReloadConfig. Sorry, that's my fault.

This is a less compelling argument than the risk of subtle breakage for
extension code, though.

I am, however, wondering if extensions expect to have access to the guc
variable or the global variable -- or both?

Left it as is in this version until we resolve the question.

> > Oh, also I've annotated these with PGDLLIMPORT too.
> >
> >> + * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
> >> + * invalid values?
> >> + */
> >> +int                    VacuumCostLimit = 0;
> >> +double         VacuumCostDelay = -1;
> >>
> >> I think the important part is to make sure they are never accessed without
> >> VacuumUpdateCosts having been called first.  I think that's the case here, but
> >> it's not entirely clear.  Do you see a codepath where that could happen?  If
> >> they are initialized to a sentinel value we also need to check for that, so
> >> initializing to the defaults from the corresponding GUCs seems better.
> >
> > I don't see a case where autovacuum could access these without calling
> > VacuumUpdateCosts() first. I think the other callers of
> > vacuum_delay_point() are the issue (gist/gin/hash/etc).
> >
> > It might need a bit more thought.
> >
> > My concern was that these variables correspond to multiple GUCs each
> > depending on the backend type, and those backends have different
> > defaults (e.g. autovac workers default cost delay is different than
> > client backend doing vacuum cost delay).
> >
> > However, what I have done in this version is initialize them to the
> > defaults for a client backend executing VACUUM or ANALYZE, since I am
> > fairly confident that autovacuum will not use them without calling
> > VacuumUpdateCosts().
>
> Another question along these lines, we only call AutoVacuumUpdateLimit() in
> case there is a sleep in vacuum_delay_point():
>
> +       /*
> +        * Balance and update limit values for autovacuum workers. We must
> +        * always do this in case the autovacuum launcher or another
> +        * autovacuum worker has recalculated the number of workers across
> +        * which we must balance the limit. This is done by the launcher when
> +        * launching a new worker and by workers before vacuuming each table.
> +        */
> +       AutoVacuumUpdateLimit();
>
> Shouldn't we always call that in case we had a config reload, or am I being
> thick?

We actually also call it from inside VacuumUpdateCosts(), which is
always called in the case of a config reload.

> >> +static double av_relopt_cost_delay = -1;
> >> +static int av_relopt_cost_limit = 0;
>
> Sorry, I didn't catch this earlier, shouldn't this be -1 to match the default
> value of autovacuum_vacuum_cost_limit?

Yea, this is a bit tricky. Initial values of -1 and 0 have the same
effect when we are referencing av_relopt_cost_limit in
AutoVacuumUpdateCostLimit(). However, I was trying to initialize both
av_relopt_cost_limit and av_relopt_cost_delay to "invalid"
values which were not the default for the associated autovacuum gucs,
since initializing av_relopt_cost_delay to the default for
autovacuum_vacuum_cost_delay (2 ms) would cause it to be used even if
storage params were not set for the relation.

I have updated the initial value to -1, as you suggested -- but I don't
know whether explaining what I just explained here in the comment above
it makes things more or less confusing.
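
To spell out what the -1 initialization buys us, the delay side ends up
behaving roughly like this (sketch only; the variable names here are
illustrative, and autovacuum_vac_cost_delay / vacuum_cost_delay are the
GUC-backing variables):

    /* -1 means "no storage parameter was set for this relation". */
    if (av_storage_param_cost_delay >= 0)
        VacuumCostDelay = av_storage_param_cost_delay;
    else if (autovacuum_vac_cost_delay >= 0)
        VacuumCostDelay = autovacuum_vac_cost_delay;
    else
        VacuumCostDelay = vacuum_cost_delay;  /* autovacuum GUC is -1: fall back */

Initializing the storage-parameter variable to the 2 ms autovacuum default
instead of -1 would make the first branch fire even when no parameter was
set for the relation, which is what I was trying to avoid.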

> >> These need a comment IMO, ideally one that explain why they are initialized to
> >> those values.
> >
> > I've added a comment.
>
> + * Variables to save the cost-related table options for the current relation
>
> The "table options" nomenclature is right now only used for FDW foreign table
> options, I think we should use "storage parameters" or "relation options" here.

I've updated these to "storage parameters" to match the docs. I poked
around looking for other places I referred to them as table options and
tried to fix those as well. I've also changed all relevant variable
names.

> >> +       /* There is at least 1 autovac worker (this worker). */
> >> +       Assert(nworkers_for_balance > 0);
> >>
> >> Is there a scenario where this is expected to fail?  If so I think this should
> >> be handled and not just an Assert.
> >
> > No, this isn't expected to happen because an autovacuum worker would
> > have called autovac_recalculate_workers_for_balance() before calling
> > VacuumUpdateCosts() (which calls AutoVacuumUpdateLimit()) in
> > do_autovacuum(). But, if someone were to move around or add a call to
> > VacuumUpdateCosts() there is a chance it could happen.
>
> Thinking more on this I'm tempted to recommend that we promote this to an
> elog(), mainly due to the latter.  An accidental call to VacuumUpdateCosts()
> doesn't seem entirely unlikely to happen

Makes sense. I've added a trivial elog ERROR, but I didn't spend quite
enough time thinking about what (if any) other context to include in it.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 5 Apr 2023, at 17:29, Melanie Plageman <melanieplageman@gmail.com> wrote:
>>
>> I think I wasn't clear in my comment, sorry.  I don't have a problem with
>> introducing a new variable to split the balanced value from the GUC value.
>> What I don't think we should do is repurpose an exported symbol into doing a
>> new thing.  In the case at hand I think VacuumCostLimit and VacuumCostDelay
>> should remain the backing variables for the GUCs, with vacuum_cost_limit and
>> vacuum_cost_delay carrying the balanced values.  So the inverse of what is in
>> the patch now.
>>
>> The risk of these symbols being used in extensions might be very low but on
>> principle it seems unwise to alter a symbol and risk subtle breakage.
>
> I totally see what you are saying. The only complication is that all of
> the other variables used in vacuum code are the camelcase and the gucs
> follow the snake case -- as pointed out in a previous review comment by
> Sawada-san:

Fair point.

>> @@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
>>  */
>> int                    VacuumCostLimit = 0;
>> double         VacuumCostDelay = -1;
>> +static bool vacuum_can_reload_config = false;
>>
>> In vacuum.c, we use snake case for GUC parameters and camel case for
>> other global variables, so it seems better to rename it
>> VacuumCanReloadConfig. Sorry, that's my fault.
>
> This is less of a compelling argument than subtle breakage for extension
> code, though.

How about renaming the variable into something which also acts a bit as
self-documentation of why there are two in the first place?  Perhaps
BalancedVacuumCostLimit or something similar (I'm terrible with names)?

> I am, however, wondering if extensions expect to have access to the guc
> variable or the global variable -- or both?

Extensions have access to all exported symbols, and I think it's not uncommon
for extension authors to expect to have access to at least read GUC variables.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Robert Haas
Date:
On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> Thanks all for the reviews.
>
> v16 attached. I put it together rather quickly, so there might be a few
> spurious whitespaces or similar. There is one rather annoying pgindent
> outlier that I have to figure out what to do about as well.
>
> The remaining functional TODOs that I know of are:
>
> - Resolve what to do about names of GUC and vacuum variables for cost
>   limit and cost delay (since it may affect extensions)
>
> - Figure out what to do about the logging message which accesses dboid
>   and tableoid (lock/no lock, where to put it, etc)
>
> - I see several places in docs which reference the balancing algorithm
>   for autovac workers. I did not read them in great detail, but we may
>   want to review them to see if any require updates.
>
> - Consider whether or not the initial two commits should just be
>   squashed with the third commit
>
> - Anything else reviewers are still unhappy with

I really like having the first couple of patches split out -- it makes
them super-easy to understand. A committer can always choose to squash
at commit time if they want. I kind of wish the patch set were split
up more, for even easier understanding. I don't think that's a thing
to get hung up on, but it's an opinion that I have.

I strongly agree with the goals of the patch set, as I understand
them. Being able to change the config file and SIGHUP the server and
have the new values affect running autovacuum workers seems pretty
huge. It would make it possible to solve problems that currently can
only be solved by using gdb on a production instance, which is not a
fun thing to be doing.

+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

To be honest, I think that the whole system where we divide the cost
limit across the workers is the wrong idea. Does anyone actually like
that behavior? This patch probably shouldn't touch that, just in the
interest of getting something done that is an improvement over where
we are now, but I think this behavior is really counterintuitive.
People expect that they can increase autovacuum_max_workers to get
more vacuuming done, and actually in most cases that does not work.
And if that behavior didn't exist, this patch would also be a whole
lot simpler. Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.
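
For illustration, a rough sketch of that style against the hunk above (the
local variable name is invented here; this is not code from the patch):

    int         limit;

    if (autovacuum_vac_cost_limit > 0)
        limit = autovacuum_vac_cost_limit;
    else
        limit = vacuum_cost_limit;

    /* Only balance limit if no cost-related storage parameters specified */
    if (!pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
    {
        int         nworkers_for_balance =
            pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);

        /* There is at least 1 autovac worker (this worker) */
        if (nworkers_for_balance <= 0)
            elog(ERROR, "nworkers_for_balance must be > 0");

        limit = Max(limit / nworkers_for_balance, 1);
    }

    VacuumCostLimit = limit;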

Daniel: Are you intending to commit this?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 5 Apr 2023, at 20:55, Robert Haas <robertmhaas@gmail.com> wrote:

> Again, I don't think this is something we should try to
> address right now under time pressure, but in the future, I think we
> should consider ripping this behavior out.

I would not be opposed to that, but I wholeheartedly agree that it's not the
job of this patch (or any patch at this point in the cycle).

> +               if (autovacuum_vac_cost_limit > 0)
> +                       VacuumCostLimit = autovacuum_vac_cost_limit;
> +               else
> +                       VacuumCostLimit = vacuum_cost_limit;
> +
> +               /* Only balance limit if no cost-related storage parameters specified */
> +               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
> +                       return;
> +               Assert(VacuumCostLimit > 0);
> +
> +               nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
> +
> +               /* There is at least 1 autovac worker (this worker). */
> +               if (nworkers_for_balance <= 0)
> +                       elog(ERROR, "nworkers_for_balance must be > 0");
> +
> +               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
>
> I think it would be better stylistically to use a temporary variable
> here and only assign the final value to VacuumCostLimit.

I can agree with that.  Another supertiny nitpick on the above is to not end a
single-line comment with a period.

> Daniel: Are you intending to commit this?

Yes, my plan is to get it in before feature freeze.  I notice now that I had
missed setting myself as committer in the CF to signal this intent, sorry about
that.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Robert Haas
Date:
On Wed, Apr 5, 2023 at 3:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > Daniel: Are you intending to commit this?
>
> Yes, my plan is to get it in before feature freeze.

All right, let's make it happen! I think this is pretty close to ready
to ship, and it would solve a problem that is real, annoying, and
serious.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> + /*
> + * Balance and update limit values for autovacuum workers. We must
> + * always do this in case the autovacuum launcher or another
> + * autovacuum worker has recalculated the number of workers across
> + * which we must balance the limit. This is done by the launcher when
> + * launching a new worker and by workers before vacuuming each table.
> + */
>
> I don't quite understand what's going on here. A big reason that I'm
> worried about this whole issue in the first place is that sometimes
> there's a vacuum going on a giant table and you can't get it to go
> fast. You want it to absorb new settings, and to do so quickly. I
> realize that this is about the number of workers, not the actual cost
> limit, so that makes what I'm about to say less important. But ... is
> this often enough? Like, the time before we move onto the next table
> could be super long. The time before a new worker is launched should
> be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> settings, so that's not horrible, but I'm kind of struggling to
> understand the rationale for this particular choice. Maybe it's fine.

VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
happen if a config reload is pending the next time vacuum_delay_point()
is called (which is pretty often -- roughly once per block vacuumed but
definitely more than once per table).

Relevant code is at the top of vacuum_delay_point():

    if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
        VacuumUpdateCosts();
    }

- Melanie



Re: Should vacuum process config file reload more often

From
Robert Haas
Date:
On Wed, Apr 5, 2023 at 3:44 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
> happen if a config reload is pending the next time vacuum_delay_point()
> is called (which is pretty often -- roughly once per block vacuumed but
> definitely more than once per table).
>
> Relevant code is at the top of vacuum_delay_point():
>
>     if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
>     {
>         ConfigReloadPending = false;
>         ProcessConfigFile(PGC_SIGHUP);
>         VacuumUpdateCosts();
>     }

Yeah, that all makes sense, and I did see that logic, but I'm
struggling to reconcile it with what that comment says.

Maybe I'm just confused about what that comment is trying to explain.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Should vacuum process config file reload more often

From
Peter Geoghegan
Date:
On Wed, Apr 5, 2023 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
> To be honest, I think that the whole system where we divide the cost
> limit across the workers is the wrong idea. Does anyone actually like
> that behavior? This patch probably shouldn't touch that, just in the
> interest of getting something done that is an improvement over where
> we are now, but I think this behavior is really counterintuitive.
> People expect that they can increase autovacuum_max_workers to get
> more vacuuming done, and actually in most cases that does not work.

I disagree. Increasing autovacuum_max_workers as a method of
increasing the overall aggressiveness of autovacuum seems like the
wrong idea. I'm sure that users do that at times, but they really
ought to have a better way of getting the same result.

ISTM that autovacuum_max_workers confuses the question of what the
maximum possible number of workers should ever be (in extreme cases)
with the question of how many workers might be a good idea given
present conditions.

> And if that behavior didn't exist, this patch would also be a whole
> lot simpler.

Probably, but the fact remains that the system level view of things is
mostly what matters. The competition between the amount of vacuuming
that we can afford to do right now and the amount of vacuuming that
we'd ideally be able to do really matters. In fact, I'd argue that the
amount of vacuuming that we'd ideally be able to do isn't a
particularly meaningful concept on its own. It's just too hard to
model what we need to do accurately -- emphasizing what we can afford
to do seems much more promising.

> Again, I don't think this is something we should try to
> address right now under time pressure, but in the future, I think we
> should consider ripping this behavior out.

-1. The delay stuff might not work as well as it should, but it at
least seems like roughly the right idea. The bigger problem seems to
be everything else -- the way that tuning autovacuum_max_workers kinda
makes sense (it shouldn't be an interesting tunable), and the problems
with the autovacuum.c scheduling being so primitive.

--
Peter Geoghegan



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 5 Apr 2023, at 22:19, Peter Geoghegan <pg@bowt.ie> wrote:

> The bigger problem seems to
> be everything else -- the way that tuning autovacuum_max_workers kinda
> makes sense (it shouldn't be an interesting tunable)

Not to derail this thread, and pre-empt a thread where this can be discussed in
its own context, but isn't that kind of the main problem?  Tuning autovacuum is
really complicated and one of the parameters that I think universally seems to
make sense to users is just autovacuum_max_workers.  I agree that it doesn't do
what most think it should, but a quick skim of the name and docs can probably
lead to a lot of folks trying to use it as a hammer.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Peter Geoghegan
Date:
On Wed, Apr 5, 2023 at 1:38 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> Not to derail this thread, and pre-empt a thread where this can be discussed in
> its own context, but isn't that kind of the main problem?  Tuning autovacuum is
> really complicated and one of the parameters that I think universally seems to
> make sense to users is just autovacuum_max_workers.  I agree that it doesn't do
> what most think it should, but a quick skim of the name and docs can probably
> lead to a lot of folks trying to use it as a hammer.

I think that I agree. I think that the difficulty of tuning autovacuum
is the actual real problem. (Or maybe it's just very closely related
to the real problem -- the precise definition doesn't seem important.)

There seems to be a kind of physics envy to some of these things.
False precision. The way that the mechanisms actually work (the
autovacuum scheduling, freeze_min_age, and quite a few other things)
*are* simple. But so are the rules of Conway's game of life, yet
people seem to have a great deal of difficulty predicting how it will
behave in any given situation. Any design that focuses on the
immediate consequences of any particular policy while ignoring second
order effects isn't going to work particularly well. Users ought to be
able to constrain the behavior of autovacuum using settings that
express what they want in high level terms. And VACUUM ought to have
much more freedom around finding the best way to meet those high level
goals over time (e.g., very loose rules about how much we need to
advance relfrozenxid by during any individual VACUUM).

--
Peter Geoghegan



Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 3:43 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > + /*
> > + * Balance and update limit values for autovacuum workers. We must
> > + * always do this in case the autovacuum launcher or another
> > + * autovacuum worker has recalculated the number of workers across
> > + * which we must balance the limit. This is done by the launcher when
> > + * launching a new worker and by workers before vacuuming each table.
> > + */
> >
> > I don't quite understand what's going on here. A big reason that I'm
> > worried about this whole issue in the first place is that sometimes
> > there's a vacuum going on a giant table and you can't get it to go
> > fast. You want it to absorb new settings, and to do so quickly. I
> > realize that this is about the number of workers, not the actual cost
> > limit, so that makes what I'm about to say less important. But ... is
> > this often enough? Like, the time before we move onto the next table
> > could be super long. The time before a new worker is launched should
> > be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> > settings, so that's not horrible, but I'm kind of struggling to
> > understand the rationale for this particular choice. Maybe it's fine.
>
> VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
> happen if a config reload is pending the next time vacuum_delay_point()
> is called (which is pretty often -- roughly once per block vacuumed but
> definitely more than once per table).
>
> Relevant code is at the top of vacuum_delay_point():
>
>     if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
>     {
>         ConfigReloadPending = false;
>         ProcessConfigFile(PGC_SIGHUP);
>         VacuumUpdateCosts();
>     }
>

Gah, I think I misunderstood you. You are saying that only calling
AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
not be enough. The frequency at which the number of workers changes will
likely be different. This is a good point.
It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

Hmm. Well, I don't think we want to call AutoVacuumUpdateCostLimit() on
every call to vacuum_delay_point(), though, do we? It includes two
atomic operations. Maybe that pales in comparison to what we are doing
on each page we are vacuuming. I haven't properly thought about it.

Is there some other relevant condition we can use to determine whether
or not to call AutoVacuumUpdateCostLimit() on a given invocation of
vacuum_delay_point()? Maybe something with naptime/max workers?

I'm not sure if there is a more reliable place than vacuum_delay_point()
for us to do this. I poked around heap_vacuum_rel(), but I think we
would want this cost limit update to happen table AM-agnostically.

Thank you for bringing this up!

- Melanie



Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Thu, Apr 6, 2023 at 12:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Thanks all for the reviews.
>
> v16 attached. I put it together rather quickly, so there might be a few
> spurious whitespaces or similar. There is one rather annoying pgindent
> outlier that I have to figure out what to do about as well.
>
> The remaining functional TODOs that I know of are:
>
> - Resolve what to do about names of GUC and vacuum variables for cost
>   limit and cost delay (since it may affect extensions)
>
> - Figure out what to do about the logging message which accesses dboid
>   and tableoid (lock/no lock, where to put it, etc)
>
> - I see several places in docs which reference the balancing algorithm
>   for autovac workers. I did not read them in great detail, but we may
>   want to review them to see if any require updates.
>
> - Consider whether or not the initial two commits should just be
>   squashed with the third commit
>
> - Anything else reviewers are still unhappy with
>
>
> On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > ---
> > > > -                if (worker->wi_proc != NULL)
> > > > -                        elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
> > > > -                                 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
> > > > -                                 worker->wi_dobalance ? "yes" : "no",
> > > > -                                 worker->wi_cost_limit, worker->wi_cost_limit_base,
> > > > -                                 worker->wi_cost_delay);
> > > >
> > > > I think it's better to keep this kind of log in some form for
> > > > debugging. For example, we can show these values of autovacuum workers
> > > > in VacuumUpdateCosts().
> > >
> > > I added a message to do_autovacuum() after calling VacuumUpdateCosts()
> > > in the loop vacuuming each table. That means it will happen once per
> > > table. It's not ideal that I had to move the call to VacuumUpdateCosts()
> > > behind the shared lock in that loop so that we could access the pid and
> > > such in the logging message after updating the cost and delay, but it is
> > > probably okay. Though no one is going to be changing those at this
> > > point, it still seemed better to access them under the lock.
> > >
> > > This does mean we won't log anything when we do change the values of
> > > VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
> > > adding some code to do that in VacuumUpdateCosts() (only when the value
> > > has changed not on every call to VacuumUpdateCosts())? Or perhaps we
> > > could add it in the config reload branch that is already in
> > > vacuum_delay_point()?
> >
> > Previously, we used to show the pid in the log since a worker/launcher
> > set other workers' delay costs. But now that the worker sets its delay
> > costs, we don't need to show the pid in the log. Also, I think it's
> > useful for debugging and investigating the system if we log it when
> > changing the values. The log I imagined to add was like:
> >
> > @@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
> >             VacuumCostDelay = vacuum_cost_delay;
> >
> >         AutoVacuumUpdateLimit();
> > +
> > +       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
> > +            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
> > +            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
> > +            VacuumCostLimit, VacuumCostDelay,
> > +            VacuumCostDelay > 0 ? "yes" : "no",
> > +            VacuumFailsafeActive ? "yes" : "no");
> >     }
> >     else
> >     {
>
> Makes sense. I've updated the log message to roughly what you suggested.
> I also realized I think it does make sense to call it in
> VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
> this. I haven't taken the lock though and can't decide if I must since
> they access dboid and tableoid -- those are not going to change at this
> point, but I still don't know if I can access them lock-free...
> Perhaps there is a way to condition it on the log level?
>
> If I have to take a lock, then I don't know if we should put these in
> VacuumUpdateCosts()...

I think we don't need to acquire a lock there as both values are
updated only by workers reporting this message. Also I agree with
where to put the log but I think the log message should start with
a lowercase letter:

+                elog(DEBUG2,
+                         "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+                         MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+                         pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+                         VacuumCostLimit, VacuumCostDelay,
+                         VacuumCostDelay > 0 ? "yes" : "no",
+                         VacuumFailsafeActive ? "yes" : "no");

Some minor comments on 0003:

+/*
+ * autovac_recalculate_workers_for_balance
+ *             Recalculate the number of workers to consider, given cost-related
+ *             storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */

Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
the beginning of this function?

---
                 /* rebalance in case the default cost parameters changed */
-                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-                autovac_balance_cost();
+                LWLockAcquire(AutovacuumLock, LW_SHARED);
+                autovac_recalculate_workers_for_balance();
                 LWLockRelease(AutovacuumLock);

Do we really need to have the autovacuum launcher recalculate
av_nworkersForBalance after reloading the config file? Since the cost
parameters are not used in autovac_recalculate_workers_for_balance()
the comment also needs to be updated.

---
+                /*
+                 * Balance and update limit values for autovacuum workers. We must
+                 * always do this in case the autovacuum launcher or another
+                 * autovacuum worker has recalculated the number of workers across
+                 * which we must balance the limit. This is done by the launcher when
+                 * launching a new worker and by workers before vacuuming each table.
+                 */
+                AutoVacuumUpdateCostLimit();

I think the last sentence is not correct. IIUC recalculation of
av_nworkersForBalance is done by the launcher after a worker finishes
and by workers before vacuuming each table.

---
It's not a problem of this patch, but IIUC since we don't reset
wi_dobalance after vacuuming each table we use the last value of
wi_dobalance for performing autovacuum work items. At the end of the loop
over tables in do_autovacuum() we have the following code that explains why
we don't reset wi_dobalance:

        /*
         * Remove my info from shared memory.  We could, but intentionally
         * don't, unset wi_dobalance on the assumption that we are more likely
         * than not to vacuum a table with no cost-related storage parameters
         * next, so we don't want to give up our share of I/O for a very short
         * interval and thereby thrash the global balance.
         */
        LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
        MyWorkerInfo->wi_tableoid = InvalidOid;
        MyWorkerInfo->wi_sharedrel = false;
        LWLockRelease(AutovacuumScheduleLock);

Assuming we agree with that, probably we need to reset it to true
after vacuuming all tables?

0001 and 0002 patches look good to me except for the renaming GUCs
stuff as the discussion is ongoing.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 6 Apr 2023, at 08:39, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> Also I agree with
> where to put the log but I think the log message should start with
> a lowercase letter:
>
> +                elog(DEBUG2,
> +                         "Autovacuum VacuumUpdateCosts(db=%u, rel=%u,

In principle I agree, but in this case Autovacuum is a name and should IMO in
userfacing messages start with capital A.

> +/*
> + * autovac_recalculate_workers_for_balance
> + *             Recalculate the number of workers to consider, given cost-related
> + *             storage parameters and the current number of active workers.
> + *
> + * Caller must hold the AutovacuumLock in at least shared mode to access
> + * worker->wi_proc.
> + */
>
> Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
> the beginning of this function?

That's probably not a bad idea.

> ---
>                 /* rebalance in case the default cost parameters changed */
> -                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> -                autovac_balance_cost();
> +                LWLockAcquire(AutovacuumLock, LW_SHARED);
> +                autovac_recalculate_workers_for_balance();
>                 LWLockRelease(AutovacuumLock);
>
> Do we really need to have the autovacuum launcher recalculate
> av_nworkersForBalance after reloading the config file? Since the cost
> parameters are not used in autovac_recalculate_workers_for_balance()
> the comment also needs to be updated.

If I understand this comment right, there was a discussion upthread that simply
doing it in both launcher and worker simplifies the code with little overhead.
A comment can reflect that choice though.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 11:10 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Apr 5, 2023 at 3:43 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
> > On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > + /*
> > > + * Balance and update limit values for autovacuum workers. We must
> > > + * always do this in case the autovacuum launcher or another
> > > + * autovacuum worker has recalculated the number of workers across
> > > + * which we must balance the limit. This is done by the launcher when
> > > + * launching a new worker and by workers before vacuuming each table.
> > > + */
> > >
> > > I don't quite understand what's going on here. A big reason that I'm
> > > worried about this whole issue in the first place is that sometimes
> > > there's a vacuum going on a giant table and you can't get it to go
> > > fast. You want it to absorb new settings, and to do so quickly. I
> > > realize that this is about the number of workers, not the actual cost
> > > limit, so that makes what I'm about to say less important. But ... is
> > > this often enough? Like, the time before we move onto the next table
> > > could be super long. The time before a new worker is launched should
> > > be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> > > settings, so that's not horrible, but I'm kind of struggling to
> > > understand the rationale for this particular choice. Maybe it's fine.
> >
> > VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
> > happen if a config reload is pending the next time vacuum_delay_point()
> > is called (which is pretty often -- roughly once per block vacuumed but
> > definitely more than once per table).
> >
> > Relevant code is at the top of vacuum_delay_point():
> >
> >     if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
> >     {
> >         ConfigReloadPending = false;
> >         ProcessConfigFile(PGC_SIGHUP);
> >         VacuumUpdateCosts();
> >     }
> >
>
> Gah, I think I misunderstood you. You are saying that only calling
> AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
> not be enough. The frequency at which the number of workers changes will
> likely be different. This is a good point.
> It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

A not fully baked idea for a solution:

Why not keep the balanced limit in the atomic instead of the number of
workers for balance. If we expect all of the workers to have the same
value for cost limit, then why would we just count the workers and not
also do the division and store that in the atomic variable. We are
worried about the division not being done often enough, not the number
of workers being out of date. This solves that, right?

- Melanie



Re: Should vacuum process config file reload more often

From
Robert Haas
Date:
On Thu, Apr 6, 2023 at 11:52 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> > Gah, I think I misunderstood you. You are saying that only calling
> > AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
> > not be enough. The frequency at which the number of workers changes will
> > likely be different. This is a good point.
> > It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...
>
> A not fully baked idea for a solution:
>
> Why not keep the balanced limit in the atomic instead of the number of
> workers for balance. If we expect all of the workers to have the same
> value for cost limit, then why would we just count the workers and not
> also do the division and store that in the atomic variable. We are
> worried about the division not being done often enough, not the number
> of workers being out of date. This solves that, right?

A bird in the hand is worth two in the bush, though. We don't really
have time to redesign the patch before feature freeze, and I can't
convince myself that there's a big enough problem with what you
already did that it would be worth putting off fixing this for another
year. Reading your newer emails, I think that the answer to my
original question is "we don't want to do it at every
vacuum_delay_point because it might be too costly," which is
reasonable.

I don't particularly like this new idea, either, I think. While it may
be true that we expect all the workers to come up with the same
answer, they need not, because rereading the configuration file isn't
synchronized. It would be pretty lame if a worker that had reread an
updated value from the configuration file recomputed the value, and
then another worker that still had an older value recalculated it
again just afterward. Keeping only the number of workers in memory
avoids the possibility of thrashing around in situations like that.

I do kind of wonder if it would be possible to rejigger things so that
we didn't have to keep recalculating av_nworkersForBalance, though.
Perhaps now is not the time due to the impending freeze, but maybe we
should explore maintaining that value in such a way that it is correct
at every instant, instead of recalculating it at intervals.
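
Purely as a sketch of that direction (the helper name here is invented and
this is not from the patch), the shared counter could be adjusted whenever a
worker's participation in balancing changes, rather than recounted:

    /* hypothetical helper: call when this worker starts or stops
     * participating in cost-limit balancing */
    static void
    balance_participation_changed(bool now_participating)
    {
        if (now_participating)
            pg_atomic_fetch_add_u32(&AutoVacuumShmem->av_nworkersForBalance, 1);
        else
            pg_atomic_fetch_sub_u32(&AutoVacuumShmem->av_nworkersForBalance, 1);
    }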

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Should vacuum process config file reload more often

From
Robert Haas
Date:
On Wed, Apr 5, 2023 at 4:59 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I think that I agree. I think that the difficulty of tuning autovacuum
> is the actual real problem. (Or maybe it's just very closely related
> to the real problem -- the precise definition doesn't seem important.)

I agree, and I think that bad choices around what the parameters do
are a big part of the problem. autovacuum_max_workers is one example
of that, but there are a bunch of others. It's not at all intuitive
that if your database gets really big you either need to raise
autovacuum_vacuum_cost_limit or lower autovacuum_vacuum_cost_delay.
And, it's not intuitive either that raising autovacuum_max_workers
doesn't increase the amount of vacuuming that gets done. In my
experience, it's very common for people to observe that autovacuum is
running constantly, and to figure out that the number of running
workers is equal to autovacuum_max_workers at all times, and to then
conclude that they need more workers. So they raise
autovacuum_max_workers and nothing gets any better. In fact, things
might get *worse*, because the time required to complete vacuuming of
a large table can increase if the available bandwidth is potentially
spread across more workers, and it's very often the time to vacuum the
largest tables that determines whether things hold together adequately
or not.

This kind of stuff drives me absolutely batty. It's impossible to make
every database behavior completely intuitive, but here we have a
parameter that seems like it is exactly the right thing to solve the
problem that the user knows they have, and it actually does nothing on
a good day and causes a regression on a bad one. That's incredibly
poor design.

The way it works at the implementation level is pretty kooky, too. The
available resources are split between the workers, but if any of the
relevant vacuum parameters are set for the table currently being
vacuumed, then that worker gets the full resources configured for that
table, and everyone else divides up the amount that's configured
globally. So if you went and set the cost delay and cost limit for all
of your tables to exactly the same values that are configured
globally, you'd vacuum 3 times faster than if you relied on the
identical global defaults (or N times faster, where N is the value
you've picked for autovacuum_max_workers). If you have one really big
table that requires continuous vacuuming, you could slow down
vacuuming on that table through manual configuration settings and
still end up speeding up vacuuming overall, because the remaining
workers would be dividing the budget implied by the default settings
among N-1 workers instead of N workers. As far as I can see, none of
this is documented, which is perhaps for the best, because IMV it
makes no sense.
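
To make the arithmetic concrete, here is a toy illustration (standalone C,
not PostgreSQL code, with invented numbers) of the split described above:

    #include <stdio.h>

    int
    main(void)
    {
        int         limit = 200;    /* cost limit configured globally */
        int         nworkers = 3;   /* autovacuum_max_workers */

        /* workers relying on the global default share the budget */
        int         shared = limit / nworkers;

        printf("balanced workers: %d each, %d combined\n",
               shared, shared * nworkers);

        /* workers whose tables set the same values per-table keep it all */
        printf("per-table workers: %d each, %d combined\n",
               limit, limit * nworkers);
        return 0;
    }

With the numbers above that is 66 apiece (about 200 combined) versus 200
apiece (600 combined), which is the "3 times faster" effect described.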

I think we need to move more toward a model where VACUUM just keeps
up. Emergency mode is a step in that direction, because the definition
of an emergency is that we're definitely not keeping up, but I think
we need something less Boolean. If the database gets bigger or smaller
or more or less active, autovacuum should somehow just adjust to that,
without so much manual fiddling. I think it's good to have the
possibility of some manual fiddling to handle problematic situations,
but you shouldn't have to do it just because you made a table bigger.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 6 Apr 2023, at 19:18, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Apr 6, 2023 at 11:52 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>>> Gah, I think I misunderstood you. You are saying that only calling
>>> AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
>>> not be enough. The frequency at which the number of workers changes will
>>> likely be different. This is a good point.
>>> It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...
>>
>> A not fully baked idea for a solution:
>>
>> Why not keep the balanced limit in the atomic instead of the number of
>> workers for balance. If we expect all of the workers to have the same
>> value for cost limit, then why would we just count the workers and not
>> also do the division and store that in the atomic variable. We are
>> worried about the division not being done often enough, not the number
>> of workers being out of date. This solves that, right?
>
> A bird in the hand is worth two in the bush, though. We don't really
> have time to redesign the patch before feature freeze, and I can't
> convince myself that there's a big enough problem with what you
> already did that it would be worth putting off fixing this for another
> year.

+1, I'd rather see we did a conservative version of the feature first and
expand upon it in the 17 cycle.

> Reading your newer emails, I think that the answer to my
> original question is "we don't want to do it at every
> vacuum_delay_point because it might be too costly," which is
> reasonable.

I think we kind of need to get to that granularity eventually, but it's not a
showstopper for this feature, and can probably benefit from being done in the
context of a larger av-worker re-think (the importance of which was discussed
downthread).

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
v17 attached does not yet fix the logging problem or variable naming
problem.

I have not changed where AutoVacuumUpdateCostLimit() is called either.

This is effectively just a round of cleanup. I hope I have managed to
address all other code review feedback so far, though some may have
slipped through the cracks.

On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> + /*
> + * Balance and update limit values for autovacuum workers. We must
> + * always do this in case the autovacuum launcher or another
> + * autovacuum worker has recalculated the number of workers across
> + * which we must balance the limit. This is done by the launcher when
> + * launching a new worker and by workers before vacuuming each table.
> + */
>
> I don't quite understand what's going on here. A big reason that I'm
> worried about this whole issue in the first place is that sometimes
> there's a vacuum going on a giant table and you can't get it to go
> fast. You want it to absorb new settings, and to do so quickly. I
> realize that this is about the number of workers, not the actual cost
> limit, so that makes what I'm about to say less important. But ... is
> this often enough? Like, the time before we move onto the next table
> could be super long. The time before a new worker is launched should
> be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> settings, so that's not horrible, but I'm kind of struggling to
> understand the rationale for this particular choice. Maybe it's fine.

I've at least updated this comment to be more correct/less misleading.

>
> +               if (autovacuum_vac_cost_limit > 0)
> +                       VacuumCostLimit = autovacuum_vac_cost_limit;
> +               else
> +                       VacuumCostLimit = vacuum_cost_limit;
> +
> +               /* Only balance limit if no cost-related storage parameters specified */
> +               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
> +                       return;
> +               Assert(VacuumCostLimit > 0);
> +
> +               nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
> +
> +               /* There is at least 1 autovac worker (this worker). */
> +               if (nworkers_for_balance <= 0)
> +                       elog(ERROR, "nworkers_for_balance must be > 0");
> +
> +               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
>
> I think it would be better stylistically to use a temporary variable
> here and only assign the final value to VacuumCostLimit.

I tried that and thought it added confusing clutter. If it is a code
cleanliness issue, I am willing to change it, though.

On Wed, Apr 5, 2023 at 3:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 5 Apr 2023, at 20:55, Robert Haas <robertmhaas@gmail.com> wrote:
>
> > Again, I don't think this is something we should try to
> > address right now under time pressure, but in the future, I think we
> > should consider ripping this behavior out.
>
> I would not be opposed to that, but I wholeheartedly agree that it's not the
> job of this patch (or any patch at this point in the cycle).
>
> > +               if (autovacuum_vac_cost_limit > 0)
> > +                       VacuumCostLimit = autovacuum_vac_cost_limit;
> > +               else
> > +                       VacuumCostLimit = vacuum_cost_limit;
> > +
> > +               /* Only balance limit if no cost-related storage parameters specified */
> > +               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
> > +                       return;
> > +               Assert(VacuumCostLimit > 0);
> > +
> > +               nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
> > +
> > +               /* There is at least 1 autovac worker (this worker). */
> > +               if (nworkers_for_balance <= 0)
> > +                       elog(ERROR, "nworkers_for_balance must be > 0");
> > +
> > +               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
> >
> > I think it would be better stylistically to use a temporary variable
> > here and only assign the final value to VacuumCostLimit.
>
> I can agree with that.  Another supertiny nitpick on the above is to not end a
> single-line comment with a period.

I have fixed this.

On Thu, Apr 6, 2023 at 2:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 6, 2023 at 12:29 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > Thanks all for the reviews.
> >
> > v16 attached. I put it together rather quickly, so there might be a few
> > spurious whitespaces or similar. There is one rather annoying pgindent
> > outlier that I have to figure out what to do about as well.
> >
> > The remaining functional TODOs that I know of are:
> >
> > - Resolve what to do about names of GUC and vacuum variables for cost
> >   limit and cost delay (since it may affect extensions)
> >
> > - Figure out what to do about the logging message which accesses dboid
> >   and tableoid (lock/no lock, where to put it, etc)
> >
> > - I see several places in docs which reference the balancing algorithm
> >   for autovac workers. I did not read them in great detail, but we may
> >   want to review them to see if any require updates.
> >
> > - Consider whether or not the initial two commits should just be
> >   squashed with the third commit
> >
> > - Anything else reviewers are still unhappy with
> >
> >
> > On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
> > > <melanieplageman@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > ---
> > > > > -                if (worker->wi_proc != NULL)
> > > > > -                        elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
> > > > > -                                 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
> > > > > -                                 worker->wi_dobalance ? "yes" : "no",
> > > > > -                                 worker->wi_cost_limit, worker->wi_cost_limit_base,
> > > > > -                                 worker->wi_cost_delay);
> > > > >
> > > > > I think it's better to keep this kind of log in some form for
> > > > > debugging. For example, we can show these values of autovacuum workers
> > > > > in VacuumUpdateCosts().
> > > >
> > > > I added a message to do_autovacuum() after calling VacuumUpdateCosts()
> > > > in the loop vacuuming each table. That means it will happen once per
> > > > table. It's not ideal that I had to move the call to VacuumUpdateCosts()
> > > > behind the shared lock in that loop so that we could access the pid and
> > > > such in the logging message after updating the cost and delay, but it is
> > > > probably okay. Though no one is going to be changing those at this
> > > > point, it still seemed better to access them under the lock.
> > > >
> > > > This does mean we won't log anything when we do change the values of
> > > > VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
> > > > adding some code to do that in VacuumUpdateCosts() (only when the value
> > > > has changed not on every call to VacuumUpdateCosts())? Or perhaps we
> > > > could add it in the config reload branch that is already in
> > > > vacuum_delay_point()?
> > >
> > > Previously, we used to show the pid in the log since a worker/launcher
> > > set other workers' delay costs. But now that the worker sets its delay
> > > costs, we don't need to show the pid in the log. Also, I think it's
> > > useful for debugging and investigating the system if we log it when
> > > changing the values. The log I imagined to add was like:
> > >
> > > @@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
> > >             VacuumCostDelay = vacuum_cost_delay;
> > >
> > >         AutoVacuumUpdateLimit();
> > > +
> > > +       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
> > > +            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
> > > +            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
> > > +            VacuumCostLimit, VacuumCostDelay,
> > > +            VacuumCostDelay > 0 ? "yes" : "no",
> > > +            VacuumFailsafeActive ? "yes" : "no");
> > >     }
> > >     else
> > >     {
> >
> > Makes sense. I've updated the log message to roughly what you suggested.
> > I also realized I think it does make sense to call it in
> > VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
> > this. I haven't taken the lock though and can't decide if I must since
> > they access dboid and tableoid -- those are not going to change at this
> > point, but I still don't know if I can access them lock-free...
> > Perhaps there is a way to condition it on the log level?
> >
> > If I have to take a lock, then I don't know if we should put these in
> > VacuumUpdateCosts()...
>
> I think we don't need to acquire a lock there as both values are
> updated only by workers reporting this message.

I dunno. I just don't feel that comfortable saying, oh it's okay to
access these without a lock probably. I propose we do one of the
following:

- Take a shared lock inside VacuumUpdateCosts() (it is not called on every
    call to vacuum_delay_point()) before reading from these variables.

    Pros:
        - We will log whenever there is a change to these parameters
    Cons:
        - This adds overhead in the common case when log level is < DEBUG2.
            Is there a way to check the log level before taking the lock?
        - Acquiring the lock inside the function is inconsistent with the
            pattern that some of the other autovacuum functions requiring
            locks use (they assume you have a lock if needed inside of the
            function). But, we could assert that the lock is not already held.
        - If we later decide we don't like this choice and want to move the
            logging elsewhere, it will necessarily log less frequently which
            seems like a harder change to make than logging more frequently.

- Move this logging into the loop through relations in do_autovacuum()
    and the config reload code and take the shared lock before doing the
    logging.

    Pros:
        - Seems safe and not expensive
        - Covers most of the times we would want the logging
    Cons:
        - duplicates logging in two places
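
On the question under the first option of whether the log level can be
checked before taking the lock: a rough sketch of one possibility, using
message_level_is_interesting() from utils/elog.h (just a sketch, not
something in the patch):

    if (message_level_is_interesting(DEBUG2))
    {
        LWLockAcquire(AutovacuumLock, LW_SHARED);
        elog(DEBUG2, "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, ...)",
             MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid);
        LWLockRelease(AutovacuumLock);
    }

That would keep the lock acquisition out of the common path when DEBUG2
output is not actually being emitted.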

> Some minor comments on 0003:
>
> +/*
> + * autovac_recalculate_workers_for_balance
> + *             Recalculate the number of workers to consider, given cost-related
> + *             storage parameters and the current number of active workers.
> + *
> + * Caller must hold the AutovacuumLock in at least shared mode to access
> + * worker->wi_proc.
> + */
>
> Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
> the beginning of this function?

I've added this. It is called infrequently enough to be okay, I think.


>                  /* rebalance in case the default cost parameters changed */
> -                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> -                autovac_balance_cost();
> +                LWLockAcquire(AutovacuumLock, LW_SHARED);
> +                autovac_recalculate_workers_for_balance();
>                  LWLockRelease(AutovacuumLock);
>
> Do we really need to have the autovacuum launcher recalculate
> av_nworkersForBalance after reloading the config file? Since the cost
> parameters are not used inautovac_recalculate_workers_for_balance()
> the comment also needs to be updated.

Yep, almost certainly don't need this. I've removed this call to
autovac_recalculate_workers_for_balance().

> +                /*
> +                 * Balance and update limit values for autovacuum workers. We must
> +                 * always do this in case the autovacuum launcher or another
> +                 * autovacuum worker has recalculated the number of workers across
> +                 * which we must balance the limit. This is done by the launcher when
> +                 * launching a new worker and by workers before vacuuming each table.
> +                 */
> +                AutoVacuumUpdateCostLimit();
>
> I think the last sentence is not correct. IIUC recalculation of
> av_nworkersForBalance is done by the launcher after a worker finishes
> and by workers before vacuuming each table.

Yes, you are right. However, I think the comment was generally
misleading and I have reworded it.

> It's not a problem of this patch, but IIUC since we don't reset
> wi_dobalance after vacuuming each table we use the last value of
> wi_dobalance for performing autovacuum work items. At the end of the loop
> over tables in do_autovacuum() we have the following code that explains why
> we don't reset wi_dobalance:
>
>         /*
>          * Remove my info from shared memory.  We could, but intentionally
>          * don't, unset wi_dobalance on the assumption that we are more likely
>          * than not to vacuum a table with no cost-related storage parameters
>          * next, so we don't want to give up our share of I/O for a very short
>          * interval and thereby thrash the global balance.
>          */
>         LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
>         MyWorkerInfo->wi_tableoid = InvalidOid;
>         MyWorkerInfo->wi_sharedrel = false;
>         LWLockRelease(AutovacuumScheduleLock);
>
> Assuming we agree with that, probably we need to reset it to true
> after vacuuming all tables?

Ah, great point. I have done this.

On Thu, Apr 6, 2023 at 8:29 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 6 Apr 2023, at 08:39, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > Also I agree with
> > where to put the log but I think the log message should start with
> > a lowercase letter:
> >
> > +                elog(DEBUG2,
> > +                         "Autovacuum VacuumUpdateCosts(db=%u, rel=%u,
>
> In principle I agree, but in this case Autovacuum is a name and should IMO in
> userfacing messages start with capital A.

I've left this unchanged while I agonize over what to do with the
placement of the log message in general. But I am happy to keep it
uppercase.

> > +/*
> > + * autovac_recalculate_workers_for_balance
> > + *             Recalculate the number of workers to consider, given cost-related
> > + *             storage parameters and the current number of active workers.
> > + *
> > + * Caller must hold the AutovacuumLock in at least shared mode to access
> > + * worker->wi_proc.
> > + */
> >
> > Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
> > the beginning of this function?
>
> That's probably not a bad idea.

Done.

> > ---
> >                 /* rebalance in case the default cost parameters changed */
> > -                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> > -                autovac_balance_cost();
> > +                LWLockAcquire(AutovacuumLock, LW_SHARED);
> > +                autovac_recalculate_workers_for_balance();
> >                 LWLockRelease(AutovacuumLock);
> >
> > Do we really need to have the autovacuum launcher recalculate
> > av_nworkersForBalance after reloading the config file? Since the cost
> > parameters are not used in autovac_recalculate_workers_for_balance()
> > the comment also needs to be updated.
>
> If I understand this comment right, there was a discussion upthread that simply
> doing it in both launcher and worker simplifies the code with little overhead.
> A comment can reflect that choice though.

Yes, but now that this function no longer deals with the cost limit and
delay values itself, we can remove it.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
I think attached v18 addresses all outstanding issues except a run
through the docs making sure all mentions of the balancing algorithm are
still correct.

On Wed, Apr 5, 2023 at 9:10 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 4 Apr 2023, at 22:04, Melanie Plageman <melanieplageman@gmail.com> wrote:
> >> +extern int VacuumCostLimit;
> >> +extern double VacuumCostDelay;
> >> ...
> >> -extern PGDLLIMPORT int VacuumCostLimit;
> >> -extern PGDLLIMPORT double VacuumCostDelay;
> >>
> >> Same with these, I don't think this is according to our default visibility.
> >> Moreover, I'm not sure it's a good idea to perform this rename.  This will keep
> >> VacuumCostLimit and VacuumCostDelay exported, but change their meaning.  Any
> >> external code referring to these thinking they are backing the GUCs will still
> >> compile, but may be broken in subtle ways.  Is there a reason for not keeping
> >> the current GUC variables and instead add net new ones?
> >
> > When VacuumCostLimit was the same variable in the code and for the GUC
> > vacuum_cost_limit, every time we reload the config file, VacuumCostLimit
> > is overwritten. Autovacuum workers have to overwrite this value with the
> > appropriate one for themselves given the balancing logic and the value
> > of autovacuum_vacuum_cost_limit. However, the problem is, because you
> > can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
> > fall back to vacuum_cost_limit, we have to reference the value of
> > VacuumCostLimit when calculating the new autovacuum worker's cost limit
> > after a config reload.
> >
> > But, you have to be sure you *only* do this after a config reload when
> > the value of VacuumCostLimit is fresh and unmodified or you risk
> > dividing the value of VacuumCostLimit over and over. That means it is
> > unsafe to call functions updating the cost limit more than once.
> >
> > This orchestration wasn't as difficult when we only reloaded the config
> > file once per table. We were careful about it and also kept the
> > original "base" cost limit around from table_recheck_autovac(). However,
> > once we started reloading the config file more often, this no longer
> > works.
> >
> > By separating the variables modified when the gucs are set and the ones
> > used in the code, we can make sure we always have the original value the
> > guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
> > whenever we need to reference it.
> >
> > That being said, perhaps we should document what extensions should do?
> > Do you think they will want to use the variables backing the gucs or to
> > be able to overwrite the variables being used in the code?
>
> I think I wasn't clear in my comment, sorry.  I don't have a problem with
> introducing a new variable to split the balanced value from the GUC value.
> What I don't think we should do is repurpose an exported symbol into doing a
> new thing.  In the case at hand I think VacuumCostLimit and VacuumCostDelay
> should remain the backing variables for the GUCs, with vacuum_cost_limit and
> vacuum_cost_delay carrying the balanced values.  So the inverse of what is in
> the patch now.
>
> The risk of these symbols being used in extensions might be very low but on
> principle it seems unwise to alter a symbol and risk subtle breakage.

In attached v18, I have flipped them. Existing (in master) GUCs which
were exported for VacuumCostLimit and VacuumCostDelay retain their names
and new globals vacuum_cost_limit and vacuum_cost_delay have been
introduced for use in the code.

Flipping these kind of melted my mind, so I could definitely use another
set of eyes double checking that the correct ones are being used in the
correct places throughout 0002 and 0003.

On Thu, Apr 6, 2023 at 3:09 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> v17 attached does not yet fix the logging problem or variable naming
> problem.
>
> I have not changed where AutoVacuumUpdateCostLimit() is called either.
>
> This is effectively just a round of cleanup. I hope I have managed to
> address all other code review feedback so far, though some may have
> slipped through the cracks.
>
> On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> > + /*
> > + * Balance and update limit values for autovacuum workers. We must
> > + * always do this in case the autovacuum launcher or another
> > + * autovacuum worker has recalculated the number of workers across
> > + * which we must balance the limit. This is done by the launcher when
> > + * launching a new worker and by workers before vacuuming each table.
> > + */
> >
> > I don't quite understand what's going on here. A big reason that I'm
> > worried about this whole issue in the first place is that sometimes
> > there's a vacuum going on a giant table and you can't get it to go
> > fast. You want it to absorb new settings, and to do so quickly. I
> > realize that this is about the number of workers, not the actual cost
> > limit, so that makes what I'm about to say less important. But ... is
> > this often enough? Like, the time before we move onto the next table
> > could be super long. The time before a new worker is launched should
> > be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> > settings, so that's not horrible, but I'm kind of struggling to
> > understand the rationale for this particular choice. Maybe it's fine.
>
> I've at least updated this comment to be more correct/less misleading.
>
> >
> > +               if (autovacuum_vac_cost_limit > 0)
> > +                       VacuumCostLimit = autovacuum_vac_cost_limit;
> > +               else
> > +                       VacuumCostLimit = vacuum_cost_limit;
> > +
> > +               /* Only balance limit if no cost-related storage parameters specified */
> > +               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
> > +                       return;
> > +               Assert(VacuumCostLimit > 0);
> > +
> > +               nworkers_for_balance = pg_atomic_read_u32(
> > +                       &AutoVacuumShmem->av_nworkersForBalance);
> > +
> > +               /* There is at least 1 autovac worker (this worker). */
> > +               if (nworkers_for_balance <= 0)
> > +                       elog(ERROR, "nworkers_for_balance must be > 0");
> > +
> > +               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
> >
> > I think it would be better stylistically to use a temporary variable
> > here and only assign the final value to VacuumCostLimit.
>
> I tried that and thought it added confusing clutter. If it is a code
> cleanliness issue, I am willing to change it, though.
>
> On Wed, Apr 5, 2023 at 3:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > > On 5 Apr 2023, at 20:55, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > > Again, I don't think this is something we should try to
> > > address right now under time pressure, but in the future, I think we
> > > should consider ripping this behavior out.
> >
> > I would not be opposed to that, but I wholeheartedly agree that it's not the
> > job of this patch (or any patch at this point in the cycle).
> >
> > > +               if (autovacuum_vac_cost_limit > 0)
> > > +                       VacuumCostLimit = autovacuum_vac_cost_limit;
> > > +               else
> > > +                       VacuumCostLimit = vacuum_cost_limit;
> > > +
> > > +               /* Only balance limit if no cost-related storage parameters specified */
> > > +               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
> > > +                       return;
> > > +               Assert(VacuumCostLimit > 0);
> > > +
> > > +               nworkers_for_balance = pg_atomic_read_u32(
> > > +                       &AutoVacuumShmem->av_nworkersForBalance);
> > > +
> > > +               /* There is at least 1 autovac worker (this worker). */
> > > +               if (nworkers_for_balance <= 0)
> > > +                       elog(ERROR, "nworkers_for_balance must be > 0");
> > > +
> > > +               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
> > >
> > > I think it would be better stylistically to use a temporary variable
> > > here and only assign the final value to VacuumCostLimit.
> >
> > I can agree with that.  Another supertiny nitpick on the above is to not end a
> > single-line comment with a period.
>
> I have fixed this.
>
> On Thu, Apr 6, 2023 at 2:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 6, 2023 at 12:29 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > Thanks all for the reviews.
> > >
> > > v16 attached. I put it together rather quickly, so there might be a few
> > > spurious whitespaces or similar. There is one rather annoying pgindent
> > > outlier that I have to figure out what to do about as well.
> > >
> > > The remaining functional TODOs that I know of are:
> > >
> > > - Resolve what to do about names of GUC and vacuum variables for cost
> > >   limit and cost delay (since it may affect extensions)
> > >
> > > - Figure out what to do about the logging message which accesses dboid
> > >   and tableoid (lock/no lock, where to put it, etc)
> > >
> > > - I see several places in docs which reference the balancing algorithm
> > >   for autovac workers. I did not read them in great detail, but we may
> > >   want to review them to see if any require updates.
> > >
> > > - Consider whether or not the initial two commits should just be
> > >   squashed with the third commit
> > >
> > > - Anything else reviewers are still unhappy with
> > >
> > >
> > > On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
> > > > <melanieplageman@gmail.com> wrote:
> > > > >
> > > > > On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > ---
> > > > > > -                if (worker->wi_proc != NULL)
> > > > > > -                        elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
> > > > > > -                                 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
> > > > > > -                                 worker->wi_dobalance ? "yes" : "no",
> > > > > > -                                 worker->wi_cost_limit, worker->wi_cost_limit_base,
> > > > > > -                                 worker->wi_cost_delay);
> > > > > >
> > > > > > I think it's better to keep this kind of log in some form for
> > > > > > debugging. For example, we can show these values of autovacuum workers
> > > > > > in VacuumUpdateCosts().
> > > > >
> > > > > I added a message to do_autovacuum() after calling VacuumUpdateCosts()
> > > > > in the loop vacuuming each table. That means it will happen once per
> > > > > table. It's not ideal that I had to move the call to VacuumUpdateCosts()
> > > > > behind the shared lock in that loop so that we could access the pid and
> > > > > such in the logging message after updating the cost and delay, but it is
> > > > > probably okay. Though no one is going to be changing those at this
> > > > > point, it still seemed better to access them under the lock.
> > > > >
> > > > > This does mean we won't log anything when we do change the values of
> > > > > VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
> > > > > adding some code to do that in VacuumUpdateCosts() (only when the value
> > > > > has changed not on every call to VacuumUpdateCosts())? Or perhaps we
> > > > > could add it in the config reload branch that is already in
> > > > > vacuum_delay_point()?
> > > >
> > > > Previously, we used to show the pid in the log since a worker/launcher
> > > > set other workers' delay costs. But now that the worker sets its delay
> > > > costs, we don't need to show the pid in the log. Also, I think it's
> > > > useful for debugging and investigating the system if we log it when
> > > > changing the values. The log I imagined to add was like:
> > > >
> > > > @@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
> > > >             VacuumCostDelay = vacuum_cost_delay;
> > > >
> > > >         AutoVacuumUpdateLimit();
> > > > +
> > > > +       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
> > > > +            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
> > > > +            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
> > > > +            VacuumCostLimit, VacuumCostDelay,
> > > > +            VacuumCostDelay > 0 ? "yes" : "no",
> > > > +            VacuumFailsafeActive ? "yes" : "no");
> > > >     }
> > > >     else
> > > >     {
> > >
> > > Makes sense. I've updated the log message to roughly what you suggested.
> > > I also realized I think it does make sense to call it in
> > > VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
> > > this. I haven't taken the lock though and can't decide if I must since
> > > they access dboid and tableoid -- those are not going to change at this
> > > point, but I still don't know if I can access them lock-free...
> > > Perhaps there is a way to condition it on the log level?
> > >
> > > If I have to take a lock, then I don't know if we should put these in
> > > VacuumUpdateCosts()...
> >
> > I think we don't need to acquire a lock there as both values are
> > updated only by workers reporting this message.
>
> I dunno. I just don't feel that comfortable saying, oh it's okay to
> access these without a lock probably. I propose we do one of the
> following:
>
> - Take a shared lock inside VacuumUpdateCosts() (it is not called on every
>     call to vacuum_delay_point()) before reading from these variables.
>
>     Pros:
>         - We will log whenever there is a change to these parameters
>     Cons:
>         - This adds overhead in the common case when log level is < DEBUG2.
>             Is there a way to check the log level before taking the lock?
>         - Acquiring the lock inside the function is inconsistent with the
>             pattern that some of the other autovacuum functions requiring
>             locks use (they assume you have a lock if needed inside of the
>             function). But, we could assert that the lock is not already held.
>         - If we later decide we don't like this choice and want to move the
>             logging elsewhere, it will necessarily log less frequently which
>             seems like a harder change to make than logging more frequently.
>
> - Move this logging into the loop through relations in do_autovacuum()
>     and the config reload code and take the shared lock before doing the
>     logging.
>
>     Pros:
>         - Seems safe and not expensive
>         - Covers most of the times we would want the logging
>     Cons:
>         - duplicates logging in two places

Okay, in an attempt to wrap up this saga, I have made the following
change:

Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
limit or cost delay have been changed. If they have, they assert that
they don't already hold the AutovacuumLock, take it in shared mode, and
do the logging.

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:

> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> limit or cost delay have been changed. If they have, they assert that
> they don't already hold the AutovacuumLock, take it in shared mode, and
> do the logging.

Another idea would be to copy the values to local temp variables while holding
the lock, and release the lock before calling elog() to avoid holding the lock
over potential IO.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> > Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> > limit or cost delay have been changed. If they have, they assert that
> > they don't already hold the AutovacuumLock, take it in shared mode, and
> > do the logging.
>
> Another idea would be to copy the values to local temp variables while holding
> the lock, and release the lock before calling elog() to avoid holding the lock
> over potential IO.

Good idea. I've done this in attached v19.
Also, I looked through the docs and everything still looks correct for the
balancing algorithm.
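
Roughly, the tail of VacuumUpdateCosts() then takes the following shape for
autovacuum workers -- a sketch combining the change-detection check, the
assert, and the copy-then-release idea above, not the verbatim committed hunk:

	/* Only log updates to cost-related variables */
	if (vacuum_cost_delay == original_cost_delay &&
		vacuum_cost_limit == original_cost_limit)
		return;

	Assert(!LWLockHeldByMe(AutovacuumLock));
	LWLockAcquire(AutovacuumLock, LW_SHARED);
	dboid = MyWorkerInfo->wi_dboid;
	tableoid = MyWorkerInfo->wi_tableoid;
	LWLockRelease(AutovacuumLock);

	elog(DEBUG2,
		 "updating vacuum cost parameters (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g)",
		 dboid, tableoid,
		 pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
		 vacuum_cost_limit, vacuum_cost_delay);

Here dboid and tableoid are local Oid variables, so the lock is not held
across the elog() call.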

- Melanie

Attachment

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:
>>
>>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
>>> limit or cost delay have been changed. If they have, they assert that
>>> they don't already hold the AutovacuumLock, take it in shared mode, and
>>> do the logging.
>>
>> Another idea would be to copy the values to local temp variables while holding
>> the lock, and release the lock before calling elog() to avoid holding the lock
>> over potential IO.
>
> Good idea. I've done this in attached v19.
> Also I looked through the docs and everything still looks correct for
> balancing algo.

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace.  Thanks for the quick
turnaround times on reviews in this thread!

I opted for keeping the three individual commits; squashing them didn't seem
helpful enough to future commit log readers, and no other combination of the
three made more sense than what has been in the thread.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:
> >
> > On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >>
> >>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:
> >>
> >>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> >>> limit or cost delay have been changed. If they have, they assert that
> >>> they don't already hold the AutovacuumLock, take it in shared mode, and
> >>> do the logging.
> >>
> >> Another idea would be to copy the values to local temp variables while holding
> >> the lock, and release the lock before calling elog() to avoid holding the lock
> >> over potential IO.
> >
> > Good idea. I've done this in attached v19.
> > Also I looked through the docs and everything still looks correct for
> > balancing algo.
>
> I had another read-through and test-through of this version, and have applied
> it with some minor changes to comments and whitespace.  Thanks for the quick
> turnaround times on reviews in this thread!

Cool!

Regarding the commit 7d71d3dd08, I have one comment:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)
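
For clarity, the proposed guard would wrap the whole logging block, lock
acquisition included, in a cheap elevel check -- a sketch of the idea, not the
attached patch verbatim:

	if (MyWorkerInfo && message_level_is_interesting(DEBUG2))
	{
		Oid		dboid;
		Oid		tableoid;

		LWLockAcquire(AutovacuumLock, LW_SHARED);
		dboid = MyWorkerInfo->wi_dboid;
		tableoid = MyWorkerInfo->wi_tableoid;
		LWLockRelease(AutovacuumLock);

		elog(DEBUG2, "vacuum cost parameters (db=%u, rel=%u, cost_limit=%d, cost_delay=%g)",
			 dboid, tableoid, vacuum_cost_limit, vacuum_cost_delay);
	}

That way the LWLock is only touched when a DEBUG2 message would actually be
emitted.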

Also, while testing the autovacuum delay with the reloption
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 for a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

        /*
         * If any of the cost delay parameters has been set individually for
         * this table, disable the balancing algorithm.
         */
        tab->at_dobalance =
            !(avopts && (avopts->vacuum_cost_limit > 0 ||
                         avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 to the balancing
algorithm. Probably this behavior has existed also on back branches
but I haven't checked it yet.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 7 Apr 2023, at 08:52, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

>> I had another read-through and test-through of this version, and have applied
>> it with some minor changes to comments and whitespace.  Thanks for the quick
>> turnaround times on reviews in this thread!
>
> Cool!
>
> Regarding the commit 7d71d3dd08, I have one comment:
>
> +       /* Only log updates to cost-related variables */
> +       if (vacuum_cost_delay == original_cost_delay &&
> +           vacuum_cost_limit == original_cost_limit)
> +           return;
>
> IIUC by default, we log not only before starting the vacuum but also
> when changing cost-related variables. Which is good, I think, because
> logging the initial values would also be helpful for investigation.
> However, I think that we don't log the initial vacuum cost values
> depending on the values. For example, if the
> autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
> the initial values. I think that instead of comparing old and new
> values, we can write the log only if
> message_level_is_interesting(DEBUG2) is true. That way, we don't need
> to acquire the lwlock unnecessarily. And the code looks cleaner to me.
> I've attached the patch (use_message_level_is_interesting.patch)

That's a good idea, unless Melanie has conflicting opinions I think we should
go ahead with this.  Avoiding taking a lock here is a good save.

> Also, while testing the autovacuum delay with relopt
> autovacuum_vacuum_cost_delay = 0, I realized that even if we set
> autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
> true. wi_dobalance comes from the following expression:
>
>        /*
>         * If any of the cost delay parameters has been set individually for
>         * this table, disable the balancing algorithm.
>         */
>        tab->at_dobalance =
>            !(avopts && (avopts->vacuum_cost_limit > 0 ||
>                         avopts->vacuum_cost_delay > 0));
>
> The initial values of both avopts->vacuum_cost_limit and
> avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
> of "> 0". Otherwise, we include the autovacuum worker working on a
> table whose autovacuum_vacuum_cost_delay is 0 to the balancing
> algorithm. Probably this behavior has existed also on back branches
> but I haven't checked it yet.

Interesting, good find.  Looking quickly at the back branches, I think there is
a variant of this for vacuum_cost_limit even there, but it needs more investigation.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > > On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > >
> > > On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > >>
> > >>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > >>
> > >>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> > >>> limit or cost delay have been changed. If they have, they assert that
> > >>> they don't already hold the AutovacuumLock, take it in shared mode, and
> > >>> do the logging.
> > >>
> > >> Another idea would be to copy the values to local temp variables while holding
> > >> the lock, and release the lock before calling elog() to avoid holding the lock
> > >> over potential IO.
> > >
> > > Good idea. I've done this in attached v19.
> > > Also I looked through the docs and everything still looks correct for
> > > balancing algo.
> >
> > I had another read-through and test-through of this version, and have applied
> > it with some minor changes to comments and whitespace.  Thanks for the quick
> > turnaround times on reviews in this thread!
>
> Cool!
>
> Regarding the commit 7d71d3dd08, I have one comment:
>
> +       /* Only log updates to cost-related variables */
> +       if (vacuum_cost_delay == original_cost_delay &&
> +           vacuum_cost_limit == original_cost_limit)
> +           return;
>
> IIUC by default, we log not only before starting the vacuum but also
> when changing cost-related variables. Which is good, I think, because
> logging the initial values would also be helpful for investigation.
> However, I think that we don't log the initial vacuum cost values
> depending on the values. For example, if the
> autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
> the initial values. I think that instead of comparing old and new
> values, we can write the log only if
> message_level_is_interesting(DEBUG2) is true. That way, we don't need
> to acquire the lwlock unnecessarily. And the code looks cleaner to me.
> I've attached the patch (use_message_level_is_interesting.patch)

Thanks for coming up with the case of the storage param setting
cost delay = 0. In that case we wouldn't print the message initially, and
we should fix that.

I disagree, however, that we should condition it only on
message_level_is_interesting().

Actually, outside of printing initial values when the autovacuum worker
first starts (before vacuuming all tables), I don't think we should log
these values except when they are being updated. Autovacuum workers
could vacuum tons of small tables and having this print out at least
once per table (which I know is how it is on master) would be
distracting. Also, you could be reloading the config to update some
other GUCs and be oblivious to an ongoing autovacuum and get these
messages printed out, which I would also find distracting.

You will have to stare very hard at the logs to tell if your changes to
vacuum cost delay and limit took effect when you reload config. I think
with our changes to update the values more often, we should take the
opportunity to make this logging more useful by making it happen only
when the values are changed.

I would be open to elevating the log level to DEBUG1 for logging only
updates and, perhaps, having an option if you set log level to DEBUG2,
for example, to always log these values in VacuumUpdateCosts().

I'd even argue that, potentially, having the cost-delay related
parameters printed at the beginning of vacuuming could be interesting to
regular VACUUM as well (even though it doesn't benefit from config
reload while in progress).

To fix the issue you mentioned and ensure the logging is printed when
autovacuum workers start up before vacuuming tables, we could either
initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
that will always be different than what they are set to in
VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
these values since they are set to the defaults for VACUUM). Or, we
could duplicate this logging message in do_autovacuum().

Finally, one other point about message_level_is_interesting(). I liked
the idea of using it a lot, since log level DEBUG2 will not be the
common case. I thought of it but hesitated because all other users of
message_level_is_interesting() are avoiding some memory allocation or
string copying -- not avoiding taking a lock. Making this conditioned on
log level made me a bit uncomfortable. I can't think of a situation when
it would be a problem, but it felt a bit off.

> Also, while testing the autovacuum delay with relopt
> autovacuum_vacuum_cost_delay = 0, I realized that even if we set
> autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
> true. wi_dobalance comes from the following expression:
>
>         /*
>          * If any of the cost delay parameters has been set individually for
>          * this table, disable the balancing algorithm.
>          */
>         tab->at_dobalance =
>             !(avopts && (avopts->vacuum_cost_limit > 0 ||
>                          avopts->vacuum_cost_delay > 0));
>
> The initial values of both avopts->vacuum_cost_limit and
> avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
> of "> 0". Otherwise, we include the autovacuum worker working on a
> table whose autovacuum_vacuum_cost_delay is 0 to the balancing
> algorithm. Probably this behavior has existed also on back branches
> but I haven't checked it yet.

Thank you for catching this. Indeed this exists in master since
1021bd6a89b which was backpatched. I checked and it is true all the way
back through REL_11_STABLE.

Definitely seems worth fixing as it kind of defeats the purpose of the
original commit. I wish I had noticed before!

Your fix has:
            !(avopts && (avopts->vacuum_cost_limit >= 0 ||
                        avopts->vacuum_cost_delay >= 0));

And though delay is required to be >= 0
                        avopts->vacuum_cost_delay >= 0

Limit does not. It can just be > 0.

postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
ERROR:  value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
DETAIL:  Valid values are between "1" and "10000".

Though >= is also fine, the rest of the code in all versions always
checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
and not for limit. Probably best we keep it consistent (though the whole
thing is quite confusing).
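
In other words, the corrected expression would look something like the sketch
below, relaxing the delay check to >= 0 while keeping the limit check at > 0
(an approximation of the fix under discussion, not the final hunk):

	/*
	 * If any of the cost delay parameters has been set individually for
	 * this table, disable the balancing algorithm.  0 is a valid setting
	 * for autovacuum_vacuum_cost_delay but not for
	 * autovacuum_vacuum_cost_limit, hence >= 0 vs > 0.
	 */
	tab->at_dobalance =
		!(avopts && (avopts->vacuum_cost_limit > 0 ||
					 avopts->vacuum_cost_delay >= 0));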

- Melanie



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman@gmail.com> wrote:
> On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>> +       /* Only log updates to cost-related variables */
>> +       if (vacuum_cost_delay == original_cost_delay &&
>> +           vacuum_cost_limit == original_cost_limit)
>> +           return;
>>
>> IIUC by default, we log not only before starting the vacuum but also
>> when changing cost-related variables. Which is good, I think, because
>> logging the initial values would also be helpful for investigation.
>> However, I think that we don't log the initial vacuum cost values
>> depending on the values. For example, if the
>> autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
>> the initial values. I think that instead of comparing old and new
>> values, we can write the log only if
>> message_level_is_interesting(DEBUG2) is true. That way, we don't need
>> to acquire the lwlock unnecessarily. And the code looks cleaner to me.
>> I've attached the patch (use_message_level_is_interesting.patch)
>
> Thanks for coming up with the case you thought of with storage param for
> cost delay = 0. In that case we wouldn't print the message initially and
> we should fix that.
>
> I disagree, however, that we should condition it only on
> message_level_is_interesting().

I think we should keep the logging frequency as committed, but condition taking
the lock on message_level_is_interesting().

> Actually, outside of printing initial values when the autovacuum worker
> first starts (before vacuuming all tables), I don't think we should log
> these values except when they are being updated. Autovacuum workers
> could vacuum tons of small tables and having this print out at least
> once per table (which I know is how it is on master) would be
> distracting. Also, you could be reloading the config to update some
> other GUCs and be oblivious to an ongoing autovacuum and get these
> messages printed out, which I would also find distracting.
>
> You will have to stare very hard at the logs to tell if your changes to
> vacuum cost delay and limit took effect when you reload config. I think
> with our changes to update the values more often, we should take the
> opportunity to make this logging more useful by making it happen only
> when the values are changed.
>
> I would be open to elevating the log level to DEBUG1 for logging only
> updates and, perhaps, having an option if you set log level to DEBUG2,
> for example, to always log these values in VacuumUpdateCosts().
>
> I'd even argue that, potentially, having the cost-delay related
> parameters printed at the beginning of vacuuming could be interesting to
> regular VACUUM as well (even though it doesn't benefit from config
> reload while in progress).
>
> To fix the issue you mentioned and ensure the logging is printed when
> autovacuum workers start up before vacuuming tables, we could either
> initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
> that will always be different than what they are set to in
> VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
> these values since they are set to the defaults for VACUUM). Or, we
> could duplicate this logging message in do_autovacuum().

Duplicating logging, maybe with a slightly tailored message, seems the least
bad option.

> Finally, one other point about message_level_is_interesting(). I liked
> the idea of using it a lot, since log level DEBUG2 will not be the
> common case. I thought of it but hesitated because all other users of
> message_level_is_interesting() are avoiding some memory allocation or
> string copying -- not avoiding take a lock. Making this conditioned on
> log level made me a bit uncomfortable. I can't think of a situation when
> it would be a problem, but it felt a bit off.

Considering how uncommon DEBUG2 will be in production, I think conditioning
taking a lock on it makes sense.

>> Also, while testing the autovacuum delay with relopt
>> autovacuum_vacuum_cost_delay = 0, I realized that even if we set
>> autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
>> true. wi_dobalance comes from the following expression:
>>
>>        /*
>>         * If any of the cost delay parameters has been set individually for
>>         * this table, disable the balancing algorithm.
>>         */
>>        tab->at_dobalance =
>>            !(avopts && (avopts->vacuum_cost_limit > 0 ||
>>                         avopts->vacuum_cost_delay > 0));
>>
>> The initial values of both avopts->vacuum_cost_limit and
>> avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
>> of "> 0". Otherwise, we include the autovacuum worker working on a
>> table whose autovacuum_vacuum_cost_delay is 0 to the balancing
>> algorithm. Probably this behavior has existed also on back branches
>> but I haven't checked it yet.
>
> Thank you for catching this. Indeed this exists in master since
> 1021bd6a89b which was backpatched. I checked and it is true all the way
> back through REL_11_STABLE.
>
> Definitely seems worth fixing as it kind of defeats the purpose of the
> original commit. I wish I had noticed before!
>
> Your fix has:
>            !(avopts && (avopts->vacuum_cost_limit >= 0 ||
>                        avopts->vacuum_cost_delay >= 0));
>
> And though delay is required to be >= 0
>                        avopts->vacuum_cost_delay >= 0
>
> Limit does not. It can just be > 0.
>
> postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
> ERROR:  value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
> DETAIL:  Valid values are between "1" and "10000".
>
> Though >= is also fine, the rest of the code in all versions always
> checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
> and not for limit. Probably best we keep it consistent (though the whole
> thing is quite confusing).

+1

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Fri, Apr 7, 2023 at 9:07 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> > >
> > > > On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > > >>
> > > >>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > > >>
> > > >>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> > > >>> limit or cost delay have been changed. If they have, they assert that
> > > >>> they don't already hold the AutovacuumLock, take it in shared mode, and
> > > >>> do the logging.
> > > >>
> > > >> Another idea would be to copy the values to local temp variables while holding
> > > >> the lock, and release the lock before calling elog() to avoid holding the lock
> > > >> over potential IO.
> > > >
> > > > Good idea. I've done this in attached v19.
> > > > Also I looked through the docs and everything still looks correct for
> > > > balancing algo.
> > >
> > > I had another read-through and test-through of this version, and have applied
> > > it with some minor changes to comments and whitespace.  Thanks for the quick
> > > turnaround times on reviews in this thread!
> >
> > Cool!
> >
> > Regarding the commit 7d71d3dd08, I have one comment:
> >
> > +       /* Only log updates to cost-related variables */
> > +       if (vacuum_cost_delay == original_cost_delay &&
> > +           vacuum_cost_limit == original_cost_limit)
> > +           return;
> >
> > IIUC by default, we log not only before starting the vacuum but also
> > when changing cost-related variables. Which is good, I think, because
> > logging the initial values would also be helpful for investigation.
> > However, I think that we don't log the initial vacuum cost values
> > depending on the values. For example, if the
> > autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
> > the initial values. I think that instead of comparing old and new
> > values, we can write the log only if
> > message_level_is_interesting(DEBUG2) is true. That way, we don't need
> > to acquire the lwlock unnecessarily. And the code looks cleaner to me.
> > I've attached the patch (use_message_level_is_interesting.patch)
>
> Thanks for coming up with the case you thought of with storage param for
> cost delay = 0. In that case we wouldn't print the message initially and
> we should fix that.
>
> I disagree, however, that we should condition it only on
> message_level_is_interesting().
>
> Actually, outside of printing initial values when the autovacuum worker
> first starts (before vacuuming all tables), I don't think we should log
> these values except when they are being updated. Autovacuum workers
> could vacuum tons of small tables and having this print out at least
> once per table (which I know is how it is on master) would be
> distracting. Also, you could be reloading the config to update some
> other GUCs and be oblivious to an ongoing autovacuum and get these
> messages printed out, which I would also find distracting.
>
> You will have to stare very hard at the logs to tell if your changes to
> vacuum cost delay and limit took effect when you reload config. I think
> with our changes to update the values more often, we should take the
> opportunity to make this logging more useful by making it happen only
> when the values are changed.
>
> I would be open to elevating the log level to DEBUG1 for logging only
> updates and, perhaps, having an option if you set log level to DEBUG2,
> for example, to always log these values in VacuumUpdateCosts().
>
> I'd even argue that, potentially, having the cost-delay related
> parameters printed at the beginning of vacuuming could be interesting to
> regular VACUUM as well (even though it doesn't benefit from config
> reload while in progress).
>
> To fix the issue you mentioned and ensure the logging is printed when
> autovacuum workers start up before vacuuming tables, we could either
> initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
> that will always be different than what they are set to in
> VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
> these values since they are set to the defaults for VACUUM). Or, we
> could duplicate this logging message in do_autovacuum().
>
> Finally, one other point about message_level_is_interesting(). I liked
> the idea of using it a lot, since log level DEBUG2 will not be the
> common case. I thought of it but hesitated because all other users of
> message_level_is_interesting() are avoiding some memory allocation or
> string copying -- not avoiding take a lock. Making this conditioned on
> log level made me a bit uncomfortable. I can't think of a situation when
> it would be a problem, but it felt a bit off.
>
> > Also, while testing the autovacuum delay with relopt
> > autovacuum_vacuum_cost_delay = 0, I realized that even if we set
> > autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
> > true. wi_dobalance comes from the following expression:
> >
> >         /*
> >          * If any of the cost delay parameters has been set individually for
> >          * this table, disable the balancing algorithm.
> >          */
> >         tab->at_dobalance =
> >             !(avopts && (avopts->vacuum_cost_limit > 0 ||
> >                          avopts->vacuum_cost_delay > 0));
> >
> > The initial values of both avopts->vacuum_cost_limit and
> > avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
> > of "> 0". Otherwise, we include the autovacuum worker working on a
> > table whose autovacuum_vacuum_cost_delay is 0 to the balancing
> > algorithm. Probably this behavior has existed also on back branches
> > but I haven't checked it yet.
>
> Thank you for catching this. Indeed this exists in master since
> 1021bd6a89b which was backpatched. I checked and it is true all the way
> back through REL_11_STABLE.
>
> Definitely seems worth fixing as it kind of defeats the purpose of the
> original commit. I wish I had noticed before!
>
> Your fix has:
>             !(avopts && (avopts->vacuum_cost_limit >= 0 ||
>                         avopts->vacuum_cost_delay >= 0));
>
> And though delay is required to be >= 0
>                         avopts->vacuum_cost_delay >= 0
>
> Limit does not. It can just be > 0.
>
> postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
> ERROR:  value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
> DETAIL:  Valid values are between "1" and "10000".
>
> Though >= is also fine, the rest of the code in all versions always
> checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
> and not for limit. Probably best we keep it consistent (though the whole
> thing is quite confusing).

I have created an open item for each of these issues on the wiki
(one for 16 and one under the section "affects stable branches").

- Melanie



Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Fri, Apr 7, 2023 at 10:23 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >> +       /* Only log updates to cost-related variables */
> >> +       if (vacuum_cost_delay == original_cost_delay &&
> >> +           vacuum_cost_limit == original_cost_limit)
> >> +           return;
> >>
> >> IIUC by default, we log not only before starting the vacuum but also
> >> when changing cost-related variables. Which is good, I think, because
> >> logging the initial values would also be helpful for investigation.
> >> However, I think that we don't log the initial vacuum cost values
> >> depending on the values. For example, if the
> >> autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
> >> the initial values. I think that instead of comparing old and new
> >> values, we can write the log only if
> >> message_level_is_interesting(DEBUG2) is true. That way, we don't need
> >> to acquire the lwlock unnecessarily. And the code looks cleaner to me.
> >> I've attached the patch (use_message_level_is_interesting.patch)
> >
> > Thanks for coming up with the case you thought of with storage param for
> > cost delay = 0. In that case we wouldn't print the message initially and
> > we should fix that.
> >
> > I disagree, however, that we should condition it only on
> > message_level_is_interesting().
>
> I think we should keep the logging frequency as committed, but condition taking
> the lock on message_level_is_interesting().
>
> > Actually, outside of printing initial values when the autovacuum worker
> > first starts (before vacuuming all tables), I don't think we should log
> > these values except when they are being updated. Autovacuum workers
> > could vacuum tons of small tables and having this print out at least
> > once per table (which I know is how it is on master) would be
> > distracting. Also, you could be reloading the config to update some
> > other GUCs and be oblivious to an ongoing autovacuum and get these
> > messages printed out, which I would also find distracting.
> >
> > You will have to stare very hard at the logs to tell if your changes to
> > vacuum cost delay and limit took effect when you reload config. I think
> > with our changes to update the values more often, we should take the
> > opportunity to make this logging more useful by making it happen only
> > when the values are changed.
> >

For debugging purposes, I think it could also be important information
that the cost values have not changed. Personally, I prefer to log the
current state rather than deciding for ourselves which events are
important. If always logging these values at DEBUG2 is too distracting,
we might want to lower it to DEBUG3.

> > I would be open to elevating the log level to DEBUG1 for logging only
> > updates and, perhaps, having an option if you set log level to DEBUG2,
> > for example, to always log these values in VacuumUpdateCosts().

I'm not really sure it's a good idea to change the log messages and
events depending on elevel. Do you know if we have any precedents?

> >
> > I'd even argue that, potentially, having the cost-delay related
> > parameters printed at the beginning of vacuuming could be interesting to
> > regular VACUUM as well (even though it doesn't benefit from config
> > reload while in progress).
> >
> > To fix the issue you mentioned and ensure the logging is printed when
> > autovacuum workers start up before vacuuming tables, we could either
> > initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
> > that will always be different than what they are set to in
> > VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
> > these values since they are set to the defaults for VACUUM). Or, we
> > could duplicate this logging message in do_autovacuum().
>
> Duplicating logging, maybe with a slightly tailored message, seem the least
> bad option.
>
> > Finally, one other point about message_level_is_interesting(). I liked
> > the idea of using it a lot, since log level DEBUG2 will not be the
> > common case. I thought of it but hesitated because all other users of
> > message_level_is_interesting() are avoiding some memory allocation or
> > string copying -- not avoiding take a lock. Making this conditioned on
> > log level made me a bit uncomfortable. I can't think of a situation when
> > it would be a problem, but it felt a bit off.
>
> Considering how uncommon DEBUG2 will be in production, I think conditioning
> taking a lock on it makes sense.

The comment of message_level_is_interesting() says:

 * This is useful to short-circuit any expensive preparatory work that
 * might be needed for a logging message.

Which can apply to taking a lwlock, I think.

>
> >> Also, while testing the autovacuum delay with relopt
> >> autovacuum_vacuum_cost_delay = 0, I realized that even if we set
> >> autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
> >> true. wi_dobalance comes from the following expression:
> >>
> >>        /*
> >>         * If any of the cost delay parameters has been set individually for
> >>         * this table, disable the balancing algorithm.
> >>         */
> >>        tab->at_dobalance =
> >>            !(avopts && (avopts->vacuum_cost_limit > 0 ||
> >>                         avopts->vacuum_cost_delay > 0));
> >>
> >> The initial values of both avopts->vacuum_cost_limit and
> >> avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
> >> of "> 0". Otherwise, we include the autovacuum worker working on a
> >> table whose autovacuum_vacuum_cost_delay is 0 to the balancing
> >> algorithm. Probably this behavior has existed also on back branches
> >> but I haven't checked it yet.
> >
> > Thank you for catching this. Indeed this exists in master since
> > 1021bd6a89b which was backpatched. I checked and it is true all the way
> > back through REL_11_STABLE.

Thanks for checking!

> >
> > Definitely seems worth fixing as it kind of defeats the purpose of the
> > original commit. I wish I had noticed before!
> >
> > Your fix has:
> >            !(avopts && (avopts->vacuum_cost_limit >= 0 ||
> >                        avopts->vacuum_cost_delay >= 0));
> >
> > And though delay is required to be >= 0
> >                        avopts->vacuum_cost_delay >= 0
> >
> > Limit does not. It can just be > 0.
> >
> > postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
> > ERROR:  value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
> > DETAIL:  Valid values are between "1" and "10000".
> >
> > Though >= is also fine, the rest of the code in all versions always
> > checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
> > and not for limit. Probably best we keep it consistent (though the whole
> > thing is quite confusing).
>
> +1

+1. I misunderstood the initial value of autovacuum_vacuum_cost_limit reloption.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 11 Apr 2023, at 17:05, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> The comment of message_level_is_interesting() says:
>
> * This is useful to short-circuit any expensive preparatory work that
> * might be needed for a logging message.
>
> Which can apply to taking a lwlock, I think.

I agree that we can, and should, use message_level_is_interesting to skip
taking this lock.  Also, the more I think about it, the more I'm convinced that we
should not change the current logging frequency of once per table from what we
ship today.  At DEBUG2 the logs should tell the whole story without requiring
extrapolation based on missing entries.  So I think we should use your patch to
solve this open item.  If there is interest in reducing the logging frequency,
we should discuss that in its own thread, instead of it being hidden in here.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Wed, Apr 12, 2023 at 12:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 7, 2023 at 10:23 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > > On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > > On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Definitely seems worth fixing as it kind of defeats the purpose of the
> > > original commit. I wish I had noticed before!
> > >
> > > Your fix has:
> > >            !(avopts && (avopts->vacuum_cost_limit >= 0 ||
> > >                        avopts->vacuum_cost_delay >= 0));
> > >
> > > And though delay is required to be >= 0
> > >                        avopts->vacuum_cost_delay >= 0
> > >
> > > Limit does not. It can just be > 0.
> > >
> > > postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
> > > ERROR:  value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
> > > DETAIL:  Valid values are between "1" and "10000".
> > >
> > > Though >= is also fine, the rest of the code in all versions always
> > > checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
> > > and not for limit. Probably best we keep it consistent (though the whole
> > > thing is quite confusing).
> >
> > +1
>
> +1. I misunderstood the initial value of autovacuum_vacuum_cost_limit reloption.

I've attached an updated patch for fixing at_dobalance condition.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> I've attached an updated patch for fixing at_dobalance condition.

I revisited this and pushed it to all supported branches after another round of
testing and reading.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > I've attached an updated patch for fixing at_dobalance condition.
>
> I revisited this and pushed it to all supported branches after another round of
> testing and reading.

Thanks!

Can we mark the open item "Can't disable autovacuum cost delay through
storage parameter" as resolved?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 25 Apr 2023, at 15:31, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>> I've attached an updated patch for fixing at_dobalance condition.
>>
>> I revisited this and pushed it to all supported branches after another round of
>> testing and reading.
>
> Thanks!
>
> Can we mark the open item "Can't disable autovacuum cost delay through
> storage parameter" as resolved?

Yes, I've gone ahead and done that now.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Tue, Apr 25, 2023 at 10:35 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 25 Apr 2023, at 15:31, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> >>
> >>> On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >>
> >>> I've attached an updated patch for fixing at_dobalance condition.
> >>
> >> I revisited this and pushed it to all supported branches after another round of
> >> testing and reading.
> >
> > Thanks!
> >
> > Can we mark the open item "Can't disable autovacuum cost delay through
> > storage parameter" as resolved?
>
> Yes, I've gone ahead and done that now.

Great, thank you!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Should vacuum process config file reload more often

From
John Naylor
Date:

On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> I had another read-through and test-through of this version, and have applied
> it with some minor changes to comments and whitespace.  Thanks for the quick
> turnaround times on reviews in this thread!

-       VacuumFailsafeActive = false;
+       Assert(!VacuumFailsafeActive);

I can trigger this assert added in commit 7d71d3dd08.

First build with the patch in [1], then:

session 1:

CREATE EXTENSION xid_wraparound ;

CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);

-- I can trigger without this, but just make sure it doesn't get vacuumed
BEGIN;
DELETE FROM autovacuum_disabled WHERE id % 2 = 0;

session 2:

-- get to failsafe limit
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;

VACUUM autovacuum_disabled;

WARNING:  cutoff for removing and freezing tuples is far in the past
HINT:  Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
DETAIL:  The table's relfrozenxid or relminmxid is too far in the past.
HINT:  Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
server closed the connection unexpectedly

#0  0x00007ff31f68ebec in __pthread_kill_implementation ()
   from /lib64/libc.so.6
#1  0x00007ff31f63e956 in raise () from /lib64/libc.so.6
#2  0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
#3  0x0000000000978032 in ExceptionalCondition (
    conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
    fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
#4  0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
    params=<optimized out>, bstrategy=<optimized out>)
    at ../src/backend/access/heap/vacuumlazy.c:392
#5  0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
    params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
    at ../src/include/access/tableam.h:1705
#6  vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
    params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
    bstrategy=bstrategy@entry=0x14ddca8)
    at ../src/backend/commands/vacuum.c:2202
#7  0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
    params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
    bstrategy=bstrategy@entry=0x14ddca8)
    at ../src/backend/commands/vacuum.c:2236
#8  0x000000000069c594 in vacuum (relations=0x14dde38,
    params=0x7ffec2858850, bstrategy=0x14ddca8, vac_context=0x14ddb90,
    isTopLevel=<optimized out>) at ../src/backend/commands/vacuum.c:623

[1] https://www.postgresql.org/message-id/CAD21AoAyYBZOiB1UPCPZJHTLk0-arrq5zqNGj%2BPrsbpdUy%3Dg-g%40mail.gmail.com

--
John Naylor
EDB: http://www.enterprisedb.com

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 27 Apr 2023, at 11:29, John Naylor <john.naylor@enterprisedb.com> wrote:
> On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

> > I had another read-through and test-through of this version, and have applied
> > it with some minor changes to comments and whitespace.  Thanks for the quick
> > turnaround times on reviews in this thread!
>
> -       VacuumFailsafeActive = false;
> +       Assert(!VacuumFailsafeActive);
>
> I can trigger this assert added in commit 7d71d3dd08.
>
> First build with the patch in [1], then:

Interesting, thanks for the report! I'll look into it directly.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Masahiko Sawada
Date:
On Thu, Apr 27, 2023 at 6:30 PM John Naylor
<john.naylor@enterprisedb.com> wrote:
>
>
> On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> >
> > I had another read-through and test-through of this version, and have applied
> > it with some minor changes to comments and whitespace.  Thanks for the quick
> > turnaround times on reviews in this thread!
>
> -       VacuumFailsafeActive = false;
> +       Assert(!VacuumFailsafeActive);
>
> I can trigger this assert added in commit 7d71d3dd08.
>
> First build with the patch in [1], then:
>
> session 1:
>
> CREATE EXTENSION xid_wraparound ;
>
> CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
> INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
>
> -- I can trigger without this, but just make sure it doesn't get vacuumed
> BEGIN;
> DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
>
> session 2:
>
> -- get to failsafe limit
> SELECT consume_xids(1*1000*1000*1000);
> INSERT INTO autovacuum_disabled(data) SELECT 1;
> SELECT consume_xids(1*1000*1000*1000);
> INSERT INTO autovacuum_disabled(data) SELECT 1;
>
> VACUUM autovacuum_disabled;
>
> WARNING:  cutoff for removing and freezing tuples is far in the past
> HINT:  Close open transactions soon to avoid wraparound problems.
> You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
> WARNING:  bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
> DETAIL:  The table's relfrozenxid or relminmxid is too far in the past.
> HINT:  Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
> You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
> server closed the connection unexpectedly
>
> #0  0x00007ff31f68ebec in __pthread_kill_implementation ()
>    from /lib64/libc.so.6
> #1  0x00007ff31f63e956 in raise () from /lib64/libc.so.6
> #2  0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
> #3  0x0000000000978032 in ExceptionalCondition (
>     conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
>     fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
> #4  0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
>     params=<optimized out>, bstrategy=<optimized out>)
>     at ../src/backend/access/heap/vacuumlazy.c:392
> #5  0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
>     params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
>     at ../src/include/access/tableam.h:1705
> #6  vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
>     params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
>     bstrategy=bstrategy@entry=0x14ddca8)
>     at ../src/backend/commands/vacuum.c:2202
> #7  0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
>     params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
>     bstrategy=bstrategy@entry=0x14ddca8)
>     at ../src/backend/commands/vacuum.c:2236

Good catch. I think the problem is that vacuum_rel() is called
recursively and we don't reset VacuumFailsafeActive before vacuuming
the toast table. I think we should reset it in heap_vacuum_rel()
instead of Assert(). It's possible that we trigger the failsafe mode
for only one of the two. Please find the attached patch.
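
For reference, a rough sketch of the direction (the attached patch is the
actual proposal): turn the Assert() added in 7d71d3dd08 back into an
unconditional reset at the top of heap_vacuum_rel(), along the lines of

-       Assert(!VacuumFailsafeActive);
+       VacuumFailsafeActive = false;

so that each call to heap_vacuum_rel() starts with the flag cleared, even
when vacuum_rel() recurses into the TOAST table after the failsafe has
already been triggered on the main relation.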

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 27 Apr 2023, at 14:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 27, 2023 at 6:30 PM John Naylor
> <john.naylor@enterprisedb.com> wrote:
>>
>>
>> On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>>>
>>> I had another read-through and test-through of this version, and have applied
>>> it with some minor changes to comments and whitespace.  Thanks for the quick
>>> turnaround times on reviews in this thread!
>>
>> -       VacuumFailsafeActive = false;
>> +       Assert(!VacuumFailsafeActive);
>>
>> I can trigger this assert added in commit 7d71d3dd08.
>>
>> First build with the patch in [1], then:
>>
>> session 1:
>>
>> CREATE EXTENSION xid_wraparound ;
>>
>> CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
>> INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
>>
>> -- I can trigger without this, but just make sure it doesn't get vacuumed
>> BEGIN;
>> DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
>>
>> session 2:
>>
>> -- get to failsafe limit
>> SELECT consume_xids(1*1000*1000*1000);
>> INSERT INTO autovacuum_disabled(data) SELECT 1;
>> SELECT consume_xids(1*1000*1000*1000);
>> INSERT INTO autovacuum_disabled(data) SELECT 1;
>>
>> VACUUM autovacuum_disabled;
>>
>> WARNING:  cutoff for removing and freezing tuples is far in the past
>> HINT:  Close open transactions soon to avoid wraparound problems.
>> You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
> WARNING:  bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
>> DETAIL:  The table's relfrozenxid or relminmxid is too far in the past.
>> HINT:  Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
>> You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
>> server closed the connection unexpectedly
>>
>> #0  0x00007ff31f68ebec in __pthread_kill_implementation ()
>>   from /lib64/libc.so.6
>> #1  0x00007ff31f63e956 in raise () from /lib64/libc.so.6
>> #2  0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
>> #3  0x0000000000978032 in ExceptionalCondition (
>>    conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
>>    fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
>> #4  0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
>>    params=<optimized out>, bstrategy=<optimized out>)
>>    at ../src/backend/access/heap/vacuumlazy.c:392
>> #5  0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
>>    params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
>>    at ../src/include/access/tableam.h:1705
>> #6  vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
>>    params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
>>    bstrategy=bstrategy@entry=0x14ddca8)
>>    at ../src/backend/commands/vacuum.c:2202
>> #7  0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
>>    params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
>>    bstrategy=bstrategy@entry=0x14ddca8)
>>    at ../src/backend/commands/vacuum.c:2236
>
> Good catch. I think the problem is that vacuum_rel() is called
> recursively and we don't reset VacuumFailsafeActive before vacuuming
> the toast table. I think we should reset it in heap_vacuum_rel()
> instead of Assert(). It's possible that we trigger the failsafe mode
> for only one of the two. Please find the attached patch.

Agreed, that matches my research and testing, I have the same diff here and it
passes testing and works as intended.  This was briefly discussed in [0] and
slightly upthread from there but then missed.  I will do some more looking and
testing but I'm fairly sure this is the right fix, so unless I find something
else I will go ahead with this.

xid_wraparound is a really nifty testing tool. Very cool.

--
Daniel Gustafsson

[0] CAAKRu_b1HjGCTsFpUnmwLNS8NeXJ+JnrDLhT1osP+Gq9HCU+Rw@mail.gmail.com


Re: Should vacuum process config file reload more often

From
Melanie Plageman
Date:
On Thu, Apr 27, 2023 at 8:55 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 27 Apr 2023, at 14:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 27, 2023 at 6:30 PM John Naylor
> > <john.naylor@enterprisedb.com> wrote:
> >>
> >>
> >> On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> >>>
> >>> I had another read-through and test-through of this version, and have applied
> >>> it with some minor changes to comments and whitespace.  Thanks for the quick
> >>> turnaround times on reviews in this thread!
> >>
> >> -       VacuumFailsafeActive = false;
> >> +       Assert(!VacuumFailsafeActive);
> >>
> >> I can trigger this assert added in commit 7d71d3dd08.
> >>
> >> First build with the patch in [1], then:
> >>
> >> session 1:
> >>
> >> CREATE EXTENSION xid_wraparound ;
> >>
> >> CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
> >> INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
> >>
> >> -- I can trigger without this, but just make sure it doesn't get vacuumed
> >> BEGIN;
> >> DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
> >>
> >> session 2:
> >>
> >> -- get to failsafe limit
> >> SELECT consume_xids(1*1000*1000*1000);
> >> INSERT INTO autovacuum_disabled(data) SELECT 1;
> >> SELECT consume_xids(1*1000*1000*1000);
> >> INSERT INTO autovacuum_disabled(data) SELECT 1;
> >>
> >> VACUUM autovacuum_disabled;
> >>
> >> WARNING:  cutoff for removing and freezing tuples is far in the past
> >> HINT:  Close open transactions soon to avoid wraparound problems.
> >> You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
> >> WARNING:  bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
> >> DETAIL:  The table's relfrozenxid or relminmxid is too far in the past.
> >> HINT:  Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
> >> You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
> >> server closed the connection unexpectedly
> >>
> >> #0  0x00007ff31f68ebec in __pthread_kill_implementation ()
> >>   from /lib64/libc.so.6
> >> #1  0x00007ff31f63e956 in raise () from /lib64/libc.so.6
> >> #2  0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
> >> #3  0x0000000000978032 in ExceptionalCondition (
> >>    conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
> >>    fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
> >> #4  0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
> >>    params=<optimized out>, bstrategy=<optimized out>)
> >>    at ../src/backend/access/heap/vacuumlazy.c:392
> >> #5  0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
> >>    params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
> >>    at ../src/include/access/tableam.h:1705
> >> #6  vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
> >>    params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
> >>    bstrategy=bstrategy@entry=0x14ddca8)
> >>    at ../src/backend/commands/vacuum.c:2202
> >> #7  0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
> >>    params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
> >>    bstrategy=bstrategy@entry=0x14ddca8)
> >>    at ../src/backend/commands/vacuum.c:2236
> >
> > Good catch. I think the problem is that vacuum_rel() is called
> > recursively and we don't reset VacuumFailsafeActive before vacuuming
> > the toast table. I think we should reset it in heap_vacuum_rel()
> > instead of Assert(). It's possible that we trigger the failsafe mode
> > for only one of the two. Please find the attached patch.
>
> Agreed, that matches my research and testing, I have the same diff here and it
> passes testing and works as intended.  This was briefly discussed in [0] and
> slightly upthread from there but then missed.  I will do some more looking and
> testing but I'm fairly sure this is the right fix, so unless I find something
> else I will go ahead with this.
>
> xid_wraparound is a really nifty testing tool. Very cool.

Makes sense to me too.

Fix LGTM.
Though we previously set it to false before this series of patches, perhaps
it is worth adding a comment about why VacuumFailsafeActive must be reset
here even though we reset it before vacuuming each table?
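
As a sketch of the kind of comment I have in mind (wording is just a
suggestion), something like the following above the reset in
heap_vacuum_rel():

    /*
     * Reset the failsafe flag unconditionally rather than asserting that
     * it is already clear: vacuum_rel() recurses into the relation's
     * TOAST table without going back through vacuum(), so a failsafe
     * triggered on the main relation would otherwise still be set when
     * we get here for the TOAST table.
     */
    VacuumFailsafeActive = false;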

- Melanie



Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 27 Apr 2023, at 16:53, Melanie Plageman <melanieplageman@gmail.com> wrote:
> On Thu, Apr 27, 2023 at 8:55 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 27 Apr 2023, at 14:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>>> Good catch. I think the problem is that vacuum_rel() is called
>>> recursively and we don't reset VacuumFailsafeActive before vacuuming
>>> the toast table. I think we should reset it in heap_vacuum_rel()
>>> instead of Assert(). It's possible that we trigger the failsafe mode
>>> for only one of the two. Please find the attached patch.
>>
>> Agreed, that matches my research and testing, I have the same diff here and it
>> passes testing and works as intended.  This was briefly discussed in [0] and
>> slightly upthread from there but then missed.  I will do some more looking and
>> testing but I'm fairly sure this is the right fix, so unless I find something
>> else I will go ahead with this.
>>
>> xid_wraparound is a really nifty testing tool. Very cool.
>
> Makes sense to me too.
>
> Fix LGTM.

Thanks for review. I plan to push this in the morning.

> Though we previously set it to false before this series of patches,
> perhaps it is
> worth adding a comment about why VacuumFailsafeActive must be reset here
> even though we reset it before vacuuming each table?

Agreed.

--
Daniel Gustafsson




Re: Should vacuum process config file reload more often

From
Daniel Gustafsson
Date:
> On 27 Apr 2023, at 23:25, Daniel Gustafsson <daniel@yesql.se> wrote:
>> On 27 Apr 2023, at 16:53, Melanie Plageman <melanieplageman@gmail.com> wrote:

>> Fix LGTM.
>
> Thanks for review. I plan to push this in the morning.

Done, thanks.

--
Daniel Gustafsson