Thread: POC: Parallel processing of indexes in autovacuum

POC: Parallel processing of indexes in autovacuum

From
Maxim Orlov
Date:
Hi!

The VACUUM command can be executed with the parallel option. As the documentation states, it will perform the index vacuum and index cleanup phases of VACUUM in parallel using the specified number of background workers. But this useful feature is not used for autovacuum. After a quick look at the source code, it became clear to me that when the parallel option was added, the corresponding option for autovacuum wasn't implemented, although there are no clear obstacles to this.

Actually, one of our customers ran into a problem with autovacuum on a table with many indexes and relatively long transactions. Of course, long transactions are the ultimate evil and the problem can be solved by running VACUUM manually from a cron task, but, I think, we can do better.

Anyhow, what about adding a parallel option for autovacuum? Here is a POC patch for the proposed functionality. For the sake of simplicity, several GUCs have been added. It would be good to think through the parallel launch condition so we can do without them.

As always, any thoughts and opinions are very welcome!

--
Best regards,
Maxim Orlov.
Attachment

Re: POC: Parallel processing of indexes in autovacuum

From
wenhui qiu
Date:
Hi Maxim Orlov,
     Thank you for working on this, I like your idea. But I have a suggestion: autovacuum_max_workers does not require a restart to change, so I think these GUCs could behave like autovacuum_max_workers:
+#max_parallel_index_autovac_workers = 0 # this feature disabled by default
+ # (change requires restart)
+#autovac_idx_parallel_min_rows = 0
+ # (change requires restart)
+#autovac_idx_parallel_min_indexes = 2
+ # (change requires restart)

Thanks 

On Wed, Apr 16, 2025 at 7:05 PM Maxim Orlov <orlovmg@gmail.com> wrote:
Hi!

The VACUUM command can be executed with the parallel option. As the documentation states, it will perform the index vacuum and index cleanup phases of VACUUM in parallel using the specified number of background workers. But this useful feature is not used for autovacuum. After a quick look at the source code, it became clear to me that when the parallel option was added, the corresponding option for autovacuum wasn't implemented, although there are no clear obstacles to this.

Actually, one of our customers ran into a problem with autovacuum on a table with many indexes and relatively long transactions. Of course, long transactions are the ultimate evil and the problem can be solved by running VACUUM manually from a cron task, but, I think, we can do better.

Anyhow, what about adding a parallel option for autovacuum? Here is a POC patch for the proposed functionality. For the sake of simplicity, several GUCs have been added. It would be good to think through the parallel launch condition so we can do without them.

As always, any thoughts and opinions are very welcome!

--
Best regards,
Maxim Orlov.

Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
Hi,

On Wed, Apr 16, 2025 at 4:05 AM Maxim Orlov <orlovmg@gmail.com> wrote:
>
> Hi!
>
> The VACUUM command can be executed with the parallel option. As the documentation states, it will perform the index
> vacuum and index cleanup phases of VACUUM in parallel using the specified number of background workers. But this
> useful feature is not used for autovacuum. After a quick look at the source code, it became clear to me that when
> the parallel option was added, the corresponding option for autovacuum wasn't implemented, although there are no
> clear obstacles to this.
>
> Actually, one of our customers ran into a problem with autovacuum on a table with many indexes and relatively long
> transactions. Of course, long transactions are the ultimate evil and the problem can be solved by running VACUUM
> manually from a cron task, but, I think, we can do better.
>
> Anyhow, what about adding a parallel option for autovacuum? Here is a POC patch for the proposed functionality. For
> the sake of simplicity, several GUCs have been added. It would be good to think through the parallel launch
> condition so we can do without them.
>
> As always, any thoughts and opinions are very welcome!

As I understand it, we initially disabled parallel vacuum for
autovacuum because their objectives are somewhat contradictory.
Parallel vacuum aims to accelerate the process by utilizing additional
resources, while autovacuum is designed to perform cleaning operations
with minimal impact on foreground transaction processing (e.g.,
through vacuum delay).

Nevertheless, I see your point about the potential benefits of using
parallel vacuum within autovacuum in specific scenarios. The crucial
consideration is determining appropriate criteria for triggering
parallel vacuum in autovacuum. Given that we currently support only
parallel index processing, suitable candidates might be autovacuum
operations on large tables that have a substantial number of
sufficiently large indexes and a high volume of garbage tuples.

Once we have parallel heap vacuum, as discussed in thread[1], it would
also likely be beneficial to incorporate it into autovacuum during
aggressive vacuum or failsafe mode.

Although the actual number of parallel workers ultimately depends on
the number of eligible indexes, it might be beneficial to introduce a
storage parameter, say parallel_vacuum_workers, that allows control
over the number of parallel vacuum workers on a per-table basis.

Regarding implementation: I notice the WIP patch implements its own
parallel vacuum mechanism for autovacuum. Have you considered simply
setting at_params.nworkers to a value greater than zero?

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Sami Imseih
Date:
Thanks for raising this idea!

I am generally -1 on the idea of autovacuum performing parallel
index vacuum, because I always felt that the parallel option should
be employed in a targeted manner for a specific table. If you have a bunch
of large tables, some more important than others, a/v may end
up using parallel resources on the least important tables, and you
will have to adjust a/v settings per table, etc., to get the right table
to be parallel index vacuumed by a/v.

Also, with the TIDStore improvements for index cleanup, and the practical
elimination of multi-pass index vacuums, I see this being even less
convincing as something to add to a/v.

Now, If I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?

> Once we have parallel heap vacuum, as discussed in thread[1], it would
> also likely be beneficial to incorporate it into autovacuum during
> aggressive vacuum or failsafe mode.

IIRC, index cleanup is disabled by failsafe.


--
Sami Imseih
Amazon Web Services (AWS)



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> As I understand it, we initially disabled parallel vacuum for
> autovacuum because their objectives are somewhat contradictory.
> Parallel vacuum aims to accelerate the process by utilizing additional
> resources, while autovacuum is designed to perform cleaning operations
> with minimal impact on foreground transaction processing (e.g.,
> through vacuum delay).
>
Yep, we also decided that we must not create more a/v workers for
index processing.
In the current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).

> Nevertheless, I see your point about the potential benefits of using
> parallel vacuum within autovacuum in specific scenarios. The crucial
> consideration is determining appropriate criteria for triggering
> parallel vacuum in autovacuum. Given that we currently support only
> parallel index processing, suitable candidates might be autovacuum
> operations on large tables that have a substantial number of
> sufficiently large indexes and a high volume of garbage tuples.
>
> Although the actual number of parallel workers ultimately depends on
> the number of eligible indexes, it might be beneficial to introduce a
> storage parameter, say parallel_vacuum_workers, that allows control
> over the number of parallel vacuum workers on a per-table basis.
>
For now, we have three GUC variables for this purpose:
max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows,
autovac_idx_parallel_min_indexes.
That is, everything is as you said. But we are still conducting
research on this issue. I would like to get rid of some of these
parameters.
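As a sketch of how a launch condition could combine these three GUCs (the function, its defaults, and its inputs are hypothetical illustrations, not taken from the patch):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical GUC values mirroring the names above; defaults are guesses. */
static int max_parallel_index_autovac_workers = 0;  /* 0 disables the feature */
static long autovac_idx_parallel_min_rows = 100000;
static int autovac_idx_parallel_min_indexes = 2;

/*
 * Decide whether an autovacuum of a table with the given number of rows
 * and indexes should request parallel index processing.
 */
static bool
autovac_should_use_parallel(long reltuples, int nindexes)
{
    if (max_parallel_index_autovac_workers <= 0)
        return false;           /* feature disabled */
    if (reltuples < autovac_idx_parallel_min_rows)
        return false;           /* table too small to justify extra workers */
    if (nindexes < autovac_idx_parallel_min_indexes)
        return false;           /* too few indexes to share among workers */
    return true;
}
```

With a heuristic of this shape, dropping the GUCs would amount to deriving the row and index thresholds from table statistics instead of configuration.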

> Regarding implementation: I notice the WIP patch implements its own
> parallel vacuum mechanism for autovacuum. Have you considered simply
> setting at_params.nworkers to a value greater than zero?
>
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse the
code of VACUUM PARALLEL, because it creates its own processes via the
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
As a result, we created our own implementation of parallel index
processing control - see changes in vacuumparallel.c and autovacuum.c.

--
Best regards,
Daniil Davydov



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
>
> I am generally -1 on the idea of autovacuum performing parallel
> index vacuum, because I always felt that the parallel option should
> be employed in a targeted manner for a specific table. if you have a bunch
> of large tables, some more important than others, a/c may end
> up using parallel resources on the least important tables and you
> will have to adjust a/v settings per table, etc to get the right table
> to be parallel index vacuumed by a/v.

Hm, this is a good point. I think I should clarify one point - in
practice, there is a common situation where users have one huge table
among all their databases (with 80+ indexes created on it). But, of
course, in general there may be few such tables.
But we can still adjust the autovac_idx_parallel_min_rows parameter.
If a table has a lot of dead tuples => it is actively used => the
table is important (?).
Also, if the user can really determine the "importance" of each of the
tables - we can provide an appropriate table option. Tables with this
option set will be processed in parallel in priority order. What do
you think about such an idea?

>
> Also, with the TIDStore improvements for index cleanup, and the practical
> elimination of multi-pass index vacuums, I see this being even less
> convincing as something to add to a/v.

If I understood correctly, we are talking about the fact that
TIDStore can store so many tuples that, in fact, a second pass is
never needed.
But the number of passes does not affect the presented optimization in
any way. We must think about the large number of indexes that must be
processed. Even within a single pass we can get a 40% increase in
speed.

>
> Now, If I am going to allocate extra workers to run vacuum in parallel, why
> not just provide more autovacuum workers instead so I can get more tables
> vacuumed within a span of time?

For now, only one process can clean up indexes, so I don't see how
increasing the number of a/v workers will help in the situation that I
mentioned above.
Also, we don't consume additional resources during autovacuum in this
patch - total number of a/v workers always <= autovacuum_max_workers.

BTW, see v2 patch, attached to this letter (bug fixes) :-)

--
Best regards,
Daniil Davydov

Attachment

Re: POC: Parallel processing of indexes in autovacuum

From
Sami Imseih
Date:
> On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
> >
> > I am generally -1 on the idea of autovacuum performing parallel
> > index vacuum, because I always felt that the parallel option should
> > be employed in a targeted manner for a specific table. if you have a bunch
> > of large tables, some more important than others, a/c may end
> > up using parallel resources on the least important tables and you
> > will have to adjust a/v settings per table, etc to get the right table
> > to be parallel index vacuumed by a/v.
>
> Hm, this is a good point. I think I should clarify one moment - in
> practice, there is a common situation when users have one huge table
> among all databases (with 80+ indexes created on it). But, of course,
> in general there may be few such tables.
> But we can still adjust the autovac_idx_parallel_min_rows parameter.
> If a table has a lot of dead tuples => it is actively used => table is
> important (?).
> Also, if the user can really determine the "importance" of each of the
> tables - we can provide an appropriate table option. Tables with this
> option set will be processed in parallel in priority order. What do
> you think about such an idea?

I think in most cases, the user will want to determine the priority of
a table getting parallel vacuum cycles rather than having autovacuum
determine the priority. I also see users wanting to stagger vacuums of
large tables with many indexes through some time period, and give the
tables the full amount of parallel workers they can afford at these
specific periods of time. A/V currently does not really allow for this
type of scheduling, and if we give some kind of GUC to prioritize
tables, I think users will constantly have to be modifying this
priority.

I am basing my comments on the scenarios I have seen on the field, and others
may have a different opinion.

> > Also, with the TIDStore improvements for index cleanup, and the practical
> > elimination of multi-pass index vacuums, I see this being even less
> > convincing as something to add to a/v.
>
> If I understood correctly, then we are talking about the fact that
> TIDStore can store so many tuples that in fact a second pass is never
> needed.
> But the number of passes does not affect the presented optimization in
> any way. We must think about a large number of indexes that must be
> processed. Even within a single pass we can have a 40% increase in
> speed.

I am not discounting that a single table vacuum with many indexes will
maybe perform better with parallel index scan, I am merely saying that
the TIDStore optimization now makes index vacuums better and perhaps
there is less of an incentive to use parallel.

> > Now, If I am going to allocate extra workers to run vacuum in parallel, why
> > not just provide more autovacuum workers instead so I can get more tables
> > vacuumed within a span of time?
>
> For now, only one process can clean up indexes, so I don't see how
> increasing the number of a/v workers will help in the situation that I
> mentioned above.
> Also, we don't consume additional resources during autovacuum in this
> patch - total number of a/v workers always <= autovacuum_max_workers.

Increasing a/v workers will not help speed up a specific table, what I
am suggesting is that instead of speeding up one table, let's just allow
other tables to not be starved of a/v cycles due to lack of a/v workers.

--
Sami



Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
On Fri, May 2, 2025 at 9:58 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> > Once we have parallel heap vacuum, as discussed in thread[1], it would
> > also likely be beneficial to incorporate it into autovacuum during
> > aggressive vacuum or failsafe mode.
>
> IIRC, index cleanup is disabled by failsafe.

Yes. My idea is to use parallel *heap* vacuum in autovacuum during
failsafe mode. I think it would make sense as users want to complete
freezing tables as soon as possible in this situation.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
On Fri, May 2, 2025 at 11:13 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > As I understand it, we initially disabled parallel vacuum for
> > autovacuum because their objectives are somewhat contradictory.
> > Parallel vacuum aims to accelerate the process by utilizing additional
> > resources, while autovacuum is designed to perform cleaning operations
> > with minimal impact on foreground transaction processing (e.g.,
> > through vacuum delay).
> >
> Yep, we also decided that we must not create more a/v workers for
> index processing.
> In current implementation, the leader process sends a signal to the
> a/v launcher, and the launcher tries to launch all requested workers.
> But the number of workers never exceeds `autovacuum_max_workers`.
> Thus, we will never have more a/v workers than in the standard case
> (without this feature).

I have concerns about this design. When autovacuuming on a single
table consumes all available autovacuum_max_workers slots with
parallel vacuum workers, the system becomes incapable of processing
other tables. This means that when determining the appropriate
autovacuum_max_workers value, users must consider not only the number
of tables to be processed concurrently but also the potential number
of parallel workers that might be launched. I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
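One way to picture the cap being described here, with both a per-worker and a system-wide limit (all names and defaults below are illustrative, not from the patch):

```c
#include <assert.h>

/* Hypothetical limits: per autovacuum worker, and system-wide. */
static int parallel_vacuum_workers_per_av_worker = 2;
static int max_total_parallel_vacuum_workers = 8;

static int imin(int a, int b) { return a < b ? a : b; }

/*
 * Clamp a parallel-worker request so a single autovacuum worker can never
 * exhaust the shared pool; 'in_use' is the number of parallel vacuum
 * workers already running system-wide.
 */
static int
clamp_parallel_request(int requested, int in_use)
{
    int global_left = max_total_parallel_vacuum_workers - in_use;

    if (global_left < 0)
        global_left = 0;
    return imin(requested, imin(parallel_vacuum_workers_per_av_worker,
                                global_left));
}
```

Under such a scheme, autovacuum_max_workers keeps its existing meaning (number of tables processed concurrently), and parallel index workers are accounted for separately.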

>
> > Regarding implementation: I notice the WIP patch implements its own
> > parallel vacuum mechanism for autovacuum. Have you considered simply
> > setting at_params.nworkers to a value greater than zero?
> >
> About `at_params.nworkers = N` - that's exactly what we're doing (you
> can see it in the `vacuum_rel` function). But we cannot fully reuse
> code of VACUUM PARALLEL, because it creates its own processes via
> dynamic bgworkers machinery.
> As I said above - we don't want to consume additional resources. Also
> we don't want to complicate communication between processes (the idea
> is that a/v workers can only send signals to the a/v launcher).

Could you elaborate on the reasons why you don't want to use
background workers and avoid complicated communication between
processes? I'm not sure whether these concerns provide sufficient
justification for implementing its own parallel index processing.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Sami Imseih
Date:
> I think it would more make
> sense to maintain the existing autovacuum_max_workers parameter while
> introducing a new parameter that would either control the maximum
> number of parallel vacuum workers per autovacuum worker or set a
> system-wide cap on the total number of parallel vacuum workers.

+1, and would it make sense for parallel workers to come from
max_parallel_maintenance_workers? This is capped by
max_parallel_workers and max_worker_processes, so increasing
the defaults for all 3 will be needed as well.


--
Sami



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Sat, May 3, 2025 at 3:17 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> I think in most cases, the user will want to determine the priority of
> a table getting parallel vacuum cycles rather than having the autovacuum
> determine the priority. I also see users wanting to stagger
> vacuums of large tables with many indexes through some time period,
> and give the
> tables the full amount of parallel workers they can afford at these
> specific periods
> of time. A/V currently does not really allow for this type of
> scheduling, and if we
> give some kind of GUC to prioritize tables, I think users will constantly have
> to be modifying this priority.

If the user wants to determine priority himself, we will need to
introduce some parameter (GUC or table option) anyway that will give
us a hint about how we should schedule a/v work.
Do you think that we should design more comprehensive behavior for
such a parameter (so that the user doesn't have to change it often)? I
would be glad to know your thoughts.

> > If I understood correctly, then we are talking about the fact that
> > TIDStore can store so many tuples that in fact a second pass is never
> > needed.
> > But the number of passes does not affect the presented optimization in
> > any way. We must think about a large number of indexes that must be
> > processed. Even within a single pass we can have a 40% increase in
> > speed.
>
> I am not discounting that a single table vacuum with many indexes will
> maybe perform better with parallel index scan, I am merely saying that
> the TIDStore optimization now makes index vacuums better and perhaps
> there is less of an incentive to use parallel.

I still insist that this does not affect the parallel index vacuum,
because we don't get an advantage from repeated passes. We get the same
speed increase whether we have this optimization or not.
Although it's even possible that the opposite is true - the situation
may be even better with the new TIDStore, but I can't say for sure.

> > > Now, If I am going to allocate extra workers to run vacuum in parallel, why
> > > not just provide more autovacuum workers instead so I can get more tables
> > > vacuumed within a span of time?
> >
> > For now, only one process can clean up indexes, so I don't see how
> > increasing the number of a/v workers will help in the situation that I
> > mentioned above.
> > Also, we don't consume additional resources during autovacuum in this
> > patch - total number of a/v workers always <= autovacuum_max_workers.
>
> Increasing a/v workers will not help speed up a specific table, what I
> am suggesting is that instead of speeding up one table, let's just allow
> other tables to not be starved of a/v cycles due to lack of a/v workers.

OK, I got it. But what if vacuuming a single table takes (for
example) 60% of all the time? This is still a possible situation, and
fast vacuuming of all the other tables will not help us.

--
Best regards,
Daniil Davydov



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > In current implementation, the leader process sends a signal to the
> > a/v launcher, and the launcher tries to launch all requested workers.
> > But the number of workers never exceeds `autovacuum_max_workers`.
> > Thus, we will never have more a/v workers than in the standard case
> > (without this feature).
>
> I have concerns about this design. When autovacuuming on a single
> table consumes all available autovacuum_max_workers slots with
> parallel vacuum workers, the system becomes incapable of processing
> other tables. This means that when determining the appropriate
> autovacuum_max_workers value, users must consider not only the number
> of tables to be processed concurrently but also the potential number
> of parallel workers that might be launched. I think it would more make
> sense to maintain the existing autovacuum_max_workers parameter while
> introducing a new parameter that would either control the maximum
> number of parallel vacuum workers per autovacuum worker or set a
> system-wide cap on the total number of parallel vacuum workers.
>

For now we have max_parallel_index_autovac_workers - this GUC limits
the number of parallel a/v workers that can process a single table. I
agree that the scenario you provided is problematic.
The proposal to limit the total number of supporting a/v workers seems
attractive to me (I'll implement it as an experiment).

It seems to me that this question is becoming a key one. First we need
to determine the role of the user in the whole scheduling mechanism.
Should we allow users to determine priority? Will this priority take
effect only within a single vacuuming cycle, or will it be more 'global'?
I guess I don't have enough expertise to determine this alone. I will
be glad to receive any suggestions.

> > About `at_params.nworkers = N` - that's exactly what we're doing (you
> > can see it in the `vacuum_rel` function). But we cannot fully reuse
> > code of VACUUM PARALLEL, because it creates its own processes via
> > dynamic bgworkers machinery.
> > As I said above - we don't want to consume additional resources. Also
> > we don't want to complicate communication between processes (the idea
> > is that a/v workers can only send signals to the a/v launcher).
>
> Could you elaborate on the reasons why you don't want to use
> background workers and avoid complicated communication between
> processes? I'm not sure whether these concerns provide sufficient
> justification for implementing its own parallel index processing.
>

Here are my thoughts on this. An a/v worker has a very simple role - it
is born after the launcher's request and must do exactly one 'task' -
vacuum a table or participate in a parallel index vacuum.
We also have a dedicated 'launcher' role, meaning the whole design
implies that only the launcher is able to launch processes.
If we allow a/v workers to use bgworkers, then:
1) The a/v worker will go far beyond its responsibility.
2) Its functionality will overlap with the functionality of the launcher.
3) Resource consumption can jump dramatically, which is unexpected for
the user. Autovacuum will also become dependent on other resources
(the bgworkers pool). The current design does not imply this.

I wanted to create a patch that would fit into the existing mechanism
without drastic innovations. But if you think that the above is not so
important, then we can reuse the VACUUM PARALLEL code, and that would
simplify the final implementation.

--
Best regards,
Daniil Davydov



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Sat, May 3, 2025 at 5:59 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> > I think it would more make
> > sense to maintain the existing autovacuum_max_workers parameter while
> > introducing a new parameter that would either control the maximum
> > number of parallel vacuum workers per autovacuum worker or set a
> > system-wide cap on the total number of parallel vacuum workers.
>
> +1, and would it make sense for parallel workers to come from
> max_parallel_maintenance_workers? This is capped by
> max_parallel_workers and max_worker_processes, so increasing
> the defaults for all 3 will be needed as well.

I may be wrong, but the `max_parallel_maintenance_workers` parameter
is only used for commands that are explicitly run by the user. We
already have `autovacuum_max_workers`, and I think the code will be
more consistent if we adapt this particular parameter (perhaps with
the addition of a new one, as I wrote in the previous letter).

--
Best regards,
Daniil Davydov



Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > In current implementation, the leader process sends a signal to the
> > > a/v launcher, and the launcher tries to launch all requested workers.
> > > But the number of workers never exceeds `autovacuum_max_workers`.
> > > Thus, we will never have more a/v workers than in the standard case
> > > (without this feature).
> >
> > I have concerns about this design. When autovacuuming on a single
> > table consumes all available autovacuum_max_workers slots with
> > parallel vacuum workers, the system becomes incapable of processing
> > other tables. This means that when determining the appropriate
> > autovacuum_max_workers value, users must consider not only the number
> > of tables to be processed concurrently but also the potential number
> > of parallel workers that might be launched. I think it would more make
> > sense to maintain the existing autovacuum_max_workers parameter while
> > introducing a new parameter that would either control the maximum
> > number of parallel vacuum workers per autovacuum worker or set a
> > system-wide cap on the total number of parallel vacuum workers.
> >
>
> For now we have max_parallel_index_autovac_workers - this GUC limits
> the number of parallel a/v workers that can process a single table. I
> agree that the scenario you provided is problematic.
> The proposal to limit the total number of supportive a/v workers seems
> attractive to me (I'll implement it as an experiment).
>
> It seems to me that this question is becoming a key one. First we need
> to determine the role of the user in the whole scheduling mechanism.
> Should we allow users to determine priority? Will this priority affect
> only within a single vacuuming cycle, or it will be more 'global'?
> I guess I don't have enough expertise to determine this alone. I will
> be glad to receive any suggestions.

What I roughly imagined is that we don't need to change the entire
autovacuum scheduling; rather, we would like autovacuum workers to
decide whether or not to use parallel vacuum during their vacuum
operation based on GUC parameters (having a global effect) or storage
parameters (having an effect on the particular table). The criteria for
triggering parallel vacuum in autovacuum might need to be somewhat
pessimistic so that we don't unnecessarily use parallel vacuum on many
tables.

>
> > > About `at_params.nworkers = N` - that's exactly what we're doing (you
> > > can see it in the `vacuum_rel` function). But we cannot fully reuse
> > > code of VACUUM PARALLEL, because it creates its own processes via
> > > dynamic bgworkers machinery.
> > > As I said above - we don't want to consume additional resources. Also
> > > we don't want to complicate communication between processes (the idea
> > > is that a/v workers can only send signals to the a/v launcher).
> >
> > Could you elaborate on the reasons why you don't want to use
> > background workers and avoid complicated communication between
> > processes? I'm not sure whether these concerns provide sufficient
> > justification for implementing its own parallel index processing.
> >
>
> Here are my thoughts on this. A/v worker has a very simple role - it
> is born after the launcher's request and must do exactly one 'task' -
> vacuum table or participate in parallel index vacuum.
> We also have a dedicated 'launcher' role, meaning the whole design
> implies that only the launcher is able to launch processes.
>
> If we allow a/v worker to use bgworkers, then :
> 1) A/v worker will go far beyond his responsibility.
> 2) Its functionality will overlap with the functionality of the launcher.

While I agree that the launcher process is responsible for launching
autovacuum worker processes, I'm not sure it should be responsible for
launching everything related to autovacuum. It's quite possible that we
will have parallel heap vacuum, and processing of a particular index
with parallel workers, in the future. The code could get more complex
if we have the autovacuum launcher process launch such parallel workers
too. I believe it's more straightforward to divide the responsibility
in such a way that the autovacuum launcher is responsible for launching
autovacuum workers, and autovacuum workers are responsible for
vacuuming tables, no matter how they do that.

> 3) Resource consumption can jump dramatically, which is unexpected for
> the user.

What extra resources could be used if we use background workers
instead of autovacuum workers?

> Autovacuum will also be dependent on other resources
> (bgworkers pool). The current design does not imply this.

I see your point, but I think it doesn't necessarily need to be
reflected at the infrastructure layer. For example, we can internally
allocate extra background worker slots for parallel vacuum workers
based on max_parallel_index_autovac_workers in addition to
max_worker_processes. In any case, we might need something to check or
validate the max_worker_processes value to make sure that every
autovacuum worker can use the specified number of parallel workers for
parallel vacuum.
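As a rough illustration of the validation being discussed (all names here are hypothetical, modeled on the GUCs mentioned in this thread rather than on actual PostgreSQL code), the check boils down to simple budget arithmetic:

```c
#include <stdbool.h>

/*
 * Hypothetical sketch: verify that reserving `reserved` worker slots
 * for parallel autovacuum fits within the global worker budgets. Real
 * GUC validation in PostgreSQL is done via check hooks registered in
 * guc_tables.c; this only models the arithmetic.
 */
bool
reserved_workers_fit(int reserved, int max_worker_processes,
                     int max_parallel_workers)
{
    if (reserved < 0)
        return false;
    /* reserved slots come out of the global bgworker budget */
    if (reserved > max_worker_processes)
        return false;
    /* they are also counted against max_parallel_workers */
    if (reserved > max_parallel_workers)
        return false;
    return true;
}
```

A real check hook would additionally have to account for slots already claimed by other subsystems, which this sketch ignores.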

> I wanted to create a patch that would fit into the existing mechanism
> without drastic innovations. But if you think that the above is not so
> important, then we can reuse VACUUM PARALLEL code and it would
> simplify the final implementation)

I'd suggest using the existing infrastructure if we can achieve the
goal with it. If we find out there are some technical difficulties to
implement it without new infrastructure, we can revisit this approach.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Sami Imseih
Date:

> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > In current implementation, the leader process sends a signal to the
> > > > a/v launcher, and the launcher tries to launch all requested workers.
> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> > > > Thus, we will never have more a/v workers than in the standard case
> > > > (without this feature).
> > >
> > > I have concerns about this design. When autovacuuming on a single
> > > table consumes all available autovacuum_max_workers slots with
> > > parallel vacuum workers, the system becomes incapable of processing
> > > other tables. This means that when determining the appropriate
> > > autovacuum_max_workers value, users must consider not only the number
> > > of tables to be processed concurrently but also the potential number
> > > of parallel workers that might be launched. I think it would more make
> > > sense to maintain the existing autovacuum_max_workers parameter while
> > > introducing a new parameter that would either control the maximum
> > > number of parallel vacuum workers per autovacuum worker or set a
> > > system-wide cap on the total number of parallel vacuum workers.
> > >
> >
> > For now we have max_parallel_index_autovac_workers - this GUC limits
> > the number of parallel a/v workers that can process a single table. I
> > agree that the scenario you provided is problematic.
> > The proposal to limit the total number of supportive a/v workers seems
> > attractive to me (I'll implement it as an experiment).
> >
> > It seems to me that this question is becoming a key one. First we need
> > to determine the role of the user in the whole scheduling mechanism.
> > Should we allow users to determine priority? Will this priority affect
> > only within a single vacuuming cycle, or it will be more 'global'?
> > I guess I don't have enough expertise to determine this alone. I will
> > be glad to receive any suggestions.
>
> What I roughly imagined is that we don't need to change the entire
> autovacuum scheduling, but would like autovacuum workers to decides
> whether or not to use parallel vacuum during its vacuum operation
> based on GUC parameters (having a global effect) or storage parameters
> (having an effect on the particular table). The criteria of triggering
> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> that we don't unnecessarily use parallel vacuum on many tables.

Perhaps we should only provide a reloption, so that only tables specified
by the user via the reloption can be autovacuumed in parallel?

This gives a targeted approach. Of course, if several of these allowed tables
are to be autovacuumed at the same time, some may not get all the workers,
but that's no different from manually vacuuming those tables in parallel
at the same time.

What do you think?

Sami 

Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Tue, May 6, 2025 at 6:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> What I roughly imagined is that we don't need to change the entire
> autovacuum scheduling, but would like autovacuum workers to decides
> whether or not to use parallel vacuum during its vacuum operation
> based on GUC parameters (having a global effect) or storage parameters
> (having an effect on the particular table). The criteria of triggering
> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> that we don't unnecessarily use parallel vacuum on many tables.
>

+1, I think about it the same way. I will expand on this topic in
more detail in response to Sami's email [1], so as not to repeat
myself.

> > Here are my thoughts on this. A/v worker has a very simple role - it
> > is born after the launcher's request and must do exactly one 'task' -
> > vacuum table or participate in parallel index vacuum.
> > We also have a dedicated 'launcher' role, meaning the whole design
> > implies that only the launcher is able to launch processes.
> >
> > If we allow a/v worker to use bgworkers, then :
> > 1) A/v worker will go far beyond his responsibility.
> > 2) Its functionality will overlap with the functionality of the launcher.
>
> While I agree that the launcher process is responsible for launching
> autovacuum worker processes but I'm not sure it should be for
> launching everything related autovacuums. It's quite possible that we
> have parallel heap vacuum and processing the particular index with
> parallel workers in the future. The code could get more complex if we
> have the autovacuum launcher process launch such parallel workers too.
> I believe it's more straightforward to divide the responsibility like
> in a way that the autovacuum launcher is responsible for launching
> autovacuum workers and autovacuum workers are responsible for
> vacuuming tables no matter how to do that.

It sounds very tempting. At the very beginning I did exactly that (to
make sure that nothing would break in a parallel autovacuum). Only
later was it decided to abandon the use of bgworkers.
For now, both approaches look fair to me. What do you think? Will
others agree that we can give more responsibility to a/v workers?

> > 3) Resource consumption can jump dramatically, which is unexpected for
> > the user.
>
> What extra resources could be used if we use background workers
> instead of autovacuum workers?

I meant that more processes would participate in autovacuum than
indicated by autovacuum_max_workers. And if an a/v worker uses
additional bgworkers, other operations cannot get these resources.

> > Autovacuum will also be dependent on other resources
> > (bgworkers pool). The current design does not imply this.
>
> I see your point but I think it doesn't necessarily need to reflect it
> at the infrastructure layer. For example, we can internally allocate
> extra background worker slots for parallel vacuum workers based on
> max_parallel_index_autovac_workers in addition to
> max_worker_processes. Anyway we might need something to check or
> validate max_worker_processes value to make sure that every autovacuum
> worker can use the specified number of parallel workers for parallel
> vacuum.

I don't think we can provide supportive workers for every parallel
index vacuuming request. But I got your point: always keep several
bgworkers that only a/v workers can use if needed, and make the size
of this additional pool (depending on max_worker_processes)
user-configurable.

> > I wanted to create a patch that would fit into the existing mechanism
> > without drastic innovations. But if you think that the above is not so
> > important, then we can reuse VACUUM PARALLEL code and it would
> > simplify the final implementation)
>
> I'd suggest using the existing infrastructure if we can achieve the
> goal with it. If we find out there are some technical difficulties to
> implement it without new infrastructure, we can revisit this approach.

OK, in the near future I'll implement it and send a new patch to this
thread. I'll be glad if you take a look at it.

[1] https://www.postgresql.org/message-id/CAA5RZ0vfBg%3Dc_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag%40mail.gmail.com

--
Best regards,
Daniil Davydov



Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
On Mon, May 5, 2025 at 5:21 PM Sami Imseih <samimseih@gmail.com> wrote:
>
>
>> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>> >
>> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> > >
>> > > > In current implementation, the leader process sends a signal to the
>> > > > a/v launcher, and the launcher tries to launch all requested workers.
>> > > > But the number of workers never exceeds `autovacuum_max_workers`.
>> > > > Thus, we will never have more a/v workers than in the standard case
>> > > > (without this feature).
>> > >
>> > > I have concerns about this design. When autovacuuming on a single
>> > > table consumes all available autovacuum_max_workers slots with
>> > > parallel vacuum workers, the system becomes incapable of processing
>> > > other tables. This means that when determining the appropriate
>> > > autovacuum_max_workers value, users must consider not only the number
>> > > of tables to be processed concurrently but also the potential number
>> > > of parallel workers that might be launched. I think it would more make
>> > > sense to maintain the existing autovacuum_max_workers parameter while
>> > > introducing a new parameter that would either control the maximum
>> > > number of parallel vacuum workers per autovacuum worker or set a
>> > > system-wide cap on the total number of parallel vacuum workers.
>> > >
>> >
>> > For now we have max_parallel_index_autovac_workers - this GUC limits
>> > the number of parallel a/v workers that can process a single table. I
>> > agree that the scenario you provided is problematic.
>> > The proposal to limit the total number of supportive a/v workers seems
>> > attractive to me (I'll implement it as an experiment).
>> >
>> > It seems to me that this question is becoming a key one. First we need
>> > to determine the role of the user in the whole scheduling mechanism.
>> > Should we allow users to determine priority? Will this priority affect
>> > only within a single vacuuming cycle, or it will be more 'global'?
>> > I guess I don't have enough expertise to determine this alone. I will
>> > be glad to receive any suggestions.
>>
>> What I roughly imagined is that we don't need to change the entire
>> autovacuum scheduling, but would like autovacuum workers to decides
>> whether or not to use parallel vacuum during its vacuum operation
>> based on GUC parameters (having a global effect) or storage parameters
>> (having an effect on the particular table). The criteria of triggering
>> parallel vacuum in autovacuum might need to be somewhat pessimistic so
>> that we don't unnecessarily use parallel vacuum on many tables.
>
>
> Perhaps we should only provide a reloption, therefore only tables specified
> by the user via the reloption can be autovacuumed  in parallel?
>
> This gives a targeted approach. Of course if multiple of these allowed tables
> are to be autovacuumed at the same time, some may not get all the workers,
> But that’s not different from if you are to manually vacuum in parallel the tables
> at the same time.
>
> What do you think ?

+1. I think that's a good starting point. We can later introduce a new
GUC parameter that globally controls the maximum number of parallel
vacuum workers used in autovacuum, if necessary.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
On Tue, May 6, 2025 at 7:21 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> Perhaps we should only provide a reloption, therefore only tables specified
> by the user via the reloption can be autovacuumed  in parallel?

After your comments (earlier in this thread) I decided to do just
that. For now we have a reloption, so the user can decide which tables
are "important" for parallel index vacuuming.
We also set hardcoded lower bounds on the number of indexes and the
number of dead tuples. For example, there is no need to use a parallel
vacuum if the table has only one index.
The situation is more complicated with the number of dead tuples: we
need tests that would show the optimal minimum value. This issue is
still being worked out.
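The lower bounds described above can be modeled by a small predicate; the names and numbers below are illustrative only (the patch in this thread uses constants along the lines of AV_PARALLEL_DEADTUP_THRESHOLD, and the dead-tuple minimum is explicitly still being tuned):

```c
#include <stdbool.h>

/* Hypothetical thresholds, mirroring the hardcoded bounds discussed. */
#define MIN_INDEXES_FOR_PARALLEL     2    /* one index never benefits */
#define MIN_DEADTUPLES_FOR_PARALLEL  1024 /* mock value, not final */

/*
 * Decide whether a table is even eligible for parallel index vacuuming
 * during autovacuum: the user must have opted the table in via the
 * reloption, and the table must clear both lower bounds.
 */
bool
table_qualifies_for_parallel_autovac(bool reloption_enabled,
                                     int nindexes,
                                     long dead_tuples)
{
    if (!reloption_enabled)
        return false;   /* user did not opt this table in */
    if (nindexes < MIN_INDEXES_FOR_PARALLEL)
        return false;   /* nothing to parallelize */
    if (dead_tuples < MIN_DEADTUPLES_FOR_PARALLEL)
        return false;   /* not worth the parallel-worker setup cost */
    return true;
}
```

As the later discussion in this thread notes, a dead-tuple count alone may be a poor proxy for index vacuuming cost, so treat this purely as a model of the v2 behavior.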

> This gives a targeted approach. Of course if multiple of these allowed tables
> are to be autovacuumed at the same time, some may not get all the workers,
> But that’s not different from if you are to manually vacuum in parallel the tables
> at the same time.

I fully agree. Recently the v2 patch has been supplemented with a new
feature [1]: multiple tables in a cluster can be processed in
parallel during autovacuum. And of course, not every a/v worker can
get enough supportive processes, but this is considered normal
behavior.
The maximum number of supportive workers is limited by a GUC variable.

[1] I guess I'll send it within the v3 patch, which will also contain
the logic discussed in the email above: using bgworkers instead of
additional a/v workers. BTW, what do you think about this idea?

--
Best regards,
Daniil Davydov



Re: POC: Parallel processing of indexes in autovacuum

From
Sami Imseih
Date:
> On Mon, May 5, 2025 at 5:21 PM Sami Imseih <samimseih@gmail.com> wrote:
> >
> >
> >> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >> >
> >> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >> > >
> >> > > > In current implementation, the leader process sends a signal to the
> >> > > > a/v launcher, and the launcher tries to launch all requested workers.
> >> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> >> > > > Thus, we will never have more a/v workers than in the standard case
> >> > > > (without this feature).
> >> > >
> >> > > I have concerns about this design. When autovacuuming on a single
> >> > > table consumes all available autovacuum_max_workers slots with
> >> > > parallel vacuum workers, the system becomes incapable of processing
> >> > > other tables. This means that when determining the appropriate
> >> > > autovacuum_max_workers value, users must consider not only the number
> >> > > of tables to be processed concurrently but also the potential number
> >> > > of parallel workers that might be launched. I think it would more make
> >> > > sense to maintain the existing autovacuum_max_workers parameter while
> >> > > introducing a new parameter that would either control the maximum
> >> > > number of parallel vacuum workers per autovacuum worker or set a
> >> > > system-wide cap on the total number of parallel vacuum workers.
> >> > >
> >> >
> >> > For now we have max_parallel_index_autovac_workers - this GUC limits
> >> > the number of parallel a/v workers that can process a single table. I
> >> > agree that the scenario you provided is problematic.
> >> > The proposal to limit the total number of supportive a/v workers seems
> >> > attractive to me (I'll implement it as an experiment).
> >> >
> >> > It seems to me that this question is becoming a key one. First we need
> >> > to determine the role of the user in the whole scheduling mechanism.
> >> > Should we allow users to determine priority? Will this priority affect
> >> > only within a single vacuuming cycle, or it will be more 'global'?
> >> > I guess I don't have enough expertise to determine this alone. I will
> >> > be glad to receive any suggestions.
> >>
> >> What I roughly imagined is that we don't need to change the entire
> >> autovacuum scheduling, but would like autovacuum workers to decides
> >> whether or not to use parallel vacuum during its vacuum operation
> >> based on GUC parameters (having a global effect) or storage parameters
> >> (having an effect on the particular table). The criteria of triggering
> >> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> >> that we don't unnecessarily use parallel vacuum on many tables.
> >
> >
> > Perhaps we should only provide a reloption, therefore only tables specified
> > by the user via the reloption can be autovacuumed  in parallel?
> >
> > This gives a targeted approach. Of course if multiple of these allowed tables
> > are to be autovacuumed at the same time, some may not get all the workers,
> > But that’s not different from if you are to manually vacuum in parallel the tables
> > at the same time.
> >
> > What do you think ?
>
> +1. I think that's a good starting point. We can later introduce a new
> GUC parameter that globally controls the maximum number of parallel
> vacuum workers used in autovacuum, if necessary.

And I think this reloption should also apply to parallel heap vacuum
in non-failsafe scenarios. In the failsafe case, however, all tables
will be eligible for parallel vacuum. Anyhow, that discussion can be
had in that thread, but I wanted to point it out.

--
Sami Imseih
Amazon Web Services (AWS)



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
Hi,

On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
> I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is
> failing:
> ❯❯❯ ninja -C build install
> ninja: Entering directory `build'
> [1/126] Compiling C object
> src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> ../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
> pointer to integer conversion initializing 'int' with an expression of
> type 'void *' [-Wint-conversion]
>  3613 |                         NULL,
>       |                         ^~~~
>

Thank you for reviewing this patch!

> It seems that the "autovacuum_reserved_workers_num" declaration on
> guc_tables.c has an extra gettext_noop() call?

Good catch, I fixed this warning in the v2 version.

>
> One other point is that as you've added TAP tests for the autovacuum I
> think you also need to create a meson.build file as you already create
> the Makefile.
>
> You also need to update the src/test/modules/meson.build and
> src/test/modules/Makefile to include the new test/modules/autovacuum
> path.
>

OK, I should clarify this point: modules/autovacuum is not a normal
test but a sandbox, just an example of how we can trigger parallel
index autovacuum. It may also be used for debugging purposes.
In fact, 001_autovac_parallel.pl does not verify anything.
I'll do as you asked (add all the meson and Make stuff), but please
don't focus on it. The creation of a real test is still in progress
(I'll try to complete it as soon as possible).

In this email I will divide the patch into two parts: implementation
and sandbox. What do you think about the implementation?


--
Best regards,
Daniil Davydov

Attachment

Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> Hi,
>
> On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I have some comments on v2-0001 patch
>
> Thank you for reviewing this patch!
>
> > +   {
> > +       {"autovacuum_reserved_workers_num", PGC_USERSET,
> > RESOURCES_WORKER_PROCESSES,
> > +           gettext_noop("Number of worker processes, reserved for
> > participation in parallel index processing during autovacuum."),
> > +           gettext_noop("This parameter is depending on
> > \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
> > +                        "*Only* autovacuum workers can use these
> > additional processes. "
> > +                        "Also, these processes are taken into account
> > in \"max_parallel_workers\"."),
> > +       },
> > +       &av_reserved_workers_num,
> > +       0, 0, MAX_BACKENDS,
> > +       check_autovacuum_reserved_workers_num, NULL, NULL
> > +   },
> >
> > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > would be better to have a more specific name for parallel vacuum such
> > as autovacuum_max_parallel_workers. This parameter is related to
> > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > affect this parameter.
> > .......
> > I've also considered some alternative names. If we were to use
> > parallel_maintenance_workers, it sounds like it controls the parallel
> > degree for all operations using max_parallel_maintenance_workers,
> > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > interpreted as affecting both autovacuum and manual VACUUM commands,
> > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > use their specified value for the parallel degree. I prefer
> > autovacuum_parallel_workers or vacuum_parallel_workers.
> >
>
> This was my headache when I created names for variables. Autovacuum
> initially implies parallelism, because we have several parallel a/v
> workers.

I'm not sure that's parallelism. We can have multiple autovacuum
workers simultaneously working on different tables, which does not
seem like parallelism to me.

> So I think that parameter like
> `autovacuum_max_parallel_workers` will confuse somebody.
> If we want to have a more specific name, I would prefer
> `max_parallel_index_autovacuum_workers`.

It's better not to use 'index', as we're trying to extend parallel
vacuum to heap scanning/vacuuming as well [1].

>
> > +   /*
> > +    * If we are running autovacuum - decide whether we need to process indexes
> > +    * of table with given oid in parallel.
> > +    */
> > +   if (AmAutoVacuumWorkerProcess() &&
> > +       params->index_cleanup != VACOPTVALUE_DISABLED &&
> > +       RelationAllowsParallelIdxAutovac(rel))
> >
> > I think that this should be done in autovacuum code.
>
> We need params->index cleanup variable to decide whether we need to
> use parallel index a/v. In autovacuum.c we have this code :
> ***
> /*
>  * index_cleanup and truncate are unspecified at first in autovacuum.
>  * They will be filled in with usable values using their reloptions
>  * (or reloption defaults) later.
>  */
> tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> ***
> This variable is filled in inside the `vacuum_rel` function, so I
> think we should keep the above logic in vacuum.c.

I guess that we can specify the parallel degree even if index_cleanup
is still UNSPECIFIED. vacuum_rel() would then decide whether to use
index vacuuming and vacuumlazy.c would decide whether to use parallel
vacuum based on the specified parallel degree and index_cleanup value.
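The division of labor suggested here could reduce to a decision like the following sketch (the enum is a simplified stand-in for the VacOptValue type in PostgreSQL's vacuum.h; the function and its exact condition are illustrative, not the patch's actual code):

```c
#include <stdbool.h>

/* Simplified stand-in for VacOptValue from vacuum.h */
typedef enum
{
    VACOPTVALUE_UNSPECIFIED,
    VACOPTVALUE_AUTO,
    VACOPTVALUE_DISABLED,
    VACOPTVALUE_ENABLED
} VacOptValue;

/*
 * Autovacuum picks a parallel degree (nworkers) up front, vacuum_rel()
 * later resolves index_cleanup from reloptions, and only then does the
 * executing code decide whether parallel index vacuuming happens.
 */
bool
use_parallel_index_vacuum(int nworkers, VacOptValue index_cleanup)
{
    return nworkers > 0 && index_cleanup != VACOPTVALUE_DISABLED;
}
```

The point of this ordering is that the parallel degree can be chosen in autovacuum.c even while index_cleanup is still UNSPECIFIED.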

>
> > +#define AV_PARALLEL_DEADTUP_THRESHOLD  1024
> >
> > These fixed values really useful in common cases? I think we already
> > have an optimization where we skip vacuum indexes if the table has
> > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
>
> When we allocate dead items (and optionally init parallel autocuum) we
> don't have sane value for `vacrel->lpdead_item_pages` (which should be
> compared with BYPASS_THRESHOLD_PAGES).
> The only criterion that we can focus on is the number of dead tuples
> indicated in the PgStat_StatTabEntry.

My point is that this criterion might not be useful. We have the
bypass optimization for index vacuuming, and having many dead tuples
doesn't necessarily mean index vacuuming will take a long time. For
example, even if the table has only a few dead tuples, index vacuuming
could take a very long time and parallel index vacuuming would help
the situation, if the table is very large and has many indexes.

>
> ----
>
> > I guess we can implement this parameter as an integer parameter so
> > that the user can specify the number of parallel vacuum workers for
> > the table. For example, we can have a reloption
> > autovacuum_parallel_workers. Setting 0 (by default) means to disable
> > parallel vacuum during autovacuum, and setting special value -1 means
> > to let PostgreSQL calculate the parallel degree for the table (same as
> > the default VACUUM command behavior).
> > ...........
> > The patch includes the changes to bgworker.c so that we can reserve
> > some slots for autovacuums. I guess that this change is not
> > necessarily necessary because if the user sets the related GUC
> > parameters correctly the autovacuum workers can use parallel vacuum as
> > expected.  Even if we need this change, I would suggest implementing
> > it as a separate patch.
> > ..........
> > +#define AV_PARALLEL_DEADTUP_THRESHOLD  1024
> > +#define NUM_INDEXES_PER_PARALLEL_WORKER 30
> >
> > These fixed values really useful in common cases? Given that we rely on
> > users' heuristics which table needs to use parallel vacuum during
> > autovacuum, I think we don't need to apply these conditions.
> > ..........
>
> I grouped these comments together, because they all relate to a single
> question : how much freedom will we give to the user?
> Your opinion (as far as I understand) is that we allow users to
> specify any number of parallel workers for tables, and it is the
> user's responsibility to configure appropriate GUC variables, so that
> autovacuum can always process indexes in parallel.
> And we don't need to think about thresholds. Even if the table has a
> small number of indexes and dead rows - if the user specified table
> option, we must do a parallel index a/v with requested number of
> parallel workers.
> Please correct me if I messed something up.
>
> I think that this logic is well suited for the `VACUUM (PARALLEL)` sql
> command, which is manually called by the user.

The current idea that users can use parallel vacuum on particular
tables based on their heuristic makes sense to me as the first
implementation.

> But autovacuum (as I think) should work as stable as possible and
> `unnoticed` by other processes. Thus, we must :
> 1) Compute resources (such as the number of parallel workers for a
> single table's indexes vacuuming) as efficiently as possible.
> 2) Provide a guarantee that as many tables as possible (among
> requested) will be processed in parallel.

I think these ideas could be implemented on top of the current idea.

> (1) can be achieved by calculating the parameters on the fly.
> NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
> accurate value in the near future.

I think achieving (1) requires more than just the number of indexes on
the table. Suppose that there is a very large table that gets heavy
updates and has a few indexes. If users want to keep the table from
becoming bloated, it would be reasonable to use parallel vacuum during
autovacuum, and it would not be a good idea to disallow parallel
vacuum solely because the table doesn't have more than 30 indexes. On
the other hand, if the table used to get many updates but no longer
does, users might want to use resources for autovacuums on other
tables. We might need to consider per-table autovacuum frequency, the
statistics of the previous autovacuum, system load, etc. So I think
that to achieve (1) we might need more statistics, and using only
NUM_INDEXES_PER_PARALLEL_WORKER would not work well.

> (2) can be achieved by workers reserving - we know that N workers
> (from bgworkers pool) are *always* at our disposal. And when we use
> such workers we are not dependent on other operations in the cluster
> and we don't interfere with other operations by taking resources away
> from them.

Reserving some bgworkers for autovacuum could make sense. But I think
it's better to implement it in a general way, as it could be useful in
other use cases too. That is, it might be good to implement
infrastructure so that any PostgreSQL code (possibly including
extensions) can request a pool of bgworkers for a specific usage and
use bgworkers from that pool.
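One hypothetical shape for such a facility, purely for illustration (none of these names exist in PostgreSQL; a real implementation would live behind the bgworker API and shared memory):

```c
#include <stdbool.h>

/*
 * Invented sketch of a generic "reserved bgworker pool": a subsystem
 * (e.g. autovacuum) carves a fixed number of slots out of
 * max_worker_processes and later acquires/releases them.
 */
typedef struct BgWorkerPool
{
    int nreserved;  /* slots reserved for this subsystem */
    int nin_use;    /* slots currently handed out */
} BgWorkerPool;

void
bgworker_pool_init(BgWorkerPool *pool, int nreserved)
{
    pool->nreserved = nreserved;
    pool->nin_use = 0;
}

/* Try to take one reserved slot; false means the pool is exhausted. */
bool
bgworker_pool_acquire(BgWorkerPool *pool)
{
    if (pool->nin_use >= pool->nreserved)
        return false;
    pool->nin_use++;
    return true;
}

void
bgworker_pool_release(BgWorkerPool *pool)
{
    if (pool->nin_use > 0)
        pool->nin_use--;
}
```

The attraction of the general design is that the caller degrades gracefully: when acquire fails, an a/v worker simply vacuums indexes serially.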

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Masahiko Sawada
Date:
On Thu, May 22, 2025 at 10:48 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> I started looking at the patch but I have some high level thoughts I would
> like to share before looking further.
>
> > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > would be better to have a more specific name for parallel vacuum such
> > > as autovacuum_max_parallel_workers. This parameter is related to
> > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > affect this parameter.
> > > .......
> > > I've also considered some alternative names. If we were to use
> > > parallel_maintenance_workers, it sounds like it controls the parallel
> > > degree for all operations using max_parallel_maintenance_workers,
> > > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > > interpreted as affecting both autovacuum and manual VACUUM commands,
> > > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > > use their specified value for the parallel degree. I prefer
> > > autovacuum_parallel_workers or vacuum_parallel_workers.
> > >
> >
> > This was my headache when I created names for variables. Autovacuum
> > initially implies parallelism, because we have several parallel a/v
> > workers. So I think that parameter like
> > `autovacuum_max_parallel_workers` will confuse somebody.
> > If we want to have a more specific name, I would prefer
> > `max_parallel_index_autovacuum_workers`.
>
> I don't think we should have a separate pool of parallel workers for those
> that are used to support parallel autovacuum. At the end of the day, these
> are parallel workers and they should be capped by max_parallel_workers. I think
> it will be confusing if we claim these are parallel workers, but they
> are coming from
> a different pool.

I agree that parallel vacuum workers used during autovacuum should be
capped by the max_parallel_workers.

>
> I envision we have another GUC such as "max_parallel_autovacuum_workers"
> (which I think is a better name) that matches the behavior of
> "max_parallel_maintenance_worker". Meaning that the autovacuum workers
> still maintain their existing behavior ( launching a worker per table
> ), and if they do need
> to vacuum in parallel, they can draw from a pool of parallel workers.
>
> With the above said, I therefore think the reloption should actually be a number
> of parallel workers rather than a boolean. Let's take an example of a
> user that has 3 tables
> they wish to (auto)vacuum can process in parallel, and if available
> they wish each of these tables
> could be autovacuumed with 4 parallel workers. However, as to not
> overload the system, they
> cap the 'max_parallel_maintenance_worker' to something like 8. If it
> so happens that all
> 3 tables are auto-vacuumed at the same time, there may not be enough
> parallel workers,
> so one table will be a loser and be vacuumed in serial.

+1 for the reloption being a number of parallel workers, leaving
aside the naming question.

> That is
> acceptable, and a/v logging
> ( and perhaps other stat views ) should display this behavior: workers
> planned vs workers launched.

Agreed. The workers planned vs. launched count is reported only with the
VERBOSE option, so we need to change that so that autovacuum can log it too.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: Parallel processing of indexes in autovacuum

From
Daniil Davydov
Date:
Hi,

On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > would be better to have a more specific name for parallel vacuum such
> > > as autovacuum_max_parallel_workers. This parameter is related to
> > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > affect this parameter.
> >
> > This was my headache when I created names for variables. Autovacuum
> > initially implies parallelism, because we have several parallel a/v
> > workers.
>
> I'm not sure if it's parallelism. We can have multiple autovacuum
> workers simultaneously working on different tables, which seems not
> parallelism to me.

Hm, I hadn't thought about the definition of 'parallelism' in this way.
But I see your point - the next v4 patch will contain the naming that
you suggest.

>
> > So I think that parameter like
> > `autovacuum_max_parallel_workers` will confuse somebody.
> > If we want to have a more specific name, I would prefer
> > `max_parallel_index_autovacuum_workers`.
>
> It's better not to use 'index' as we're trying to extend parallel
> vacuum to heap scanning/vacuuming as well[1].

OK, I'll fix it.

> > > +   /*
> > > +    * If we are running autovacuum - decide whether we need to process indexes
> > > +    * of table with given oid in parallel.
> > > +    */
> > > +   if (AmAutoVacuumWorkerProcess() &&
> > > +       params->index_cleanup != VACOPTVALUE_DISABLED &&
> > > +       RelationAllowsParallelIdxAutovac(rel))
> > >
> > > I think that this should be done in autovacuum code.
> >
> > We need the params->index_cleanup variable to decide whether we need to
> > use parallel index a/v. In autovacuum.c we have this code :
> > ***
> > /*
> >  * index_cleanup and truncate are unspecified at first in autovacuum.
> >  * They will be filled in with usable values using their reloptions
> >  * (or reloption defaults) later.
> >  */
> > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> > ***
> > This variable is filled in inside the `vacuum_rel` function, so I
> > think we should keep the above logic in vacuum.c.
>
> I guess that we can specify the parallel degree even if index_cleanup
> is still UNSPECIFIED. vacuum_rel() would then decide whether to use
> index vacuuming and vacuumlazy.c would decide whether to use parallel
> vacuum based on the specified parallel degree and index_cleanup value.
>
> >
> > > +#define AV_PARALLEL_DEADTUP_THRESHOLD  1024
> > >
> > > These fixed values really useful in common cases? I think we already
> > > have an optimization where we skip vacuum indexes if the table has
> > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
> >
> > When we allocate dead items (and optionally init parallel autovacuum) we
> > don't have sane value for `vacrel->lpdead_item_pages` (which should be
> > compared with BYPASS_THRESHOLD_PAGES).
> > The only criterion that we can focus on is the number of dead tuples
> > indicated in the PgStat_StatTabEntry.
>
> My point is that this criterion might not be useful. We have the
> bypass optimization for index vacuuming and having many dead tuples
> doesn't necessarily mean index vacuuming taking a long time. For
> example, even if the table has a few dead tuples, index vacuuming
> could take a very long time and parallel index vacuuming would help
> the situation, if the table is very large and has many indexes.

That sounds reasonable. I'll fix it.

> > But autovacuum (as I think) should work as stable as possible and
> > `unnoticed` by other processes. Thus, we must :
> > 1) Compute resources (such as the number of parallel workers for a
> > single table's indexes vacuuming) as efficiently as possible.
> > 2) Provide a guarantee that as many tables as possible (among
> > requested) will be processed in parallel.
> >
> > (1) can be achieved by calculating the parameters on the fly.
> > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
> > accurate value in the near future.
>
> I think it requires more things than the number of indexes on the
> table to achieve (1). Suppose that there is a very large table that
> gets updates heavily and has a few indexes. If users want to avoid the
> table from being bloated, it would be a reasonable idea to use
> parallel vacuum during autovacuum and it would not be a good idea to
> disallow using parallel vacuum solely because it doesn't have more
> than 30 indexes. On the other hand, if the table had got many updates
> but not so now, users might want to use resources for autovacuums on
> other tables. We might need to consider autovacuum frequencies per
> table, the statistics of the previous autovacuum, or system loads etc.
> So I think that in order to achieve (1) we might need more statistics
> and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine.
>

It's hard for me to imagine exactly how extended statistics would help
us track such situations.
It seems that for any of our heuristics, it will be possible to come
up with a counterexample.
Maybe we can give advice (via logs) to the user? But for such an
idea, tests should be conducted so that we can understand when
resource consumption becomes ineffective.
I guess we need to agree on an implementation before conducting such tests.

> > (2) can be achieved by workers reserving - we know that N workers
> > (from bgworkers pool) are *always* at our disposal. And when we use
> > such workers we are not dependent on other operations in the cluster
> > and we don't interfere with other operations by taking resources away
> > from them.
>
> Reserving some bgworkers for autovacuum could make sense. But I think
> it's better to implement it in a general way as it could be useful in
> other use cases too. That is, it might be good to implement
> infrastructure so that any PostgreSQL code (possibly including
> extensions) can request allocating a pool of bgworkers for specific
> usage and use bgworkers from them.

A general reservation infrastructure is an ambitious idea. I am not
sure that we should implement it within this thread and feature.
Maybe we should create a separate thread for it and refer to parallel
autovacuum as the justification?

-----
Thanks everybody for the feedback! I've attached a v4 patch to this email.
Main features :
1) 'parallel_autovacuum_workers' reloption - an integer that sets the
maximum number of parallel a/v workers that can be taken from the
bgworkers pool in order to process this table.
2) 'max_parallel_autovacuum_workers' - a GUC variable that sets the
maximum total number of parallel a/v workers that can be taken from
the bgworkers pool.
3) Parallel autovacuum no longer tries to use thresholds like
NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
4) Parallel autovacuum can now report statistics like "planned vs. launched".
5) For now I've dropped the 'reserving' idea, so autovacuum leaders
compete with everyone else for parallel workers from the bgworkers pool.

What do you think about this implementation?

--
Best regards,
Daniil Davydov

Attachment