Re: POC: Parallel processing of indexes in autovacuum - Mailing list pgsql-hackers

From Daniil Davydov
Subject Re: POC: Parallel processing of indexes in autovacuum
Date
Msg-id CAJDiXghP2kXnEz+cj3rAWNM3NdKSB_4WtnngFXpVz2omPhGr5A@mail.gmail.com
Whole thread Raw
In response to Re: POC: Parallel processing of indexes in autovacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

On Tue, Sep 16, 2025 at 1:50 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
>
> I've rebased this patchset to the current master.
> That required me to move the new GUC definition to guc_parameters.dat.
> Also, I adjusted typedefs.list and made pgindent.

Thank you for looking into it!

>
> +   {
> +       {
> +           "autovacuum_parallel_workers",
> +           "Maximum number of parallel autovacuum workers that can be used for processing this table.",
> +           RELOPT_KIND_HEAP,
> +           ShareUpdateExclusiveLock
> +       },
> +       -1, -1, 1024
> +   },
>
> Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?

I'm afraid that we will have to include an additional header file to do this.
As far as I know, we are trying not to do so. For now, I will leave it
hardcoded.

>
> - *   Support routines for parallel vacuum execution.
> + *   Support routines for parallel vacuum and autovacuum execution. In the
> + *   future comments, the word "vacuum" will refer to both vacuum and
> + *   autovacuum.
>
> Not sure about the usage of word "future" here.
> It doesn't look clear what it means.
> Could we use "below" or "within this file"?

Agree, fixed.

> I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block.
> As I heard, the overhead of setting/doing jumps is platform-dependent, and
> not harmless on some platforms. Therefore, can we skip TRY/CATCH block
> for non-autovacuum vacuum?  Possibly we could move it to AutoVacWorkerMain(),
> that would save us from repeatedly setting a jump in autovacuum workers too.

Good idea. I found try/catch block inside the "do_autovacuum" function that is
obviously called only inside the autovacuum. I decided to move ReleaseAllWorkers
call there.

>
> In general, I think this patchset badly lack of testing.  I think it needs tap tests
> checking from the logs that autovacuum has been done in parallel.  Also, it
> would be good to set up some injection points, and check that reserved
> autovacuum parallel workers are getting released correctly in the case of errors.

Some time ago I tried to write a test, but it looked very ugly. Your
idea with injection points
helped me to write much more reliable tests - see it in a new (v12)
pack of patches.

On Wed, Sep 17, 2025 at 1:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov
> <aekorotkov@gmail.com> wrote:
> >
> > I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block.  As I heard,
> > the overhead of setting/doing jumps is platform-dependent, and not harmless on some
> > platforms.  Therefore, can we skip TRY/CATCH block for non-autovacuum vacuum?
> > Possibly we could move it to AutoVacWorkerMain(), that would save us from repeatedly
> > setting a jump in autovacuum workers too.
>
> I wonder if using the TRY/CATCH block is not enough to ensure that
> autovacuum workers release the reserved parallel workers in FATAL
> cases.
>

That's true. I'll register "before_shmem_exit" callback for autovacuum,
which will release workers if there are any reserved and if the a/v workers
exits abnormally.

>
> IIUC the patch still has one problem in terms of reloading the
> configuration parameters during parallel mode as I mentioned
> before[1].
>

Yep. I was happy to see that you think that config file processing is OK for
autovacuum :)
I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
The simplest solution that I see is to move cost-based delay parameters to
shared state (PVShared) and create some variables such a
VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
What do you think about this idea?

Another approaches like a "tell parallel workers that they should
reload config"
looks a bit too invasive IMO.


Thanks everybody for the review! Please, see v12 patches :
1) Implement tests for parallel autovacuum
2) Fix error with unreleased workers - see try/catch block in do_autovacuum
and before_shmem_exit callback registration in AutoVacWorkerMain
3) Allow a/v leader to process config file (see guc.c)

--
Best regards,
Daniil Davydov

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add wal_fpi_bytes_[un]compressed to pg_stat_wal
Next
From: Michael Paquier
Date:
Subject: Re: Consistently use the XLogRecPtrIsInvalid() macro