Re: Parallel heap vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Parallel heap vacuum
Date
Msg-id CAD21AoAjfm0PDF-RyMos-KZuuS73mFYYWRNj0m4AuOM_6wA9BA@mail.gmail.com
Whole thread Raw
In response to RE: Parallel heap vacuum  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
On Wed, Nov 13, 2024 at 3:10 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > TidStoreBeginIterateShared() is designed for multiple parallel workers
> > to iterate a shared TidStore. During an iteration, parallel workers
> > share the iteration state and iterate the underlying radix tree while
> > taking appropriate locks. Therefore, it's available only for a shared
> > TidStore. This is required to implement the parallel heap vacuum,
> > where multiple parallel workers do the iteration on the shared
> > TidStore.
> >
> > On the other hand, TidStoreBeginIterate() is designed for a single
> > process to iterate a TidStore. It accepts even a shared TidStore as
> > you mentioned, but during an iteration there is no inter-process
> > coordination such as locking. When it comes to parallel vacuum,
> > supporting TidStoreBeginIterate() on a shared TidStore is necessary to
> > cover the case where we use only parallel index vacuum but not
> > parallel heap scan/vacuum. In this case, we need to store dead tuple
> > TIDs on the shared TidStore during heap scan so parallel workers can
> > use it during index vacuum. But it's not necessary to use
> > TidStoreBeginIterateShared() because only one (leader) process does
> > heap vacuum.
>
> Okay, thanks for the description. I felt it is OK to keep.
>
> I read 0001 again and here are comments.

Thank you for the review comments!

>
> 01. vacuumlazy.c
> ```
> +#define LV_PARALLEL_SCAN_SHARED         0xFFFF0001
> +#define LV_PARALLEL_SCAN_DESC           0xFFFF0002
> +#define LV_PARALLEL_SCAN_DESC_WORKER    0xFFFF0003
> ```
>
> I checked other DMS keys used for parallel work, and they seems to have name
> like PARALEL_KEY_XXX. Can we follow it?

Yes. How about LV_PARALLEL_KEY_XXX?

>
> 02. LVRelState
> ```
> +    BlockNumber next_fsm_block_to_vacuum;
> ```
>
> Only the attribute does not have comments Can we add like:
> "Next freespace map page to be checked"?

Agreed. I'll add a comment "next block to check for FSM vacuum*.

>
> 03. parallel_heap_vacuum_gather_scan_stats
> ```
> +        vacrel->scan_stats->vacuumed_pages += ss->vacuumed_pages;
> ```
>
> Note that `scan_stats->vacuumed_pages` does not exist in 0001, it is defined
> in 0004. Can you move it?

Will remove.

>
> 04. heap_parallel_vacuum_estimate
> ```
> +
> +    heap_parallel_estimate_shared_memory_size(rel, nworkers, &pscan_len,
> +                                              &shared_len, &pscanwork_len);
> +
> +    /* space for PHVShared */
> +    shm_toc_estimate_chunk(&pcxt->estimator, shared_len);
> +    shm_toc_estimate_keys(&pcxt->estimator, 1);
> +
> +    /* space for ParallelBlockTableScanDesc */
> +    pscan_len = table_block_parallelscan_estimate(rel);
> +    shm_toc_estimate_chunk(&pcxt->estimator, pscan_len);
> +    shm_toc_estimate_keys(&pcxt->estimator, 1);
> +
> +    /* space for per-worker scan state, PHVScanWorkerState */
> +    pscanwork_len = mul_size(sizeof(PHVScanWorkerState), nworkers);
> +    shm_toc_estimate_chunk(&pcxt->estimator, pscanwork_len);
> +    shm_toc_estimate_keys(&pcxt->estimator, 1);
> ```
>
> I feel pscan_len and pscanwork_len are calclated in heap_parallel_estimate_shared_memory_size().
> Can we remove table_block_parallelscan_estimate() and mul_size() from here?

Yes, it's an oversight. Will remove.

>
> 05. Idea
>
> Can you update documentations?

Will update the doc as well.

>
> 06. Idea
>
> AFAICS pg_stat_progress_vacuum does not contain information related with the
> parallel execution. How do you think adding an attribute which shows a list of pids?
> Not sure it is helpful for users but it can show the parallelism.

I think it's possible to show the parallelism even today (for parallel
index vacuuming):

=# select leader.pid, leader.query, array_agg(worker.pid) from
pg_stat_activity as leader, pg_stat_activity as worker,
pg_stat_progress_vacuum as v where leader.pid = worker.leader_pid and
leader.pid = v.pid group by 1, 2;
   pid   |        query        |     array_agg
---------+---------------------+-------------------
 2952103 | vacuum (verbose) t; | {2952257,2952258}
(1 row)

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Potential ABI breakage in upcoming minor releases
Next
From: Noah Misch
Date:
Subject: Re: Potential ABI breakage in upcoming minor releases