Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [HACKERS] Block level parallel vacuum
Msg-id CAFiTN-uPDNz9SYr_8rZo7C3K-Vf2cZJa2WUtgApY26Detwf1-w@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Fri, Oct 4, 2019 at 3:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Oct 4, 2019 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >>
> Some more comments..
> 1.
> + for (idx = 0; idx < nindexes; idx++)
> + {
> + if (!for_cleanup)
> + lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
> +   vacrelstats->old_live_tuples);
> + else
> + {
> + /* Cleanup one index and update index statistics */
> + lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
> +    vacrelstats->tupcount_pages < vacrelstats->rel_pages);
> +
> + lazy_update_index_statistics(Irel[idx], stats[idx]);
> +
> + if (stats[idx])
> + pfree(stats[idx]);
> + }
>
> I think that instead of checking the for_cleanup variable for every index in
> the loop, we should move the loop inside each branch, as shown below:
>
> if (!for_cleanup)
>     for (idx = 0; idx < nindexes; idx++)
>         lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
>                           vacrelstats->old_live_tuples);
> else
>     for (idx = 0; idx < nindexes; idx++)
>     {
>         lazy_cleanup_index
>         lazy_update_index_statistics
>         ...
>     }
>
> 2.
> +static void
> +lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
> +    int nindexes, IndexBulkDeleteResult **stats,
> +    LVParallelState *lps, bool for_cleanup)
> +{
> + int idx;
> +
> + Assert(!IsParallelWorker());
> +
> + /* no job if the table has no index */
> + if (nindexes <= 0)
> + return;
>
> Wouldn't it be a good idea to call this function only if nindexes > 0?
>
> 3.
> +/*
> + * Vacuum or cleanup indexes with parallel workers. This function must be used
> + * by the parallel vacuum leader process.
> + */
> +static void
> +lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
> + int nindexes, IndexBulkDeleteResult **stats,
> + LVParallelState *lps, bool for_cleanup)
>
> If you look at this function, there is not much common code between the
> for_cleanup and non-for_cleanup paths except these 3-4 statements:
> LaunchParallelWorkers(lps->pcxt);
> /* Create the log message to report */
> initStringInfo(&buf);
> ...
> /* Wait for all vacuum workers to finish */
> WaitForParallelWorkersToFinish(lps->pcxt);
>
> Other than that, you have a lot of checks like this:
> + if (!for_cleanup)
> + {
> + }
> + else
> + {
> + }
>
> I think the code would be much more readable if we had 2 functions, one for
> vacuum (lazy_parallel_vacuum_indexes) and another for
> cleanup (lazy_parallel_cleanup_indexes).
>
> 4.
>  * of index scans performed.  So we don't use maintenance_work_mem memory for
>   * the TID array, just enough to hold as many heap tuples as fit on one page.
>   *
> + * Lazy vacuum supports parallel execution with parallel worker processes. In
> + * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
> + * parallel worker processes. Individual indexes are processed by one vacuum
>
> Spacing after the "." is not uniform: the previous comment uses 2
> spaces and the newly added one uses 1 space.
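
To make quoted comment 1 concrete, here is a minimal, self-contained sketch of
the restructuring: the loop-invariant for_cleanup flag is tested once and each
mode gets its own loop. The names vacuum_one_index, cleanup_one_index and
vacuum_or_cleanup_indexes are hypothetical stand-ins that only count calls;
they are not the patch's functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-index routines; they only count calls. */
static int vacuum_calls = 0;
static int cleanup_calls = 0;

static void
vacuum_one_index(int idx)
{
	(void) idx;
	vacuum_calls++;
}

static void
cleanup_one_index(int idx)
{
	(void) idx;
	cleanup_calls++;
}

/* Test the loop-invariant flag once, then run a dedicated loop per mode. */
static void
vacuum_or_cleanup_indexes(int nindexes, bool for_cleanup)
{
	int			idx;

	assert(nindexes >= 0);

	if (!for_cleanup)
	{
		for (idx = 0; idx < nindexes; idx++)
			vacuum_one_index(idx);
	}
	else
	{
		for (idx = 0; idx < nindexes; idx++)
			cleanup_one_index(idx);
	}
}
```

The same shape would carry over to splitting the parallel variant into two
functions, as suggested in quoted comment 3.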

A few more comments
-------------------

1.
+static int
+compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
+{
+ int parallel_workers;
+ bool leaderparticipates = true;

It seems this function does not use the onerel parameter, so we can remove it.
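
For illustration only, a sketch of what the signature could look like without
onerel. The body is an assumption, not the patch's logic: it supposes the
leader always takes one index (matching leaderparticipates = true above), that
nrequested == 0 means "no explicit request", and that a hypothetical
max_workers argument caps the result. The name compute_parallel_workers_sketch
is also hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical worker-count computation that needs no Relation argument.
 * Assumes the leader processes one index itself.
 */
static int
compute_parallel_workers_sketch(int nrequested, int nindexes, int max_workers)
{
	int			parallel_workers;

	assert(nrequested >= 0 && max_workers >= 0);

	/* With the leader taking one index, a worker only helps from two indexes on. */
	if (nindexes <= 1)
		return 0;

	/* One index is handled by the leader, the rest by workers. */
	parallel_workers = nindexes - 1;

	/* An explicit request can only lower the count. */
	if (nrequested > 0 && nrequested < parallel_workers)
		parallel_workers = nrequested;

	/* Never exceed the assumed cap. */
	if (parallel_workers > max_workers)
		parallel_workers = max_workers;

	return parallel_workers;
}
```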


2.
+
+ /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
+ maxtuples = compute_max_dead_tuples(nblocks, true);
+ est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
+    mul_size(sizeof(ItemPointerData), maxtuples)));
+ shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
+ querylen = strlen(debug_query_string);

For consistency with the other comments, change
VACUUM_KEY_QUERY_TEXT to PARALLEL_VACUUM_KEY_QUERY_TEXT.
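
As an aside on the estimate in the hunk above: add_size and mul_size are
PostgreSQL's overflow-checked size helpers, and MAXALIGN rounds up to the
platform's maximum alignment. A standalone sketch of the same arithmetic
follows; checked_add, checked_mul, SKETCH_MAXALIGN and the 8-byte alignment
are assumptions made for the example, and overflow trips an assert here
instead of raising an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed maximum alignment of 8 bytes; the real value is platform dependent. */
#define SKETCH_ALIGNOF 8
#define SKETCH_MAXALIGN(len) \
	(((size_t) (len) + (SKETCH_ALIGNOF - 1)) & ~((size_t) (SKETCH_ALIGNOF - 1)))

/* Addition that asserts the sum does not wrap around. */
static size_t
checked_add(size_t a, size_t b)
{
	assert(a <= SIZE_MAX - b);
	return a + b;
}

/* Multiplication that asserts the product does not wrap around. */
static size_t
checked_mul(size_t a, size_t b)
{
	assert(a == 0 || b <= SIZE_MAX / a);
	return a * b;
}

/* Header plus one fixed-size item per dead tuple, rounded up to alignment. */
static size_t
estimate_dead_tuples_size(size_t header_size, size_t item_size, size_t maxtuples)
{
	return SKETCH_MAXALIGN(checked_add(header_size,
									   checked_mul(item_size, maxtuples)));
}
```

With a hypothetical 16-byte header and 6-byte items (the size of
ItemPointerData), 101 tuples need 622 bytes, which rounds up to 624.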


3.
@@ -2888,6 +2888,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
  (!wraparound ? VACOPT_SKIP_LOCKED : 0);
  tab->at_params.index_cleanup = VACOPT_TERNARY_DEFAULT;
  tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
+ /* parallel lazy vacuum is not supported for autovacuum */
+ tab->at_params.nworkers = -1;

What is the reason for this?  Can we explain it in the comment?


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


