Re: Parallel heap vacuum - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Parallel heap vacuum
Date
Msg-id CAHut+Ps9yTGk2TWdjgLQEzGfPTWafKT0H-Q6WvYh5UKNW0pCvQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel heap vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi Sawada-San.

Here are some review comments for patch v11-0002

======
Commit message.

1.
Heap table AM disables the parallel heap vacuuming for now, but an
upcoming patch uses it.

This function implementation was moved into patch 0001, so probably
this part of the commit message comment also belongs now in patch
0001.

======
src/backend/commands/vacuumparallel.c

2.
+ * For processing indexes in parallel, individual indexes are processed by one
+ * vacuum process. We launch parallel worker processes at the start
of parallel index
+ * bulk-deletion and index cleanup and once all indexes are
processed, the parallel
+ * worker processes exit.
+ *

"are processed by one vacuum process." -- Did you mean "are processed
by separate vacuum processes." ?

~~~

3.
+ *
+ * Each time we process table or indexes in parallel, the parallel context is
+ * re-initialized so that the same DSM can be used for multiple
passes of table vacuum
+ * or index bulk-deletion and index cleanup.

Maybe I am mistaken, but it seems like the logic is almost always
re-initializing this. I wonder if it might be simpler to just remove
the 'need_reinitialize_dsm' field and initialize unconditionally.

~~~

4.
+ * nrequested_workers is >= 0 number and the requested parallel degree. 0
+ * means that the parallel degrees for table and indexes vacuum are decided
+ * differently. See the comments of parallel_vacuum_compute_workers() for
+ * details.
+ *
  * On success, return parallel vacuum state.  Otherwise return NULL.
  */

SUGGESTION
nrequested_workers is the requested parallel degree (>=0). 0 means that...

~~~

5.
 static int
-parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int
nrequested,
- bool *will_parallel_vacuum)
+parallel_vacuum_compute_workers(Relation rel, Relation *indrels, int nindexes,
+ int nrequested, int *nworkers_table_p,
+ bool *idx_will_parallel_vacuum)

IIUC the returns for this function seem inconsistent. AFAIK, it
previously would return the number of workers for parallel index
vacuuming. But now (after this patch) the return value is returned
Max(nworkers_table, nworkers_index). Meanwhile, the number of workers
for parallel table vacuuming is returned as a by-reference parameter
'nworkers_table_p'. In other words, it is returning the number of
workers in 2 different ways.

Why not make this a void function, but introduce another parameter
'nworkers_index_p', similar to 'nworkers_table_p'?

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Parallel heap vacuum
Next
From: Masahiko Sawada
Date:
Subject: Re: maintenance_work_mem = 64kB doesn't work for vacuum