Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Add index scan progress to pg_stat_progress_vacuum
Date
Msg-id 20220221190910.GA3740885@nathanxps13
Whole thread Raw
In response to Re: Add index scan progress to pg_stat_progress_vacuum  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: Add index scan progress to pg_stat_progress_vacuum
List pgsql-hackers
On Mon, Feb 21, 2022 at 07:03:39PM +0000, Imseih (AWS), Sami wrote:
> Sending again with patch files renamed to ensure correct apply order.

I haven't had a chance to test this too much, but I did look through the
patch set and have a couple of small comments.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>indexes_total</structfield> <type>bigint</type>
+      </para>
+      <para>
+       The number of indexes to be processed in the
+       <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase
+       of the vacuum.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>indexes_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       The number of indexes processed in the
+       <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase.
+       At the start of an index vacuum cycle, this value is set to <literal>0</literal>.
+      </para></entry>
+     </row>

Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
turned off?

+typedef struct VacWorkerProgressInfo
+{
+    int                     num_vacuums;    /* number of active VACUUMS with parallel workers */
+    int                     max_vacuums;    /* max number of VACUUMS with parallel workers */
+    slock_t         mutex;
+    VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

max_vacuums appears to just be a local copy of MaxBackends.  Does this
information really need to be stored here?  Also, is there a strong reason
for using a spinlock instead of an LWLock?

+void
+vacuum_worker_end(int leader_pid)
+{
+    SpinLockAcquire(&vacworkerprogress->mutex);
+    for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+    {
+        VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+        if (vac->leader_pid == leader_pid)
+        {
+            *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+            vacworkerprogress->num_vacuums--;
+            SpinLockRelease(&vacworkerprogress->mutex);
+            break;
+        }
+    }
+    SpinLockRelease(&vacworkerprogress->mutex);
+}

I see this loop pattern in a couple of places, and it makes me wonder if
this information would fit more naturally in a hash table.

+        if (callback)
+            callback(values, 3);

Why does this need to be set up as a callback function?  Could we just call
the function if cmdtype == PROGRESS_COMMAND_VACUUM?  ISTM that is pretty
much all this is doing.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: "Imseih (AWS), Sami"
Date:
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Next
From: Tom Lane
Date:
Subject: Re: Emit a warning if the extension's GUC is set incorrectly