Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From Imseih (AWS), Sami
Subject Re: Add index scan progress to pg_stat_progress_vacuum
Date
Msg-id 28F7536A-3E62-4A6B-9705-A2377A2D03AE@amazon.com
Whole thread Raw
In response to Re: Add index scan progress to pg_stat_progress_vacuum  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Add index scan progress to pg_stat_progress_vacuum
List pgsql-hackers
+     <row>
    +      <entry role="catalog_table_entry"><para role="column_definition">
    +       <structfield>indexes_total</structfield> <type>bigint</type>
    +      </para>
    +      <para>
    +       The number of indexes to be processed in the
    +       <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase
    +       of the vacuum.
    +      </para></entry>
    +     </row>
    +
    +     <row>
    +      <entry role="catalog_table_entry"><para role="column_definition">
    +       <structfield>indexes_processed</structfield> <type>bigint</type>
    +      </para>
    +      <para>
    +       The number of indexes processed in the
    +       <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase.
    +       At the start of an index vacuum cycle, this value is set to <literal>0</literal>.
    +      </para></entry>
    +     </row>

    > Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
    > turned off?

If the failsafe kicks in midway through a vacuum, the number indexes_total will not be reset to 0. If INDEX_CLEANUP is
turnedoff, then the value will be 0 at the start of the vacuum.
 

    +typedef struct VacWorkerProgressInfo
    +{
    +    int                     num_vacuums;    /* number of active VACUUMS with parallel workers */
    +    int                     max_vacuums;    /* max number of VACUUMS with parallel workers */
    +    slock_t         mutex;
    +    VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
    +} VacWorkerProgressInfo;

    > max_vacuums appears to just be a local copy of MaxBackends.  Does this
    > information really need to be stored here?  Also, is there a strong reason
    > for using a spinlock instead of an LWLock?

First, The BTVacInfo code in backend/access/nbtree/nbtutils.c inspired this, so I wanted to follow this pattern. With
thatsaid, I do see max_vacuums being redundant here, and I am inclined to replace it with a MaxBackends() call. 
 

Second, There is no strong reason to use spinlock here except I incorrectly assumed it will be better for this case.
Afterreading more about this and reading up src/backend/storage/lmgr/README, an LWLock will be better.
 

    +void
    +vacuum_worker_end(int leader_pid)
    +{
    +    SpinLockAcquire(&vacworkerprogress->mutex);
    +    for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
    +    {
    +        VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
    +
    +        if (vac->leader_pid == leader_pid)
    +        {
    +            *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
    +            vacworkerprogress->num_vacuums--;
    +            SpinLockRelease(&vacworkerprogress->mutex);
    +            break;
    +        }
    +    }
    +    SpinLockRelease(&vacworkerprogress->mutex);
    +}

  >  I see this loop pattern in a couple of places, and it makes me wonder if
  >  this information would fit more naturally in a hash table.

Followed the pattern in backend/access/nbtree/nbtutils.c for this as well. Using dynahash may make sense here if it
simplifiesthe code. Will look.
 

    +        if (callback)
    +            callback(values, 3);

  >  Why does this need to be set up as a callback function?  Could we just call
  >  the function if cmdtype == PROGRESS_COMMAND_VACUUM?  ISTM that is pretty
  >  much all this is doing.

The intention will be for the caller to set the callback early on in the function using the existing " if
(pg_strcasecmp(cmd,"VACUUM") == 0), etc." statement. This way we avoid having to add another if/else block before
tuplestore_putvaluesis called.
 

--
Sami Imseih
Amazon Web Services


pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Allow file inclusion in pg_hba and pg_ident files
Next
From: Jacob Champion
Date:
Subject: [PATCH] Enable SSL library detection via PQsslAttribute