Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers
From | Imseih (AWS), Sami |
---|---|
Subject | Re: Add index scan progress to pg_stat_progress_vacuum |
Date | |
Msg-id | 28F7536A-3E62-4A6B-9705-A2377A2D03AE@amazon.com Whole thread Raw |
In response to | Re: Add index scan progress to pg_stat_progress_vacuum (Nathan Bossart <nathandbossart@gmail.com>) |
Responses |
Re: Add index scan progress to pg_stat_progress_vacuum
|
List | pgsql-hackers |
+ <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>indexes_total</structfield> <type>bigint</type> + </para> + <para> + The number of indexes to be processed in the + <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase + of the vacuum. + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>indexes_processed</structfield> <type>bigint</type> + </para> + <para> + The number of indexes processed in the + <literal>vacuuming indexes</literal> or <literal>cleaning up indexes</literal> phase. + At the start of an index vacuum cycle, this value is set to <literal>0</literal>. + </para></entry> + </row> > Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP > turned off? If the failsafe kicks in midway through a vacuum, the number indexes_total will not be reset to 0. If INDEX_CLEANUP is turnedoff, then the value will be 0 at the start of the vacuum. +typedef struct VacWorkerProgressInfo +{ + int num_vacuums; /* number of active VACUUMS with parallel workers */ + int max_vacuums; /* max number of VACUUMS with parallel workers */ + slock_t mutex; + VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER]; +} VacWorkerProgressInfo; > max_vacuums appears to just be a local copy of MaxBackends. Does this > information really need to be stored here? Also, is there a strong reason > for using a spinlock instead of an LWLock? First, The BTVacInfo code in backend/access/nbtree/nbtutils.c inspired this, so I wanted to follow this pattern. With thatsaid, I do see max_vacuums being redundant here, and I am inclined to replace it with a MaxBackends() call. Second, There is no strong reason to use spinlock here except I incorrectly assumed it will be better for this case. Afterreading more about this and reading up src/backend/storage/lmgr/README, an LWLock will be better. +void +vacuum_worker_end(int leader_pid) +{ + SpinLockAcquire(&vacworkerprogress->mutex); + for (int i = 0; i < vacworkerprogress->num_vacuums; i++) + { + VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i]; + + if (vac->leader_pid == leader_pid) + { + *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1]; + vacworkerprogress->num_vacuums--; + SpinLockRelease(&vacworkerprogress->mutex); + break; + } + } + SpinLockRelease(&vacworkerprogress->mutex); +} > I see this loop pattern in a couple of places, and it makes me wonder if > this information would fit more naturally in a hash table. Followed the pattern in backend/access/nbtree/nbtutils.c for this as well. Using dynahash may make sense here if it simplifiesthe code. Will look. + if (callback) + callback(values, 3); > Why does this need to be set up as a callback function? Could we just call > the function if cmdtype == PROGRESS_COMMAND_VACUUM? ISTM that is pretty > much all this is doing. The intention will be for the caller to set the callback early on in the function using the existing " if (pg_strcasecmp(cmd,"VACUUM") == 0), etc." statement. This way we avoid having to add another if/else block before tuplestore_putvaluesis called. -- Sami Imseih Amazon Web Services
pgsql-hackers by date: