Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Add index scan progress to pg_stat_progress_vacuum |
Date | |
Msg-id | CAD21AoC-npaR1xsCn1JWAds1CEwF-ccqK-W+Lowa_rqym1qBwA@mail.gmail.com Whole thread Raw |
In response to | Re: Add index scan progress to pg_stat_progress_vacuum ("Imseih (AWS), Sami" <simseih@amazon.com>) |
Responses |
Re: Add index scan progress to pg_stat_progress_vacuum
|
List | pgsql-hackers |
On Sat, Nov 12, 2022 at 4:10 AM Imseih (AWS), Sami <simseih@amazon.com> wrote: > > > I don't think any of these progress callbacks should be done while pinning a > > buffer and ... > > Good point. > > > I also don't understand why info->parallel_progress_callback exists? It's only > > set to parallel_vacuum_progress_report(). Why make this stuff more expensive > > than it has to already be? > > Agree. Modified the patch to avoid the callback . > > > So each of the places that call this need to make an additional external > > function call for each page, just to find that there's nothing to do or to > > make yet another indirect function call. This should probably a static inline > > function. > > Even better to just remove a function call altogether. > > > This is called, for every single page, just to read the number of indexes > > completed by workers? A number that barely ever changes? > > I will take the initial suggestion by Sawada-san to update the progress > every 1GB of blocks scanned. > > Also, It sems to me that we don't need to track progress in brin indexes, > Since it is rare, if ever, this type of index will go through very heavy > block scans. In fact, I noticed the vacuum AMs for brin don't invoke the > vacuum_delay_point at all. > > The attached patch addresses the feedback. > Thank you for updating the patch! Here are review comments on v15 patch: + <para> + Number of indexes that wil be vacuumed. This value will be + <literal>0</literal> if there are no indexes to vacuum or + vacuum failsafe is triggered. I think that indexes_total should be 0 also when INDEX_CLEANUP is off. --- + /* + * Reset the indexes completed at this point. + * If we end up in another index vacuum cycle, we will + * start counting from the start. + */ + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, 0); I think we don't need to reset it at the end of index vacuuming. There is a small window before switching to the next phase. If we reset this value while showing "index vacuuming" phase, the user might get confused. Instead, we can reset it at the beginning of the index vacuuming. --- +void +parallel_wait_for_workers_to_finish(ParallelVacuumState *pvs) +{ + while (pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)) < pvs->nindexes) + { + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + pg_atomic_read_u32(&(pvs->shared->idx_completed_progress))); + + (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, -1, + WAIT_EVENT_PARALLEL_FINISH); + ResetLatch(MyLatch); + } +} We should add CHECK_FOR_INTERRUPTS() at the beginning of the loop to make the wait interruptible. I think it would be better to update the counter only when the value has been increased. I think we should set a timeout, say 1 sec, to WaitLatch so that it can periodically check the progress. Probably it's better to have a new wait event for this wait in order to distinguish wait for workers to complete index vacuum from the wait for workers to exit. --- @@ -838,7 +867,12 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel, ivinfo.estimated_count = pvs->shared->estimated_count; ivinfo.num_heap_tuples = pvs->shared->reltuples; ivinfo.strategy = pvs->bstrategy; - + ivinfo.idx_completed_progress = pvs->shared->idx_completed_progress; and @@ -998,6 +998,9 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats, if (info->report_progress) pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE, scanblkno); + if (info->report_parallel_progress && (scanblkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0) + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + pg_atomic_read_u32(&(info->idx_completed_progress))); } I think this doesn't work, since ivinfo.idx_completed is in the backend-local memory. Instead, I think we can have a function in vacuumparallel.c that updates the progress. Then we can have index AM call this function. --- + if (!IsParallelWorker()) + ivinfo.report_parallel_progress = true; + else + ivinfo.report_parallel_progress = false; We can do like: ivinfo.report_parallel_progress = !IsParallelWorker(); Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: