Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Add index scan progress to pg_stat_progress_vacuum
Date
Msg-id 20221109030030.5wmvr76xj256cccw@awork3.anarazel.de
Whole thread Raw
In response to Re: Add index scan progress to pg_stat_progress_vacuum  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: Add index scan progress to pg_stat_progress_vacuum
List pgsql-hackers
Hi,

On 2022-11-04 13:27:34 +0000, Imseih (AWS), Sami wrote:
> diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
> index b4fa5f6bf8..3d5e4600dc 100644
> --- a/src/backend/access/gin/ginvacuum.c
> +++ b/src/backend/access/gin/ginvacuum.c
> @@ -633,6 +633,9 @@ ginbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
>          UnlockReleaseBuffer(buffer);
>          buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
>                                      RBM_NORMAL, info->strategy);
> +
> +        if (info->report_parallel_progress)
> +            info->parallel_progress_callback(info->parallel_progress_arg);
>      }
>  
>      /* right now we found leftmost page in entry's BTree */

I don't think any of these progress callbacks should be done while pinning a
buffer and ...

> @@ -677,6 +680,9 @@ ginbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
>          buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
>                                      RBM_NORMAL, info->strategy);
>          LockBuffer(buffer, GIN_EXCLUSIVE);
> +
> +        if (info->report_parallel_progress)
> +            info->parallel_progress_callback(info->parallel_progress_arg);
>      }
>  
>      MemoryContextDelete(gvs.tmpCxt);

... definitely not while holding a buffer lock.


I also don't understand why info->parallel_progress_callback exists? It's only
set to parallel_vacuum_progress_report(). Why make this stuff more expensive
than it has to already be?



> +parallel_vacuum_progress_report(void *arg)
> +{
> +    ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
> +
> +    if (IsParallelWorker())
> +        return;
> +
> +    pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
> +                                 pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)));
> +}

So each of the places that call this need to make an additional external
function call for each page, just to find that there's nothing to do or to
make yet another indirect function call. This should probably a static inline
function.

This is called, for every single page, just to read the number of indexes
completed by workers? A number that barely ever changes?

This seems all wrong.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Next
From: Andres Freund
Date:
Subject: Re: Asynchronous and "direct" IO support for PostgreSQL.