Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Add index scan progress to pg_stat_progress_vacuum
Date
Msg-id CAD21AoDqVNhLyegNGGezXVhhkWnGHN84TJzeuRbqU077U_KRpw@mail.gmail.com
Whole thread Raw
In response to Re: Add index scan progress to pg_stat_progress_vacuum  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: Add index scan progress to pg_stat_progress_vacuum
Re: Add index scan progress to pg_stat_progress_vacuum
List pgsql-hackers
On Fri, May 6, 2022 at 4:26 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
>
> Thank you for the feedback!
>
> >    I think we can pass the progress update function to
> >   WaitForParallelWorkersToFinish(), which seems simpler. And we can call
>
> Directly passing the callback to WaitForParallelWorkersToFinish
> will require us to modify the function signature.
>
> To me, it seemed simpler and touches less code to have
> the caller set the callback in the ParallelContext.

Okay, but if we do that, I think we should add comments about when
it's used. The callback is used only when
WaitForParallelWorkersToFinish(), but not when
WaitForParallelWorkersToExit().

Another idea I came up with is that we can wait for all index vacuums
to finish while checking and updating the progress information, and
then calls WaitForParallelWorkersToFinish after confirming all index
status became COMPLETED. That way, we don’t need to change the
parallel query infrastructure. What do you think?

>
> >    the function after updating the index status to
> >    PARALLEL_INDVAC_STATUS_COMPLETED.
>
> I also like this better. Will make the change.
>
> >    BTW, currently we don't need a lock for touching index status since
> >    each worker touches different indexes. But after this patch, the
> >    leader will touch all index status, do we need a lock for that?
>
> I do not think locking is needed here. The leader and workers
> will continue to touch different indexes to update the status.
>
> However, if the process is a leader, it will call the function
> which will go through indstats and count how many
> Indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED.
> This value is then reported to the leaders backend only.

I was concerned that the leader process could report the wrong
progress if updating and checking index status happen concurrently.
But I think it should be fine since we can read PVIndVacStatus
atomically.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Remove support for Visual Studio 2013
Next
From: Nathan Bossart
Date:
Subject: Re: Add --{no-,}bypassrls flags to createuser