Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Add index scan progress to pg_stat_progress_vacuum
Date
Msg-id CAD21AoDf4pf89i7fgzrHDdJueNHLm4x92xHjcDeYaV+pdhem-Q@mail.gmail.com
Whole thread Raw
In response to Re: Add index scan progress to pg_stat_progress_vacuum  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: Add index scan progress to pg_stat_progress_vacuum  ("Imseih (AWS), Sami" <simseih@amazon.com>)
List pgsql-hackers
On Wed, Mar 9, 2022 at 11:35 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
>
> >    Indeed.
>
> >    It might have already been discussed but other than using a new shmem
> >    hash for parallel vacuum, I wonder if we can allow workers to change
> >    the leader’s progress information. It would break the assumption that
> >    the backend status entry is modified by its own backend, though. But
> >    it might help for progress updates of other parallel operations too.
> >    This essentially does the same thing as what the current patch does
> >    but it doesn't require a new shmem hash.
>
> I experimented with this idea, but it did not work. The idea would have been to create a pgstat_progress_update
functionthat takes the leader pid, however infrastructure does not exist to allow one backend to manipulate another
backendsbackend status array. 
> pgstat_fetch_stat_beentry returns a local copy only.

I think if it's a better approach we can do that including adding a
new infrastructure for it.

>
> >    Another idea I come up with is that the parallel vacuum leader checks
> >    PVIndStats.status and updates how many indexes are processed to its
> >    progress information. The leader can check it and update the progress
> >    information before and after index vacuuming. And possibly we can add
> >    a callback to the main loop of index AM's bulkdelete and vacuumcleanup
> >    so that the leader can periodically make it up-to-date.
>
> >    Regards,
>
> The PVIndStats idea is also one I experimented with but it did not work. The reason being the backend checking the
progressneeds to do a shm_toc_lookup to access the data, but they are not prepared to do so. 

What I imagined is that the leader checks how many PVIndStats.status
is PARALLEL_INDVAC_STATUS_COMPLETED and updates the result to its
progress information as indexes_processed. That way, the backend
checking the progress can see it.

>
> I have not considered the callback in the index AM's bulkdelete and vacuumcleanup, but I can imagine this is not
possiblesince a leader could be busy vacuuming rather than updating counters, but I may be misunderstanding the
suggestion.

Checking PVIndStats.status values is cheap. Probably the leader can
check it every 1GB index block, for example.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: "Imseih (AWS), Sami"
Date:
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Next
From: "David G. Johnston"
Date:
Subject: Re: Naming of the different stats systems / "stats collector"